howard howard - 2 months ago 10
C Question

fopen Segfault error on large files

Hello everyone I'm new to C but I've recently been getting a weird segfault error with my fopen.

FILE* thefile = fopen(argv[1],"r");


The problem I've been having is that this code works on other smaller text files, but when I try with a file around 400MB it will give a sefault error. I've even tried hardcoding the filename but that doesn't work either. Could there be a problem in the rest of the code causing the segfault on this line?(doubt it but would like to know if its possible. It's just really odd that no errors come up for a small text file, but a large text file does get errors.

Thanks!

EDIT* didn't want to bog this down with too much but heres my code

int main(int argc, char *argv[])
{
if(argc != 3)
{
printf("[ERROR] Invalid number of arguments. Please pass 2 arguments, input_bound_file (column 1:probe, columne 2,...: samples) and desired_output_file_name");
exit(2);
}

int i,j;
rankAvg= g_hash_table_new(g_direct_hash, g_direct_equal);
rankCnt= g_hash_table_new(g_direct_hash, g_direct_equal);
table = g_hash_table_new_full (g_direct_hash, g_direct_equal, NULL, g_free);
getCounts(argv[1]);
printf("NC=: %i nR =: %i",nC,nR);
double srcMat[nR][nC];
int rankMat[nR][nC];
double normMat[nR][nC];
int sorts[nR][nC];
char line[100];

FILE* thefile = fopen(argv[1],"r");
printf("%s\n", strerror(errno));
FILE* output = fopen(argv[2],"w");
char* rownames[100];
i=0;j = 1;
int processedProbeNumber = 0;
int previousStamp = 0;
fgets(line,sizeof(line),thefile); //read file

while(fgets(line,sizeof(line),thefile) != NULL)
{
cleanSpace(line); //creates only one space between entries
char dest[100];
int len = strlen(line);
for(i = 0; i < len; i++)
{
if(line[i] == ' ') //read in rownames
{
rownames[j] = strncpy(dest, line, i);
dest[i] = '\0';
break;
}
}

char* token = strtok(line, " ");
token = strtok(NULL, " ");
i=1;

while(token!=NULL) //put words into array
{
rankMat[j][i]= abs(atof(token));
srcMat[j][i] = abs(atof(token));
token = strtok(NULL, " ");
i++;
}

// set the first column as a row id
j++;
processedProbeNumber++;

if( (processedProbeNumber-previousStamp) >= 10000)
{
previousStamp = processedProbeNumber;
printf("\tnumber of loaded lines = %i",processedProbeNumber);

}
}
printf("\ttotal number of loaded lines = %i \n",processedProbeNumber);
fclose(thefile);

Answer

How do you know that fopen is seg faulting? If you're simply sprinkling printf in the code, there's a chance the standard output isn't sent to the console before the error occurs. Obviously, if you're using a debugger you will know exactly where the segfault occured.

Looking at your code, nR and nC aren't defined so I don't know how big rankMat and srcMat are, but two thoughts crossed my mind while looking at your code:

  • You don't check i and j to ensure that they don't exceed nR and nC
  • If nR and nC are sufficiently large, that may mean you're using a very large amount of memory on the stack (srcMat, rankMat, normMat, and sorts are all huge). I don't know what environemnt you're running in, but some systems my not be able to handle huge stacks (Linux, Windows, etc. should be fine, but I do a lot of embedded work). I normally allocate very large structures in the heap (using malloc).
Comments