Cero Cero - 21 days ago 10
C Question

Segmentation Error in C program when reading text file

I want to print a list of words and their definitions from a text file in the format word:defn. However, I get a segmentation fault when I compile the program with gcc and run it on a server. The strange thing is that the same program works perfectly when I compile and run it on my local desktop.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int read_dict() {
    FILE *fp;
    int c;
    char word[50];
    char defn[500];
    int sep = 0;
    int doublenew = 0;
    int i = 0;

    fp = fopen("textfile.txt", "r");
    if (fp == NULL) {
        perror("Error in opening file");
        return (-1);
    }

    while ((c = fgetc(fp)) != EOF) {
        if (feof(fp)) {
            break;
        }
        if (c == '.' && sep == 0) {
            sep = 1;
            word[i] = '\0';
            //c = fgetc(fp);
            i = 0;
        } else
        if (doublenew == 1 && c == '\n' && sep == 1) {
            defn[i] = c;
            i++;
            defn[i] = '\0';
            printf("%s %s", word, defn);
            i = 0;
            sep = 0;
            doublenew = 0;
        } else
        if (c == '\n' && sep == 1) {
            defn[i] = c;
            doublenew = 1;
            i++;
        } else
        if (sep == 0) {
            word[i] = c;
            i++;
        } else
        if (sep == 1) {
            defn[i] = c;
            i++;
            doublenew = 0;
        }
    }
    fclose(fp);
    return 0;
}


text file:


COOKIE. is a small, flat, sweet, baked good, usually containing flour, eggs, sugar, and either butter, cooking oil or another oil or fat. It may include other ingredients such as raisins, oats, chocolate chips or nuts.

ICE CREAM. is a sweetened frozen food typically eaten as a snack or dessert.

Answer

The word buffer holds at most 49 characters plus the terminating null byte, and the definition buffer at most 499, but your code never checks for overflow. If, unlike your sample, the dictionary used on the server has longer words or definitions, your code writes past the end of these arrays. That is undefined behavior, and it can cause a segmentation fault.

Undefined behavior might also not cause any visible effect, as seems to be the case on your local machine. The code generated by the local compiler and that of the server may be different, due to a different version or different command line options.

Check the array bounds before every write to avoid this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int read_dict() {
    FILE *fp;
    int c;
    char word[50];
    char defn[500];
    int sep = 0;
    int doublenew = 0;
    size_t i = 0;

    fp = fopen("textfile.txt", "r");
    if (fp == NULL) {
        perror("Error in opening file");
        return (-1);
    }

    while ((c = fgetc(fp)) != EOF) {
        if (feof(fp)) {
            break;
        }
        if (c == '\r') {
            /* ignore CR characters inserted by Windows before LF */
            continue;
        }
        if (c == '.' && sep == 0) {
            sep = 1;
            word[i] = '\0';
            //c = fgetc(fp);
            i = 0;
        } else
        if (doublenew == 1 && c == '\n' && sep == 1) {
            if (i < sizeof(defn) - 1) {
                defn[i] = c;
                i++;
            }
            defn[i] = '\0';
            printf("%s %s", word, defn);
            i = 0;
            sep = 0;
            doublenew = 0;
        } else
        if (c == '\n' && sep == 1) {
            if (i < sizeof(defn) - 1) {
                defn[i] = c;
                i++;
            }
            doublenew = 1;
        } else
        if (sep == 0) {
            if (i < sizeof(word) - 1) {
                word[i] = c;
                i++;
            }
        } else
        if (sep == 1) {
            if (i < sizeof(defn) - 1) {
                defn[i] = c;
                i++;
            }
            doublenew = 0;
        }
    }
    fclose(fp);
    return 0;
}
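
The same bounds check appears four times in the code above. One possible way to keep it in a single place is a small helper, sketched below. The name append_char and its parameters are hypothetical and are not part of the original program; it assumes the caller passes the full size of the destination array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the original code): store c in buf only if
 * there is still room for it plus the terminating '\0'.
 * cap is the total size of buf, *len is the number of characters stored.
 * Returns 1 if the character was stored, 0 if it was dropped. */
static int append_char(char *buf, size_t cap, size_t *len, int c) {
    assert(cap > 0);              /* a zero-sized buffer cannot even hold '\0' */
    if (*len < cap - 1) {
        buf[(*len)++] = (char)c;
        buf[*len] = '\0';         /* keep the buffer terminated at all times */
        return 1;
    }
    return 0;                     /* buffer full: silently truncate */
}
```

With such a helper, each branch of the loop could call, for example, append_char(defn, sizeof(defn), &i, c) instead of repeating the test. Overlong words and definitions are truncated rather than overflowing the arrays.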

Note: if nothing gets printed on the server, the program never sees 2 consecutive newline characters '\n' after a definition. If you are using the same file on both machines, with Windows on your system and Linux on the server, the '\r' characters that Windows puts before each '\n' explain the difference. In text mode, the C library on Windows translates "\r\n" to "\n" when reading, so your program never sees the '\r'. On Linux no translation takes place: the '\r' reaches your program, lands between the two newlines, and resets doublenew to 0. You must ignore these characters explicitly. I modified the code above to account for this.
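
To illustrate the effect of the '\r' characters, here is a simplified sketch of the double-newline detection on its own. The function count_entry_ends is a hypothetical name introduced only for this illustration; it mirrors the doublenew logic but ignores the word/definition separator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): count how many times two '\n'
 * characters are seen back to back in text. If skip_cr is nonzero, '\r'
 * characters are dropped first, as in the modified read_dict() above. */
static int count_entry_ends(const char *text, int skip_cr) {
    int prev_newline = 0;   /* plays the role of doublenew */
    int count = 0;

    assert(text != NULL);
    for (const char *p = text; *p != '\0'; p++) {
        if (skip_cr && *p == '\r') {
            continue;               /* ignore CR before LF */
        }
        if (*p == '\n') {
            if (prev_newline) {
                count++;            /* second '\n' in a row: entry ends */
                prev_newline = 0;
            } else {
                prev_newline = 1;
            }
        } else {
            prev_newline = 0;       /* any other character, including '\r' */
        }
    }
    return count;
}
```

For the string "A. x\r\n\r\nB. y\r\n\r\n", this function finds no entry end when skip_cr is 0, because a '\r' always sits between the two '\n' characters, and finds two entry ends when skip_cr is 1.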
