David Andvett - 1 month ago
C Question

How to do a word count from getline?

I am trying to count the words in each line read with getline, but I keep getting a segmentation fault. You can assume that whitespace is only '\t', '\n', and ' '.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int tokenCount(char *mystring){
    int word=0;
    char *ptr = mystring;
    int i;

    for(i=0; i<strlen(mystring);i++){

        if(ptr[i]!=' ' || ptr[i]!= '\t' || ptr[i]!='\n'){
            word++;

            while(ptr[i]!= ' ' || ptr[i]!= '\t' || ptr[i] != '\n'){

                i++;
            }
        }

    }

    return word;
}

int main (){

    size_t n = 10;
    char *mystring = malloc(10);

    if(mystring==NULL){
        fprintf(stderr, "No memory\n");
        exit(1);
    }

    while(getline(&mystring, &n, stdin)>0){

        printf("%d\n", tokenCount(mystring));
    }

    return 0;
}

Answer
while(ptr[i]!= ' ' || ptr[i]!= '\t' || ptr[i] != '\n'){

So, in English: keep looping while the value at i is not a space character, or is not a tab character, or is not a newline. See the problem? If ptr[i] is 'a', it passes this test because it's not a space (good). But if it's a ' ' (space char), it still passes, because while it's equal to ' ', it's not equal to '\t', so the loop continues (bad). No character can equal all three at once, so the test is always true and the loop never stops on its own; since it increments i, you run off the end of the array the pointer references into unallocated memory and crash.
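
To make that concrete, here is a minimal sketch that isolates the condition from the question and checks it against every possible char value. The helper names original_condition and count_rejected are made up for this illustration; they are not part of the question's code.

```c
#include <limits.h>

/* The loop condition from the question, isolated as a helper.
   (original_condition is a hypothetical name used only for this sketch.) */
int original_condition(char c) {
    return c != ' ' || c != '\t' || c != '\n';
}

/* Counts how many char values make the condition false,
   i.e. how many characters could ever stop the inner loop. */
int count_rejected(void) {
    int rejected = 0;
    int v;

    for (v = CHAR_MIN; v <= CHAR_MAX; v++) {
        if (!original_condition((char)v)) {
            rejected++;
        }
    }
    return rejected;
}
```

count_rejected() comes back as 0: not even the terminating '\0' makes the condition false, so nothing in the string can end the inner loop.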

Fix the test to use &&, not ||, and make sure you haven't reached the end of the string before performing it (also, call strlen once at the beginning and cache the result instead of recomputing it on every iteration):

size_t mystringlen = strlen(mystring);

...

if (ptr[i]!= ' ' && ptr[i]!= '\t' && ptr[i] != '\n') {
    ++word;
    while(i < mystringlen && ptr[i]!= ' ' && ptr[i]!= '\t' && ptr[i] != '\n'){

...
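
Putting those fragments back into the function from the question, a complete version might look like the sketch below. It assumes the rest of tokenCount stays as posted; the switch to const char * and to size_t for the index are my own adjustments, not something the question requires.

```c
#include <stddef.h>
#include <string.h>

/* Counts runs of non-whitespace characters, where whitespace is
   only ' ', '\t' and '\n' (the definition given in the question). */
int tokenCount(const char *mystring) {
    int word = 0;
    size_t mystringlen = strlen(mystring);  /* computed once, not per iteration */
    const char *ptr = mystring;
    size_t i;

    for (i = 0; i < mystringlen; i++) {
        /* Start of a word: the character is none of the three separators. */
        if (ptr[i] != ' ' && ptr[i] != '\t' && ptr[i] != '\n') {
            ++word;
            /* Skip the rest of the word, never reading past the end. */
            while (i < mystringlen && ptr[i] != ' ' && ptr[i] != '\t' && ptr[i] != '\n') {
                i++;
            }
        }
    }
    return word;
}
```

The inner loop stops on the separator that ends the word (or at the end of the string), and the i++ of the for loop then steps past it, so every character is examined at most once.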

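With a slight logic change (it catches more whitespace characters, such as '\r', '\v' and '\f') this could be simplified with isspace from <ctype.h>. Cast the argument to unsigned char, because passing a negative char value to isspace is undefined behavior:

if (!isspace((unsigned char)ptr[i])) {
    ++word;
    while(i < mystringlen && !isspace((unsigned char)ptr[i])){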
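
For reference, a complete isspace-based variant could be sketched as follows. The name tokenCountSpace is hypothetical, chosen only to keep it distinct from the function in the question, and the result depends on what isspace treats as whitespace in the current locale.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same counting idea as the fixed tokenCount, but any character that
   isspace() reports as whitespace separates words.
   (tokenCountSpace is a hypothetical name for this sketch.) */
int tokenCountSpace(const char *mystring) {
    int word = 0;
    size_t mystringlen = strlen(mystring);
    size_t i;

    for (i = 0; i < mystringlen; i++) {
        /* The cast keeps the argument in the range isspace() accepts. */
        if (!isspace((unsigned char)mystring[i])) {
            ++word;
            while (i < mystringlen && !isspace((unsigned char)mystring[i])) {
                i++;
            }
        }
    }
    return word;
}
```

In the default "C" locale this also splits on '\r', '\v' and '\f', which matters if the input has Windows-style line endings.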