Submersed24 Submersed24 - 22 days ago 4
C Question

Program will not run - letter frequency of book calculator

I am trying to find the word frequency and I cannot figure out why the program keeps crashing.. I have tried so many ways, yet I am still getting the same crash. Am I doing something wrong? *The input is a book file and a new file I want to create.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main( int argc, char *argv[] )
{
FILE *fp,*fp2;
int ch, total, counter, totalcounter, i;
int letters[25], letterfrequency[25];
for(i=0; i<25; i++)
{
letters[i] = 0;
letterfrequency[i] = 0;
}
printf("Opening: %s", argv[1]);
fp = fopen(argv[1], "r");
if (!fp)
{
perror("fopen");
exit(1);
}
while((ch=fgetc(fp)) != EOF)
{
if((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z'))
{
counter = toupper(ch);
ch = counter - 65;
letters[ch]++;
totalcounter++;
}
}

fp2 = fopen(argv[2], "w");
for(i=0;i<25;i++)
{
fprintf(fp2, "%c: Times used: %s\tFrequency Used: %s", i+65, letters[i], letters[i]/totalcounter);
}
fclose(fp);
fclose(fp2);
return 0;
}

Answer

Your code has a number of problems. First is that it has warnings. These warnings are pointing at problems. Unfortunately most C compilers won't show you warnings by default. You have to turn them on with -Wall. But -Wall doesn't mean "all warnings", oh no. If you ask why, the short answer is that C is an accumulation of decades of questionable design choices, get used to it. :-/ I run with even more warnings and checks: -Wall -Wwrite-strings -Wextra -Wconversion -std=c99 -pedantic -g.

There's a bunch of unused and uninitialized variables, your printf specification is wrong, and you're missing an include for toupper. I'll leave you to fix all that.


Then the next problem is you've initialize letters to be 25 long, but the alphabet has 26 characters. Fortunately you're also only iterating over them 25 times, but that means you'll lose 'Z'. This is an easy mistake to make. A 26 item long array goes from 0 to 25, but its length is 26.

Rather than repeat the length of the array all over the place, and probably miss one, it's better to define it in one place.

#define NUM_LETTERS 26

Then there's a faster and simpler way to initialize an array.

int letters[NUM_LETTERS] = {0};

There's no need to specify every element, C will fill in the rest with the last element.


Since it's totally safe to call toupper on something that's not a character, it just returns the character unchanged, you can simplify your while loop.

while((ch=fgetc(fp)) != EOF) {
    ch = toupper(ch);
    if( 'A' <= ch && ch <= 'Z' ) {
        ch -= 65;
        letters[ch]++;
        totalcounter++;
    }
}

Note the style I used, 'A' <= ch && ch <= 'Z'. This makes it easier to see that it's a check for ch inside the range from A to Z.


The next problem is this: letters[i]/totalcounter.

In C, if you divide two integers you get an integer. That means 20/100 is 0. If you want a decimal, you have to cast one of the variables involved to a floating point type: (double)letters[i]/totalcounter).


There's the possibility that no letters were read from the file and totalcounter is 0. If so letters[i]/totalcounter will cause a divide-by-zero error. So you have to check for that case.

if( totalcounter != 0 ) {
    for( i=0; i<NUM_LETTERS; i++ ) {
        printf("%c: Times used: %d\tFrequency Used: %f\n",
               i+65, letters[i], (double)letters[i]/totalcounter
        );
    }
}
else {
    printf("No letters found.\n");
}

You fail to check if the user gave you a filename argument. If they don't, your program will crash. It's important to add a usage check.

if( argc < 2 ) {
    fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
    exit(1);
}

And note that I'm not writing the results to a file, and I have my version only taking one file, the one to read. My version outputs to stdout.

In general, it's better to print program results to stdout than to a file. This allows the program to work well with shell piping which makes it much more flexible.

./wordcount somefile                      # output to the screen
./wordcount somefile.txt > somefile.count # output to a file
./wordcount somefile.txt | program        # output to another program