while ((c = getchar()) != EOF)
"This value is called EOF, for "end of file". We must declare c to be
a type big enough to hold EOF in addition to any possible char.
Therefore we use int."
It has been explained in other answers before, but sometimes it is harder to find the duplicate than to give the answer.
The plain char type can be signed or unsigned. getchar() returns either EOF or "…obtains that character as an unsigned char converted to an int…" (quoting the standard for fgetc(), but it applies to getchar() too).
If you have an unsigned plain char type, then the assignment will generate a value 0..255, which will then be promoted to int for the comparison with EOF. EOF is negative and none of the values 0..255 is, so the comparison never matches EOF, and the loop won't stop until you terminate the program by some other means (interrupt, reboot, …).
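A minimal sketch of the unsigned case. It uses an explicit unsigned char so the result does not depend on how the platform treats plain char; the function name is made up for illustration and is not from the question.

```c
#include <stdio.h>

/* Hypothetical helper: mimics `c = getchar()` at end of input when c is
 * an unsigned char, then repeats the comparison the loop would make. */
int unsigned_char_sees_eof(void)
{
    /* EOF is negative, so the conversion wraps it to a value in 0..UCHAR_MAX. */
    unsigned char c = (unsigned char)EOF;

    /* c is promoted to a non-negative int here, so it cannot equal EOF. */
    return c == EOF;
}
```

The function returns 0 on any implementation where int is wider than char, which is why the loop never sees end of input. A compiler may warn that the comparison is always false; that warning is the bug being reported.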
If you have a signed plain char type, then the assignment will treat both one valid character (often ÿ, U+00FF LATIN SMALL LETTER Y WITH DIAERESIS, if you are using a single-byte code set such as ISO 8859-15) and EOF as marking EOF, so the loop may terminate prematurely on some files.
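The signed case can be sketched the same way. The helper below is hypothetical; it takes a value of the kind getchar() returns, stores it in a signed char, and reports whether the loop test would then mistake it for EOF. Converting a value above SCHAR_MAX to signed char is implementation-defined, so the 0xFF result described afterwards is an assumption about typical platforms, not a guarantee.

```c
#include <stdio.h>

/* Hypothetical helper: `returned` stands in for what getchar() gave back,
 * i.e. EOF or a byte value in 0..UCHAR_MAX. */
int signed_char_looks_like_eof(int returned)
{
    /* Narrowing conversion; implementation-defined for values > SCHAR_MAX. */
    signed char c = (signed char)returned;

    /* c is promoted back to int, keeping its (possibly negative) value. */
    return c == EOF;
}
```

On a typical platform (8-bit char, two's complement, EOF defined as -1), both signed_char_looks_like_eof(EOF) and signed_char_looks_like_eof(0xFF) return 1, while signed_char_looks_like_eof('A') returns 0. The byte 0xFF and real end of input have become indistinguishable.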
So, depending on the machine, the loop:
    char c;
    while ((c = getchar()) != EOF)
        ;
may either be an infinite loop or it may terminate before EOF for some data files. Neither is correct behaviour, and neither behaviour is a crash. (The code in the question won't crash.) Changing the type of c to int fixes both problems reliably and portably.
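For comparison, here is a sketch of the loop with c declared as int. It is wrapped in a hypothetical count_bytes() function that reads from a FILE * with getc() so it can be tried on any stream; getchar() is equivalent to getc(stdin).

```c
#include <stdio.h>

/* Counts the bytes remaining on a stream. c is an int, so every byte
 * value 0..UCHAR_MAX stays distinct from the negative EOF. */
long count_bytes(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;

    return n;
}
```

Calling count_bytes(stdin) gives the same loop as in the question, and a 0xFF byte in the input is counted like any other byte instead of ending the loop.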
Note that if you are working with a UTF-8 locale, you will not generate the hex 0xFF byte; that is not a valid byte in UTF-8 (U+00FF is encoded as two bytes 0xC3 0xBF in UTF-8).
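The two-byte encoding quoted above can be checked with a few lines of arithmetic. This is only a sketch for code points in the range U+0080..U+07FF, and the function name is invented for the example.

```c
/* Hypothetical helper: encodes a code point in U+0080..U+07FF as the
 * two-byte UTF-8 sequence 110xxxxx 10xxxxxx. No range checking is done. */
void utf8_encode_2byte(unsigned cp, unsigned char out[2])
{
    out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* top 5 bits of the code point */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* low 6 bits of the code point */
}
```

For cp = 0xFF this yields 0xC3 followed by 0xBF, so the character ÿ never appears as a lone 0xFF byte in UTF-8 input.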