char c = 0xff;
bool b = 0xff == c;
// Under most C/C++ compilers' default options, b is FALSE!!!
Historical reasons, mostly.
Expressions of type char are promoted to int in most contexts (because a lot of CPUs don't have 8-bit arithmetic operations). On some systems, sign extension is the most efficient way to do this, which argues for making plain char signed.
On the other hand, the EBCDIC character set has basic characters with the high-order bit set (i.e., characters with values of 128 or greater); on EBCDIC platforms, char pretty much has to be unsigned.
The ANSI C Rationale (for the 1989 standard) doesn't have a lot to say on the subject; section 3.1.2.5 says:
Three types of char are specified: signed, plain, and unsigned. A plain char may be represented as either signed or unsigned, depending upon the implementation, as in prior practice. The type signed char was introduced to make available a one-byte signed integer type on those systems which implement plain char as unsigned. For reasons of symmetry, the keyword signed is allowed as part of the type name of other integral types.
Going back even further, an early version of the C Reference Manual from 1975 says:
A char object may be used anywhere an int may be. In all cases the char is converted to an int by propagating its sign through the upper 8 bits of the resultant integer. This is consistent with the two’s complement representation used for both characters and integers. (However, the sign-propagation feature disappears in other implementations.)
This description is more implementation-specific than what we see in later documents, but it does acknowledge that char may be either signed or unsigned. On the "other implementations" on which "the sign-propagation disappears", the promotion of a char object to int would have zero-extended the 8-bit representation, essentially treating it as an 8-bit unsigned quantity. (The language didn't yet have the signed or unsigned keywords.)
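
To make the two promotion behaviours concrete, here is a minimal sketch in modern C. The helper names widen_sign_extended and widen_zero_extended are made up for illustration, and the code assumes 8-bit bytes and a typical two's complement implementation, on which converting 0xFF to signed char yields -1 (that conversion is implementation-defined).

```c
#include <assert.h>
#include <limits.h>

/* This sketch assumes 8-bit bytes, as the 1975 manual did. */
static_assert(CHAR_BIT == 8, "sketch assumes 8-bit bytes");

/* Hypothetical helper: widen a byte the way a signed char is promoted.
   The conversion to signed char is implementation-defined for values
   above SCHAR_MAX; on typical two's complement systems 0xFF becomes -1. */
int widen_sign_extended(unsigned char byte)
{
    signed char sc = (signed char)byte;
    return sc;              /* the sign bit is propagated into the upper bits */
}

/* Hypothetical helper: widen a byte the way an unsigned char is promoted. */
int widen_zero_extended(unsigned char byte)
{
    return byte;            /* the upper bits are zero; the value is preserved */
}
```

On such a system, widen_sign_extended(0xff) gives -1 while widen_zero_extended(0xff) gives 255; for bytes below 128 the two agree.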
C's immediate predecessor was a language called B. B was a typeless language, so the question of char being signed or unsigned did not apply. For more information about the early history of C, see the late Dennis Ritchie's home page, which has since moved.
As for what's happening in your code (applying modern C rules):
char c = 0xff;
bool b = 0xff == c;
If plain char is unsigned, then the initialization of c sets it to (char)0xff, which compares equal to 0xff in the second line. But if plain char is signed, then 0xff (an expression of type int) is converted to char, and since 0xff exceeds CHAR_MAX (assuming CHAR_BIT==8), the result is implementation-defined. In most implementations, the result is -1. In the comparison 0xff == c, both operands are converted to int, making it equivalent to 0xff == -1, or 255 == -1, which is of course false.
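
If you need the comparison to behave the same way regardless of the signedness of plain char, one common approach is to convert to unsigned char before comparing. The following is a sketch, not the only fix; the function names are hypothetical, and it assumes CHAR_BIT == 8 so that 0xff is exactly UCHAR_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(CHAR_BIT == 8, "0xff is only UCHAR_MAX with 8-bit bytes");

/* True if plain char is signed on this implementation. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The comparison from the question, with the promotion spelled out. */
bool naive_equals_0xff(char c)
{
    int promoted = c;       /* -1 if char is signed, 255 if it is unsigned */
    return 0xff == promoted;
}

/* Portable variant: force the unsigned interpretation of the byte first. */
bool byte_equals_0xff(char c)
{
    return 0xff == (unsigned char)c;   /* (unsigned char)-1 is UCHAR_MAX */
}
```

With char c = (char)0xff, byte_equals_0xff(c) is true on either kind of implementation, whereas naive_equals_0xff(c) is true only where plain char is unsigned (assuming the usual conversion result of -1 where it is signed).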
Another important thing to note is that unsigned char, signed char, and (plain) char are three distinct types. char has the same representation as either unsigned char or signed char; it's implementation-defined which one it is. (On the other hand, signed int and int are two names for the same type; unsigned int is a distinct type. (Except that, just to add to the frivolity, it's implementation-defined whether a bit field declared as plain int is signed or unsigned.))
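
One way to see the "three distinct types" point for yourself is a C11 _Generic selection, which requires its listed types to be mutually incompatible. This is a sketch that assumes a C11 (or later) compiler; the enum and the KIND_OF macro are made up for illustration. Listing char, signed char, and unsigned char together compiles precisely because they are distinct; adding a separate signed int entry next to int would be rejected.

```c
#include <assert.h>

/* All three character types occupy one byte, yet they are distinct types. */
static_assert(sizeof(char) == sizeof(signed char) &&
              sizeof(char) == sizeof(unsigned char),
              "character types share a size");

enum kind { KIND_CHAR, KIND_SCHAR, KIND_UCHAR, KIND_INT, KIND_UINT, KIND_OTHER };

/* Hypothetical macro: classify an expression by its type, without promotion. */
#define KIND_OF(x) _Generic((x),        \
    char:          KIND_CHAR,           \
    signed char:   KIND_SCHAR,          \
    unsigned char: KIND_UCHAR,          \
    int:           KIND_INT,            \
    unsigned int:  KIND_UINT,           \
    default:       KIND_OTHER)

enum kind kind_of_plain_char(void)  { char c = 'x';        return KIND_OF(c);  }
enum kind kind_of_signed_char(void) { signed char sc = 0;  return KIND_OF(sc); }
enum kind kind_of_signed_int(void)  { signed int i = 0;    return KIND_OF(i);  }
```

Here kind_of_plain_char() selects the char branch on every implementation, whichever representation char has, while kind_of_signed_int() selects the int branch because signed int and int are the same type.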
Yes, it's all a bit of a mess, and I'm sure it would have been defined differently if C were being designed from scratch today. But each revision of the C language has had to avoid breaking (too much) existing code, and to a lesser extent existing implementations.