Zereges - 2 months ago
C Question

Why do C fundamental types have identifiers with multiple keywords

Apart from the obvious answer ("because the guys designed it that way"), why do C and C++ have fundamental types whose names consist of multiple keywords, e.g.


  • long long (int)

  • short int

  • signed char



I do have some basic knowledge of parsing and have used the flex/bison tools to build a few parsers, and I think this brings much more complexity to parsing type names. Looking at the C++ grammar in the standard, everything about types really is complicated.

I know that C++ (and C too, I believe) does not specify much about the sizes of the fundamental data types, so making types int8_t, uint8_t, etc. would not work (although C++11 did give us fixed-width integers).

So why did the developers of the standard agree on multi-word type identifiers, when they could have made int, uint, and similar?

Answer

Speaking in terms of C, why did the developers of the standard agree on multi-word identifiers? It's because that was what the language had at the time of standardisation.

The mandate for the original standard was not to create a new language but to codify existing practice. As per the C89 standard itself:

The Committee evaluated many proposals for additions, deletions, and changes to the base documents during its deliberations. A concerted effort was made to codify existing practice wherever unambiguous and consistent practice could be identified. However, where no consistent practice could be identified, the Committee worked to establish clear rules that were consistent with the overall flavor of the language.

And, from the C99 rationale document:

The original X3J11 charter clearly mandated codifying common existing practice, and the C89 Committee held fast to precedent wherever that was clear and unambiguous. The vast majority of the language defined by C89 was precisely the same as defined in Appendix A of the first edition of The C Programming Language by Brian Kernighan and Dennis Ritchie, and as was implemented in almost all C translators of the time.

Beyond that, each iteration of the standard has valued backward compatibility highly so that code doesn't break. From that same rationale document:

Existing code is important, existing implementations are not. A large body of C code exists of considerable commercial value. Every attempt has been made to ensure that the bulk of this code will be acceptable to any implementation conforming to the Standard. The C89 Committee did not want to force most programmers to modify their C programs just to have them accepted by a conforming translator.

So, while later versions of the standard gave us things like stdint.h with its fixed-width integral types, taking away the standard ones like int and long would be a gross violation of that guideline.


In terms of C++, it's almost certainly a holdover from the earliest days of that language, when it was put forward as "C with Classes". In fact, the very early cfront C++ compiler was so named because it took C++ source code and turned that into C before giving it to a suitable C compiler (i.e., a front end for C, hence cfront).

This would have allowed the original author Bjarne to minimise the workload in delivering C++ since the bulk of it was already provided by the C compiler itself.


In terms of parsing a language, it's certainly more difficult to have to process unsigned long int x (a) than it is to handle ulong x.

But, given that the compiler already has to handle a large number of optional "modifiers/specifiers" for a variable (e.g., const char * const x), handling a few others is par for the course.


(a) Or int long unsigned x or long unsigned x or any of the other type specifiers that end up becoming the singular unsigned long int type. See here for more details.