Does the C standard require that compilers be able to deal with files not encoded as ascii? Specifially, I am wondering whether utf-8 files are standards compliant. Does the answer to the previous question differ between C89, C99 and C11?
Assuming that it is legal to use characters from outside of ASCII in C source files, which usages are legal?
I can think of a few distinct use cases:
// Print out the © notice
cont char my©Notice = "This program is © 2016 ACME INC";
It's implementation defined, and thus not regulated by the standard.
I know of at least one compiler, namely
clang, that requires the source to be UTF-8. But other compilers might use other requirements, or not allow it.
Since C99, identifiers are allowed to contain multi-byte characters, but before C99 it would be an extension to allow non-basic characters there. C11 expanded the set of allowed characters.
There's some additional restrictions on what characters are allowed in identifiers, and © is not in the list. It's listed in appendix D. These are Unicode points, but that doesn't strictly mean the encoding in the file has to be unicode-based.
Ranges of characters allowed
Ranges of characters disallowed initially