SRJ - 2 months ago
C# Question

Is it possible to read ASCII control characters in XML?

I am new to XML and I need to know:

Is it possible to read ASCII control characters in XML? or

Is it possible to replace ASCII control characters in XML?

Answer

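XML 1.1 allows for nearly all Unicode characters other than U+0000 (although most control characters have to be written as character references there), but XML 1.0 has a more restricted character set. From section 2.2 of the XML 1.0 specification (5th edition):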

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

That's the underlying character set - you can't use a CharRef (e.g. &#x1;) or anything similar to introduce characters outside it.
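To make the Char production concrete, here is a minimal sketch of it as a C# predicate over Unicode code points. The class and method names (Xml10Chars, IsXml10Char) are made up for illustration and are not part of any framework API.

```csharp
using System;

static class Xml10Chars
{
    // Hypothetical helper: returns true when the code point matches the
    // XML 1.0 Char production quoted above.
    public static bool IsXml10Char(int codePoint)
    {
        return codePoint == 0x9                                   // tab
            || codePoint == 0xA                                   // line feed
            || codePoint == 0xD                                   // carriage return
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }
}
```

With this, IsXml10Char(0x1) is false (an ASCII control character) while IsXml10Char(0x9) is true (tab is one of the three permitted control characters).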

Unfortunately, XML 1.0 is basically what's in use everywhere; XML 1.1 never really took off. That means you shouldn't try to produce XML documents containing the ASCII control characters - they won't be valid XML documents, although lots of XML APIs will unfortunately let you create them anyway :(

Basically, you should remove the control characters before you pass your data to whichever XML API you're using. If you need to preserve them, you'll need to either create your own escaping, or something similar (e.g. UTF-8-encode the whole text, then represent that in base64... all quite nasty).
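As a rough sketch of both options, the helper below strips everything outside the XML 1.0 character set from a string, and also shows the base64 route for when the control characters have to survive a round trip. The class and method names (XmlTextSanitizer, RemoveInvalidXmlChars, EncodeForXml, DecodeFromXml) are invented for this example. It also assumes that silently dropping the offending characters is acceptable; you may prefer to replace them with a placeholder instead.

```csharp
using System;
using System.Text;

static class XmlTextSanitizer
{
    // Hypothetical helper: returns a copy of the input without any
    // characters that fall outside the XML 1.0 Char production.
    public static string RemoveInvalidXmlChars(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsHighSurrogate(c) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                // A well-formed surrogate pair encodes a code point in
                // #x10000-#x10FFFF, which XML 1.0 allows.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (c == '\t' || c == '\n' || c == '\r'
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD))
            {
                sb.Append(c);
            }
            // Anything else (other control characters, lone surrogates,
            // U+FFFE, U+FFFF) is dropped.
        }
        return sb.ToString();
    }

    // Alternative when the control characters must be preserved:
    // UTF-8-encode the text and store it as base64 in the XML.
    public static string EncodeForXml(string text)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
    }

    // Reverses EncodeForXml after the value has been read back.
    public static string DecodeFromXml(string base64)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }
}
```

You would call RemoveInvalidXmlChars(value) just before handing value to the XML API, or store EncodeForXml(value) as the element text and call DecodeFromXml on the way back in. If you're on .NET 4.0 or later, System.Xml.XmlConvert.IsXmlChar can take the place of the hand-written range check for individual chars.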