In .NET, why isn't it true that:
Unicode (not UTF-8 specifically) allows the same character to be represented by more than one code point sequence.
So when you convert bytes to a string and back, the resulting bytes may encode a different (for example, canonical) form of the same text.
Some Unicode sequences are considered equivalent because they represent the same character. For example, the following are considered equivalent because any of these can be used to represent "ắ":
"\u1EAF" "\u0103\u0301" "\u0061\u0306\u0301"
However, ordinal (that is, binary) comparisons consider these sequences different because they contain different Unicode code values. Before performing ordinal comparisons, applications must normalize these strings to decompose them into their basic components.
That page also comes with a nice sample showing how the different normalization forms behave.
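A minimal C# sketch of the point above: the three representations of "ắ" compare unequal ordinally, become identical after NFC normalization, and occupy different numbers of UTF-8 bytes, which is why a bytes-to-string-to-bytes round trip can change the bytes if normalization happens anywhere in between.

```csharp
using System;
using System.Text;

class NormalizationDemo
{
    static void Main()
    {
        // Three canonically equivalent representations of "ắ"
        string precomposed = "\u1EAF";             // LATIN SMALL LETTER A WITH BREVE AND ACUTE
        string decomposed  = "\u0061\u0306\u0301"; // a + COMBINING BREVE + COMBINING ACUTE ACCENT

        // Ordinal comparison sees different code values:
        Console.WriteLine(string.Equals(precomposed, decomposed,
            StringComparison.Ordinal)); // False

        // After NFC normalization the sequences are identical:
        Console.WriteLine(string.Equals(
            precomposed.Normalize(NormalizationForm.FormC),
            decomposed.Normalize(NormalizationForm.FormC),
            StringComparison.Ordinal)); // True

        // The UTF-8 encodings differ in length, so normalizing
        // mid-round-trip changes the bytes:
        Console.WriteLine(Encoding.UTF8.GetBytes(decomposed).Length);  // 5
        Console.WriteLine(Encoding.UTF8.GetBytes(precomposed).Length); // 3
    }
}
```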