Mark - 1 month ago
C# Question

String Comparison, .NET and non-breaking space

I have an app written in C# that does a lot of string comparison. The strings are pulled in from a variety of sources (including user input) and are then compared. However, I'm running into problems when comparing a space (character code 32) to a non-breaking space (character code 160). To the user they look the same, so they expect a match. But when the app does the compare, there is no match.

What is the best way to go about this? Am I going to have to go to all parts of the code that do a string compare and manually normalize non-breaking spaces to spaces? Does .NET offer anything to help with that? (I've tried all the compare options but none seem to help.)

It has been suggested that I normalize the strings upon receipt and then let the string compare method simply compare the normalized strings. I'm not sure that would be straightforward, because what is a normalized string in the first place? What do I normalize it to? Sure, for now I can convert non-breaking spaces to ordinary spaces. But what else can show up? Can there potentially be very many of these rules? Might they even conflict? (In one case I want to use a rule and in another I don't.)

Answer

If it were me, I would 'normalize' the strings as I 'pulled them in'; probably with a string.Replace(). Then you won't need to change your comparisons anywhere else.
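
Something along these lines is what I have in mind. It's only a minimal sketch: the class and method names (InputNormalizer, NormalizeSpaces) are made up for illustration, and it assumes the non-breaking space (U+00A0, decimal 160) is the only character you need to fold for now.

```csharp
using System;

public static class InputNormalizer
{
    // Hypothetical helper: call it once at each point where a string
    // enters the app (user input, file, database, ...).
    public static string NormalizeSpaces(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        // Replace non-breaking space (U+00A0, decimal 160)
        // with an ordinary space (U+0020, decimal 32).
        return input.Replace('\u00A0', ' ');
    }
}
```

Once every incoming string has gone through this, an ordinary comparison such as string.Equals(a, b, StringComparison.Ordinal) treats the two kinds of space as the same character, because only one of them is left.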

Edit: Mark, that's a tough one. It's really up to you, or your clients, as to what counts as a 'normalized' string. I've been in a similar situation where the customer demanded that strings like:

I have 4 apples.
I have four apples.

were actually equal. You may need separate normalizers for different situations. Either way, I would still do the normalization upon retrieval of the original strings.
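
If you do end up with separate normalizers, one way to keep them manageable is to treat each rule set as a function and pass the one you want into the comparison. This is a rough sketch, not a finished design: Normalizers, SpacesOnly, Loose and AreEqual are hypothetical names, and the rules inside Loose are just an example of a broader rule set. It does not attempt the '4' versus 'four' case, which would need custom rules of its own.

```csharp
using System;
using System.Text;

public static class Normalizers
{
    // Narrow rule set: only folds non-breaking spaces (U+00A0) to plain spaces.
    public static readonly Func<string, string> SpacesOnly =
        s => s.Replace('\u00A0', ' ');

    // Broader rule set (an assumption about what "loose" might mean):
    // Unicode compatibility normalization (form KC), which also folds
    // U+00A0 to a plain space, then trimming and lower-casing.
    public static readonly Func<string, string> Loose =
        s => s.Normalize(NormalizationForm.FormKC).Trim().ToLowerInvariant();

    // Compare two strings after applying the chosen rule set to both.
    public static bool AreEqual(string a, string b, Func<string, string> normalize) =>
        string.Equals(normalize(a), normalize(b), StringComparison.Ordinal);
}
```

A caller then picks the rule set per situation, for example Normalizers.AreEqual(x, y, Normalizers.SpacesOnly) where case matters and Normalizers.AreEqual(x, y, Normalizers.Loose) where it doesn't. Be aware that form KC changes more than spaces (ligatures and full-width characters, for instance), so only use it where that is acceptable.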