Java Question

Remove non-ASCII characters from String in Java

I have a URI that contains strange characters like:


How can I remove "�" from this URI?


I'm guessing that the source of the URL is more at fault. Perhaps you're fixing the wrong problem? Removing "strange" characters from a URI might give it an entirely different meaning.

With that said, you may be able to remove all of the non-ASCII characters with a simple string replacement:

String fixed = original.replaceAll("[^\\x20-\\x7e]", "");
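For context, here is a minimal, self-contained sketch of that replacement. The class name `AsciiFilter`, the helper `stripNonPrintableAscii`, and the sample URI are made up for illustration and are not from the question. Note that the range `\x20-\x7e` covers only printable ASCII, so control characters such as tabs and newlines are removed as well.

```java
public class AsciiFilter {

    // Hypothetical helper: keeps only printable ASCII (space through tilde).
    // Everything else, including control characters, is dropped.
    static String stripNonPrintableAscii(String input) {
        return input.replaceAll("[^\\x20-\\x7e]", "");
    }

    public static void main(String[] args) {
        // Assumed sample input: an accented letter and U+FFFD ("�") inside a URI.
        String original = "http://example.com/caf\u00e9?q=\ufffdtest";
        String fixed = stripNonPrintableAscii(original);
        System.out.println(fixed); // prints http://example.com/caf?q=test
    }
}
```

As the output shows, the accented letter is silently dropped too, which is the kind of change in meaning the warning above is about.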

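Or, if that strips too much, you can keep every character in the Basic Multilingual Plane and remove only the characters that need four bytes in UTF-8. Note that "�" (U+FFFD) is itself inside that range, so this variant leaves it in place: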

String fixed = original.replaceAll("[^\\u0000-\\uFFFF]", "");
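A rough sketch of how this second pattern behaves, assuming a made-up class `SupplementaryFilter`, a hypothetical helper `stripSupplementary`, and an invented sample URI. Java's regex engine matches by code point, so a surrogate pair in the input is treated as one character above U+FFFF and removed as a unit.

```java
public class SupplementaryFilter {

    // Hypothetical helper: removes code points above U+FFFF (those encoded
    // as four bytes in UTF-8) and keeps everything in the range U+0000..U+FFFF.
    static String stripSupplementary(String input) {
        return input.replaceAll("[^\\u0000-\\uFFFF]", "");
    }

    public static void main(String[] args) {
        // Assumed sample input: U+1F600 (written as a surrogate pair) plus U+FFFD.
        String original = "http://example.com/\uD83D\uDE00path\ufffd";
        String fixed = stripSupplementary(original);

        // The emoji is gone, but U+FFFD is still the last character.
        System.out.println(fixed.length());                       // prints 24
        System.out.println(fixed.endsWith("\ufffd"));             // prints true
    }
}
```

If the goal is specifically to get rid of "�", the printable-ASCII pattern (or a direct `replace("\ufffd", "")`) is the one that actually removes it.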