Nightfighter001 - 6 months ago
Java Question

Replace Unicode escapes with the corresponding character

I'm trying to convert Unicode escapes, such as \u00FC, to the characters they represent.

import javax.swing.JOptionPane;

public class Test {
    public static void main(String[] args) {
        String in = JOptionPane.showInputDialog("Write something in here");
        System.out.println("Input: " + in);
        // Do something before this line
        String out = in;
        System.out.print("And Now: " + out);
    }
}


An example to explain what I mean:

First Console line:
Input: Hall\u00F6


Second Console line:
And Now: Hallö


EDIT: The Trombone Willy's answer sometimes failed when the input contained more than one Unicode escape, so here is the fixed code:

public static String unescapeUnicode(String s) {
    StringBuilder r = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        if (s.length() >= i + 6 && s.substring(i, i + 2).equals("\\u")) {
            r.append(Character.toChars(Integer.parseInt(s.substring(i + 2, i + 6), 16)));
            i += 5;
        } else {
            r.append(s.charAt(i));
        }
    }
    return r.toString();
}
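To tie this back to the program at the top, here is a minimal sketch of how the method could be wired in. The class name UnescapeDemo and the hard-coded sample string are assumptions made for illustration; in the original program the input would come from JOptionPane.showInputDialog instead.

```java
public class UnescapeDemo {
    // Same logic as the method above, repeated so this sketch compiles on its own.
    public static String unescapeUnicode(String s) {
        StringBuilder r = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (s.length() >= i + 6 && s.substring(i, i + 2).equals("\\u")) {
                r.append(Character.toChars(Integer.parseInt(s.substring(i + 2, i + 6), 16)));
                i += 5;
            } else {
                r.append(s.charAt(i));
            }
        }
        return r.toString();
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the dialog input (an assumption for this sketch).
        // The doubled backslash puts a literal backslash into the string.
        String in = "Hall\\u00F6 W\\u00F6rld";
        System.out.println("Input: " + in);
        String out = unescapeUnicode(in);
        System.out.print("And Now: " + out);
    }
}
```

Note that the method still throws a NumberFormatException if the four characters after the backslash and u are not valid hex digits.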

Answer

Joao's answer is probably the simplest, but this function can help if you would rather not download the Apache jar, whether for space, portability, or licensing reasons. Since it does far less than the library, I think it should also be faster. Here it is:

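public static String unescapeUnicode(String s) {
    StringBuilder sb = new StringBuilder();

    // Start of the literal text that has not been copied yet.
    int oldIndex = 0;

    // Stop once fewer than six characters remain: a full escape cannot fit.
    for (int i = 0; i + 6 <= s.length(); i++) {
        if (s.substring(i, i + 2).equals("\\u")) {
            // Copy the literal text that precedes this escape.
            sb.append(s.substring(oldIndex, i));
            int codePoint = Integer.parseInt(s.substring(i + 2, i + 6), 16);
            sb.append(Character.toChars(codePoint));

            // Jump to the last hex digit; the loop's i++ then moves past it.
            i += 5;
            oldIndex = i + 1;
        }
    }

    // Copy whatever follows the last escape (the whole string if there was none).
    sb.append(s.substring(oldIndex));

    return sb.toString();
}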

I hope this helps! (You don't have to give me credit for this; I release it into the public domain.)
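If the input may contain malformed escapes such as \uZZZZ or a truncated \u00, the hand-written loops can throw a NumberFormatException. One possible alternative, sketched here with java.util.regex rather than the index-based approach above, only touches sequences that have exactly four hex digits and leaves everything else as it was. The class name UnicodeUnescaper and the sample strings are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescaper {
    // A backslash, the letter u, then exactly four hex digits (captured).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    public static String unescapeUnicode(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Four hex digits always fit into one UTF-16 code unit.
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded '$' or backslash literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(c)));
        }
        // Copy the text after the last match (or everything if none matched).
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapeUnicode("Hall\\u00F6"));
        // The malformed second escape is left untouched.
        System.out.println(unescapeUnicode("100\\u0025 \\uZZZZ"));
    }
}
```

StringBuffer is used instead of StringBuilder so the sketch also compiles on Java 8, where Matcher.appendReplacement only accepts a StringBuffer.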