Michael Wiles - 4 months ago
Java Question

Convert from encoded unicode String into Java String

I have a string in JSON data that looks like this:

#0023Sat Apr 30 10:46:11 UTC 2016#000a[Interoperability]Interoperability#005c Index=Unknown (R03)#000a[Exif]Shutter#005c Speed#005c Value=1/1999 sec#000a[Exif]Bits#005c Per#005c Sample=8 8 8 bits/component/pixel#000a[Exif]Exposure#005c Bias#005c Value=0 EV#000a[Exif]Sub-Sec#005c Time#005c Original=00#000a


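Each #XXXX sequence is an escaped Unicode character, written as four hex digits (for example, #000a is a line feed and #005c is a backslash).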

How do I convert this into a Java String?

Answer
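import java.util.regex.Matcher;
import java.util.regex.Pattern;

static String decode(String s) {
    Pattern p = Pattern.compile("#([0-9A-Fa-f]{4})");
    Matcher m = p.matcher(s);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        // Parse the four hex digits as a single UTF-16 code unit.
        int c = Integer.parseInt(m.group(1), 16);
        // quoteReplacement keeps '\' (#005c) and '$' (#0024) from being read as replacement syntax.
        m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf((char) c)));
    }
    m.appendTail(sb);
    return sb.toString();
}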

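This assumes that each #XXXX encodes a single UTF-16 code unit. Unicode code points can exceed the 16-bit range of #XXXX, so a character above U+FFFF cannot be written as one escape. If the source writes such a character as a surrogate pair (two consecutive #XXXX escapes), decoding each escape in order still produces the correct Java String.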
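
For reference, here is a self-contained sketch of the same idea that avoids appendReplacement entirely, since that method gives special meaning to backslashes and dollar signs in the replacement string. The class name HashEscapeDecoder and the method name decode are made up for illustration, and the surrogate-pair input in main is an assumption about how the source might encode characters above U+FFFF, not something shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashEscapeDecoder {
    // '#' followed by exactly four hex digits (hypothetical helper, not from the question).
    private static final Pattern ESCAPE = Pattern.compile("#([0-9A-Fa-f]{4})");

    /** Replaces every #XXXX escape in the input with the UTF-16 code unit it names. */
    public static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder(input.length());
        int last = 0;
        while (m.find()) {
            // Copy the literal text between the previous escape and this one.
            out.append(input, last, m.start());
            // Append the decoded code unit directly; no replacement-string escaping is involved.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // A fragment of the sample data: #005c becomes '\' and #000a becomes a line feed.
        System.out.print(decode("[Exif]Shutter#005c Speed#005c Value=1/1999 sec#000a"));

        // Assumed encoding: U+1F600 written as the surrogate pair D83D DE00.
        String s = decode("#d83d#de00");
        System.out.println(s.codePointCount(0, s.length())); // prints 1
    }
}
```

Because the matcher scans the original input rather than the decoded output, a decoded '#' (from #0023) is never re-read as the start of another escape.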
