Vinay thallam - 5 months ago
Java Question

Replacing double backslashes with single backslash

I have a string "\\u003c" that is supposed to be UTF-8 text. I am unable to decode it to the Unicode character because of the double backslashes. How do I get "\u003c" from "\\u003c"? I am using Java.

I tried:

myString.replace("\\\\", "\\");


but it did not give the result I wanted.

This is my code:

String myString = FileUtils.readFileToString(file);
String a = myString.replace("\\\\", "\\");
byte[] utf8 = a.getBytes();

// Convert from UTF-8 to Unicode
a = new String(utf8, "UTF-8");
System.out.println("Converted string is:"+a);


The content of the file is:


\u003c

Answer

Not sure if you're still looking for a solution (since you already have an accepted answer), but here is another possible approach. Instead of replacing backslashes, match the escape sequence with a regex and convert its four hex digits to a character:

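import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The Java literal "\\u003c" is the six-character text \u003c
String str = "\\u003c";
// Match a backslash, 'u', and four hex digits (case-insensitive)
Matcher m = Pattern.compile("(?i)\\\\u([\\da-f]{4})").matcher(str);
if (m.find()) {
    // Parse the hex digits and cast the resulting code unit to a char
    String a = String.valueOf((char) Integer.parseInt(m.group(1), 16));
    System.out.printf("Unicode String is: [%s]%n", a);
}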

OUTPUT:

Unicode String is: [<]

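The snippet above decodes only the first escape it finds. If the file may contain several \uXXXX sequences mixed with ordinary text, the same regex idea can be extended to replace every match. The following is a minimal sketch, not a definitive implementation: the class name UnicodeUnescape and the method unescape are made up for illustration, and it assumes every escape has exactly four hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescape {
    // A literal backslash, 'u', then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces every \\uXXXX escape in the input
    // with the character it denotes and leaves all other text untouched.
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement stops a decoded '\' or '$' from being
            // interpreted as replacement syntax by appendReplacement.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below is the text: a \u003c b \u003e c
        System.out.println(unescape("a \\u003c b \\u003e c")); // prints "a < b > c"
    }
}
```

Characters outside the Basic Multilingual Plane are normally written as two consecutive escapes (a surrogate pair), so decoding each escape to one char and appending them in order should still produce the right string.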