styfle styfle - 29 days ago 10
Javascript Question

How do I decode a string with escaped unicode?

I'm not sure what this is called so I'm having trouble searching for it. How can I decode a string with unicode from

http\u00253A\u00252F\u00252Fexample.com
to
http://example.com
with JavaScript? I tried unescape, decodeURI, and decodeURIComponent, so I guess the only thing left is string replace.

EDIT: The string is not typed as a literal; it is a substring taken from another piece of code, so the backslashes are real characters in the string. To reproduce the problem you have to start with something like this:

var s = 'http\\u00253A\\u00252F\\u00252Fexample.com';


I hope that shows why unescape() doesn't work.
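To make the difference concrete, here is a small sketch comparing a typed literal with the escaped-backslash version from the edit. The variable names typed and raw are only for illustration.

```javascript
// Typed literal: the parser turns \u0025 into "%" before the code runs.
const typed = 'http\u00253A\u00252F\u00252Fexample.com';
// Escaped backslash: the string really contains the six characters \ u 0 0 2 5.
const raw = 'http\\u00253A\\u00252F\\u00252Fexample.com';

console.log(typed);                      // http%3A%2F%2Fexample.com
console.log(decodeURIComponent(typed));  // http://example.com
console.log(raw);                        // http\u00253A\u00252F\u00252Fexample.com
console.log(decodeURIComponent(raw));    // unchanged: there is no %XX sequence to decode
```

In the second case the decoding functions never see a percent sign, which is why none of them changes the string.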

Answer

This is a Unicode-escaped, percent-encoded string. The string was percent-encoded (escaped) first, and then each % was written as the Unicode escape \u0025. To convert it back to normal:

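var x = "http\\u00253A\\u00252F\\u00252Fexample.com";
// Match a literal \u followed by exactly four hex digits; capture the digits.
var r = /\\u([\da-f]{4})/gi;
// Replace each escape with the character it encodes (here \u0025 becomes %).
x = x.replace(r, function (match, grp) {
    return String.fromCharCode(parseInt(grp, 16));
});
// Decode the remaining percent-escapes (%3A becomes :, %2F becomes /).
x = unescape(x);
console.log(x); // http://example.com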

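To explain: I use a regular expression to look for \u0025, that is, a literal \u followed by the four characters of its hexadecimal code. However, since I need only a part of the match for my replace operation, I use parentheses to isolate the part I'm going to reuse, 0025. This isolated part is called a group. The 3A that follows is not part of the match; it stays in the string and ends up right after the decoded %, giving %3A.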

The gi part at the end of the expression denotes it should match all instances in the string, not just the first one, and that the matching should be case insensitive. This might look unnecessary given the example, but it adds versatility.
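A quick way to see what the expression matches and what each flag contributes is sketched below. It assumes a hex-only character class ([\da-f]), which is a tighter pattern than strictly required by the example, and the variable names are made up for the demonstration.

```javascript
const input = 'http\\u00253A\\u00252F\\u00252Fexample.com';
const pattern = /\\u([\da-f]{4})/gi;

// matchAll requires the g flag; each result holds the full match and its groups.
const found = [...input.matchAll(pattern)].map((m) => ({ match: m[0], group: m[1] }));
console.log(found.length);   // 3
console.log(found[0].group); // 0025

// Without g, replace stops after the first match.
const firstOnly = input.replace(/\\u([\da-f]{4})/i, '%');
console.log(firstOnly);      // http%3A\u00252F\u00252Fexample.com

// Without i, uppercase hex digits such as the E in \u00E9 are not matched.
console.log(/\\u([\da-f]{4})/.test('caf\\u00E9'));  // false
console.log(/\\u([\da-f]{4})/i.test('caf\\u00E9')); // true
```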

Now, to convert from one string to the next, I need to execute some steps on each group of each match, and I can't do that with a plain replacement string. Helpfully, the String.replace operation can accept a function, which will be executed for each match. The return value of that function will replace the match itself in the string.
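As a rough illustration of the function form of replace, the sketch below runs only the first step and prints the intermediate result. The names escaped, percentEncoded, and hex are placeholders chosen for this example.

```javascript
const escaped = 'http\\u00253A\\u00252F\\u00252Fexample.com';

// The callback receives the full match first, then one argument per group.
const percentEncoded = escaped.replace(/\\u([\da-f]{4})/gi, (match, hex) => {
    const codeUnit = parseInt(hex, 16);    // '0025' -> 37
    return String.fromCharCode(codeUnit);  // 37 -> '%'
});

console.log(percentEncoded); // http%3A%2F%2Fexample.com
```

At this point the string is ordinary percent-encoding, which the built-in decoding functions can handle.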

I use the second parameter this function accepts, which is the group I need, and convert it from hexadecimal to the character it encodes (here %). Then I use the built-in unescape function to decode the resulting percent-encoded string to its proper form.
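Since unescape is deprecated, the same two steps could also be wrapped in a helper that uses decodeURIComponent instead. This is only a sketch: decodeEscapedUnicode is a hypothetical name, and falling back to the partially decoded string when decodeURIComponent throws is an assumption about what a caller would want. Note also that decodeURIComponent reads %XX sequences as UTF-8 bytes, while unescape reads each one as a single character code, so the two can differ on non-ASCII input.

```javascript
// decodeEscapedUnicode is a hypothetical helper name, not part of any library.
function decodeEscapedUnicode(text) {
    // Step 1: turn each literal \uXXXX escape into the character it encodes.
    const percentEncoded = text.replace(/\\u([\da-f]{4})/gi, (match, hex) =>
        String.fromCharCode(parseInt(hex, 16)));
    // Step 2: decode percent-escapes. decodeURIComponent throws a URIError on
    // malformed sequences, so fall back to the step 1 result in that case.
    try {
        return decodeURIComponent(percentEncoded);
    } catch (e) {
        return percentEncoded;
    }
}

console.log(decodeEscapedUnicode('http\\u00253A\\u00252F\\u00252Fexample.com'));
// http://example.com
console.log(decodeEscapedUnicode('100\\u0025 sure'));
// 100% sure (the lone % is malformed percent-encoding, so the fallback is used)
```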
