mkHun - 1 month ago
Perl Question

\x not working inside the substitution

I'm trying to decode Unicode escape sequences, so I tried using the hexadecimal escape sequence \x{} inside a regex substitution with the e modifier.


use LWP::Simple;
my $k = get("url");

my ($kv) =map{/js_call\(\\"(.+?)\\"\)/} $k;

#now $kv data is https://someurl/call.pl?id=15967737\u0026locale=en-GB\u0026mkhun=ccce

$kv=~s/\\u(.{4})/"\x{$1}"/eg;


I'm trying to replace every \uXXXX escape with the corresponding Unicode character.

My expected output is:

https://someurl/call.pl?id=15967737&locale=en-GB&mkhun=ccce


The print statement below gives the expected output, but the regex does not seem to work properly.

print "\x{0026}";

Answer

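The problem with s/\\u(.{4})/"\x{$1}"/e is that the backslash escape \x{$1} is processed when the string literal is compiled, before $1 has a value. Because the text $1 is not a valid hexadecimal number, the escape yields a NUL character: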

$ perl -E 'printf "%vX\n", "\x{$1}"'
0

If we escape the backslash in front of x (s/\\u(.{4})/"\\x{$1}"/ge), we get a string containing the literal escape sequence, but still not the desired Unicode character:

use feature qw(say);
$kv = '\u0026';
$kv =~ s/\\u(.{4})/"\\x{$1}"/ge;
say $kv; 

The output is now:

\x{0026}

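However, the following works, because hex converts the captured digits to a number at run time and chr returns the character with that code point: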

$kv =~ s/\\u(.{4})/chr hex $1/ge;

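Alternatively, use the ee modifier, which evaluates the replacement twice: the first pass builds the quoted string "\x{0026}", and the second pass evaluates that string as Perl code: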

$kv =~ s/\\u(.{4})/"\"\\x{$1}\""/gee;
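The chr hex approach above can be wrapped in a small helper, as in the sketch below. The sub name decode_u_escapes is hypothetical and not part of the original answer. The sketch also assumes that only four hexadecimal digits should be accepted after \u, so it uses [0-9A-Fa-f]{4} in place of .{4}. It does not combine surrogate pairs into a single character.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): decode \uXXXX escapes.
sub decode_u_escapes {
    my ($text) = @_;
    # Accept exactly four hex digits so hex() never receives invalid input;
    # anything else after \u is left untouched.
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr hex $1/ge;
    return $text;
}

# Sample input taken from the question.
my $kv = 'https://someurl/call.pl?id=15967737\u0026locale=en-GB\u0026mkhun=ccce';
print decode_u_escapes($kv), "\n";
# prints https://someurl/call.pl?id=15967737&locale=en-GB&mkhun=ccce
```

Restricting the capture to hex digits matters even more for the ee variant, since that form evaluates the captured text as Perl code.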