Abel Morelos - 3 months ago
SQL Question

Escaping special characters for JSON output

I have a column whose data I want to escape so that it can be used in JSON output. To be more precise, I am trying to escape the same characters listed here, but using Oracle 11g: Special Characters and JSON Escaping Rules

I think it can be solved using REGEXP_REPLACE:

SELECT REGEXP_REPLACE(my_column, '("|\\|/)|(' || CHR(9) || ')', '\\\1') FROM my_table;


But I am lost on how to replace the other characters (tab, new line, backspace, etc.). In the previous example I know that \1 refers to the first captured group, but I am not sure how to capture the tab in the second group and then replace it with \t. Could somebody give me a hint about how to do the replacement?

I know I can do this:

SELECT REGEXP_REPLACE( REGEXP_REPLACE(my_column, '("|\\|/)', '\\\1'), '(' || CHR(9) || ')', '\t')
FROM my_table;


But I would have to nest about five calls to REGEXP_REPLACE, and I suspect it can be done in just one or two.

I am aware of other packages and libraries for JSON, but I think this case is simple enough to be solved with the functions that Oracle offers out of the box.

Thank you.

Answer

Here's a start. Replacing the regular characters is easy enough; it's the control characters that are tricky. This method uses a capture group consisting of a character class that contains the characters you want to put a backslash in front of. Note that characters inside the class do not need to be escaped. The fourth argument to REGEXP_REPLACE, 1, means start at the first position, and the fifth, 0, means replace all occurrences found in the source string. One caveat: the class below also contains |, which JSON does not escape (\| is not a valid JSON escape sequence), so remove it from the class if you need strictly valid JSON.

SELECT REGEXP_REPLACE('t/h"is"'||chr(9)||'is a|te\st', '([/\|"])', '\\\1', 1, 0) FROM dual;
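To make the position and occurrence arguments concrete, here is a minimal sketch on a made-up literal ('a/b/c' is not from the question). It only contrasts occurrence 0 (replace every match) with occurrence 1 (replace the first match only); the column aliases are illustrative.

```sql
-- Occurrence 0 replaces every match; occurrence 1 replaces only the first.
-- In the replacement string, \\ is a literal backslash and \1 is the captured character.
SELECT REGEXP_REPLACE('a/b/c', '([/])', '\\\1', 1, 0) AS all_occurrences,
       REGEXP_REPLACE('a/b/c', '([/])', '\\\1', 1, 1) AS first_only
FROM dual;

-- all_occurrences: a\/b\/c
-- first_only:      a\/b/c
```

Since 1 and 0 are also the defaults for these two arguments, leaving them out should give the same result as the all_occurrences column.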

Replacing the TAB and the newline (line feed) is easy enough by wrapping the above in REPLACE calls, but it stinks to have to do this for each control character. So I'm afraid this isn't a full answer; it only helps with the regular characters:

SQL> SELECT REPLACE(REPLACE(REGEXP_REPLACE('t/h"is"'||chr(9)||'is
  2  a|te\st', '([/\|"])', '\\\1', 1, 0), chr(9), '\t'), chr(10), '\n') fixed
  3  FROM dual;

FIXED
-------------------------
t\/h\"is\"\tis\na\|te\\st

SQL>
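For completeness, the same wrapping idea can be carried through all five control characters from the question. This is only a sketch under a few assumptions: the sample literal and the txt and escaped aliases are made up for illustration, and the character class holds just the double quote, backslash and forward slash. The REGEXP_REPLACE has to stay innermost so that existing backslashes are doubled before the REPLACE calls introduce new ones.

```sql
-- Innermost call: put a backslash in front of /, \ and ".
-- Outer calls: swap each control character for its two-character JSON escape.
SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
         REGEXP_REPLACE(txt, '([/\"])', '\\\1'),
         CHR(8),  '\b'),
         CHR(9),  '\t'),
         CHR(10), '\n'),
         CHR(12), '\f'),
         CHR(13), '\r') AS escaped
FROM (SELECT 'a"b\c/d' || CHR(9) || 'e' || CHR(10) || 'f' AS txt FROM dual);

-- escaped: a\"b\\c\/d\te\nf
```

It is still six nested calls, which is exactly what the question hoped to avoid, but it needs nothing beyond REPLACE and REGEXP_REPLACE.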

EDIT: Here's a solution! I don't claim to understand it fully, but basically it builds a translation table (trans_tbl) that is joined to your string (in the inp_str table). The inner CONNECT BY LEVEL query generates one row per character position in the string, and each character is replaced wherever the translation table has a match. The outer CONNECT BY then walks those rows in order while SYS_CONNECT_BY_PATH concatenates them back into a single string. I modified a solution found here: http://database.developer-works.com/article/14901746/Replace+%28translate%29+one+char+to+many which doesn't come with much of an explanation. Hopefully someone here will chime in and explain it fully.

SQL> with trans_tbl(ch_frm, str_to) as (
     select '"',     '\"' from dual union
     select '/',     '\/' from dual union
     select '\',     '\\' from dual union
     select chr(8),  '\b' from dual union -- BS
     select chr(12), '\f' from dual union -- FF
     select chr(10), '\n' from dual union -- NL
     select chr(13), '\r' from dual union -- CR
     select chr(9),  '\t' from dual       -- HT
   ),
   inp_str as (
     select 'No' || chr(12) || 'w is ' || chr(9) || 'the "time" for /all go\od men to '||
     chr(8)||'com' || chr(10) || 'e to the aid of their ' || chr(13) || 'country' txt from dual
   )
   select max(replace(sys_connect_by_path(ch,'`'),'`')) as txt
   from (
   select lvl
    ,decode(str_to,null,substr(txt, lvl, 1),str_to) as ch
    from inp_str cross join (select level lvl from inp_str connect by level <= length(txt))
    left outer join trans_tbl on (ch_frm = substr(txt, lvl, 1))
    )
    connect by lvl = prior lvl+1
    start with lvl = 1;

TXT
------------------------------------------------------------------------------------------
No\fw is \tthe \"time\" for \/all go\\od men to \bcom\ne to the aid of their \rcountry

SQL>
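To see what the two CONNECT BY clauses are doing, it can help to run the pieces separately. The following is a minimal sketch on a made-up three-character literal ('a"b', not from the answer above): the first query is the row generator that yields one row per character position, and the second shows how SYS_CONNECT_BY_PATH strings those rows back together in order.

```sql
-- Step 1: CONNECT BY LEVEL generates one row per character position.
SELECT level AS lvl,
       SUBSTR('a"b', level, 1) AS ch
FROM dual
CONNECT BY level <= LENGTH('a"b');

-- Step 2: chain the rows in position order. Each row's path extends the
-- previous one, so MAX returns the complete string. The backtick is only
-- the path separator and is stripped again by REPLACE.
SELECT MAX(REPLACE(SYS_CONNECT_BY_PATH(ch, '`'), '`')) AS txt
FROM (
  SELECT level AS lvl,
         SUBSTR('a"b', level, 1) AS ch
  FROM dual
  CONNECT BY level <= LENGTH('a"b')
)
START WITH lvl = 1
CONNECT BY lvl = PRIOR lvl + 1;
```

One consequence of the separator trick, as far as I know: SYS_CONNECT_BY_PATH rejects values that contain the separator itself (ORA-30004), so input containing a backtick would need a different separator character.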

EDIT 8/10/2016 - Made it a function for encapsulation and reusability, so you can use it on multiple columns at once:

create or replace function esc_json(string_in varchar2)
return varchar2
is
  s_converted varchar2(4000);
begin
  with trans_tbl(ch_frm, str_to) as (
    -- characters to escape and their JSON escape sequences
    select '"',     '\"' from dual union
    select '/',     '\/' from dual union
    select '\',     '\\' from dual union
    select chr(8),  '\b' from dual union -- BS
    select chr(12), '\f' from dual union -- FF
    select chr(10), '\n' from dual union -- NL
    select chr(13), '\r' from dual union -- CR
    select chr(9),  '\t' from dual       -- HT
  ),
  inp_str(txt) as (
    select string_in from dual
  )
  -- walk the per-character rows in order and concatenate them;
  -- the backtick is only the path separator and is stripped again
  select max(replace(sys_connect_by_path(ch, '`'), '`')) as c_text
  into s_converted
  from (
    -- one row per character position, translated where trans_tbl has a match
    select lvl,
           decode(str_to, null, substr(txt, lvl, 1), str_to) as ch
    from inp_str
    cross join (select level lvl from inp_str connect by level <= length(txt))
    left outer join trans_tbl on (ch_frm = substr(txt, lvl, 1))
  )
  connect by lvl = prior lvl + 1
  start with lvl = 1;

  return s_converted;
end esc_json;
/

Example call for multiple columns at once:

select esc_json(column_1), esc_json(column_2)
from your_table;
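As a possible variation, the concatenation step could be done with LISTAGG instead of SYS_CONNECT_BY_PATH, which removes the need for a separator character. This is only a sketch and rests on assumptions: LISTAGG requires Oracle 11g Release 2 or later (the question only says 11g), and the name esc_json_listagg and the chars subquery are hypothetical, not part of the solution above.

```sql
create or replace function esc_json_listagg(string_in varchar2)
return varchar2
is
  s_converted varchar2(4000);
begin
  with trans_tbl(ch_frm, str_to) as (
    -- same translation table as in esc_json
    select '"',     '\"' from dual union all
    select '/',     '\/' from dual union all
    select '\',     '\\' from dual union all
    select chr(8),  '\b' from dual union all -- BS
    select chr(12), '\f' from dual union all -- FF
    select chr(10), '\n' from dual union all -- NL
    select chr(13), '\r' from dual union all -- CR
    select chr(9),  '\t' from dual           -- HT
  ),
  chars as (
    -- one row per character position of the input
    select level as lvl, substr(string_in, level, 1) as ch
    from dual
    connect by level <= length(string_in)
  )
  -- use the escape sequence where one exists, else the original character,
  -- and glue everything back together in position order
  select listagg(nvl(t.str_to, c.ch)) within group (order by c.lvl)
  into s_converted
  from chars c
  left outer join trans_tbl t on t.ch_frm = c.ch;

  return s_converted;
end esc_json_listagg;
/
```

It is called the same way, for example select esc_json_listagg(column_1) from your_table. Like the original, the result is capped at 4000 bytes, so long values that grow past that limit after escaping would raise an error rather than being truncated.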