Kayla Jin - 4 months ago
JSON Question

Python regex: escape illegal characters only if they are not already escaped

Right now I'm reading files in as strings and formatting them into JSON. The files contain \r and \n characters that I want to escape, but they also contain JSON objects in which those illegal characters are already escaped, like \\r and \\n. So I want to replace \r and \n with \\r and \\n only when they are not already preceded by a \

I have tried the following, but I'm not sure why it doesn't work:

re.sub(r'[^\][\n]', r'\\\\n', s)


Any suggestions would be appreciated!

Answer

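In your pattern, \] escapes the closing bracket, so [^\][\n] is a single character class that matches any one character other than ], [ or a newline. You could do it like this using a regex instead: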

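import re

data = 'Hello\r\nWorld\\r\\n'
print(data)

print('-'*20)

# (?<!\\) matches only when the previous character is not a backslash.
# It consumes nothing, so a line break at the very start of the string
# or right after another line break is still replaced.
data = re.sub(r'(?<!\\)\r', r'\\r', data)
data = re.sub(r'(?<!\\)\n', r'\\n', data)
print(data)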

Outputs:

Hello
World\r\n
--------------------
Hello\r\nWorld\r\n
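
To check the edge cases (a line break at the very start, two line breaks in a row, a line break that already has a backslash in front of it), the regex approach can be wrapped in a small function. This is only a minimal sketch, assuming a single pass with a negative lookbehind and a replacement callback; the names escape_line_breaks and _ESCAPES are hypothetical and not part of the question:

```python
import re

# Hypothetical lookup table: raw control character -> its escaped text.
_ESCAPES = {'\r': '\\r', '\n': '\\n'}

def escape_line_breaks(text):
    # Match a CR or LF only when the character before it is not a backslash.
    # The callback returns the replacement as-is, so no template escaping
    # rules apply to it.
    return re.sub(r'(?<!\\)[\r\n]', lambda m: _ESCAPES[m.group(0)], text)

print(escape_line_breaks('Hello\r\nWorld\\r\\n'))   # same input as above
print(escape_line_breaks('\n\n'))                   # start of string, consecutive
print(escape_line_breaks('a\\\nb') == 'a\\\nb')     # backslash + real LF is left alone
```

Note that a real line break directly after a backslash is left untouched, which is what the question asks for but may not be what you want in every file.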

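Or, a slightly different approach: a sequence that is already escaped, such as \\r, is a backslash followed by the letter r and contains no real carriage return, so a plain str.replace leaves it alone: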

data = 'Hello\r\nWorld\\r\\n'
print(data)

print('-'*20)

# Replace every real CR and LF character with its two-character escape.
data = data.replace('\r', '\\r')
data = data.replace('\n', '\\n')
print(data)

Outputs:

Hello
World\r\n
--------------------
Hello\r\nWorld\r\n
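
If the end goal is to embed the file contents in a JSON document, another option is to let the standard json module do the escaping. This is a sketch under that assumption (the question does not say how the JSON is built). Unlike the two approaches above, json.dumps also escapes backslashes that are already present, so the text comes back unchanged when it is decoded:

```python
import json

raw = 'Hello\r\nWorld\\r\\n'

# json.dumps escapes control characters and existing backslashes alike,
# and wraps the result in double quotes.
encoded = json.dumps(raw)
print(encoded)

# Decoding gives back exactly the original string.
print(json.loads(encoded) == raw)
```

Whether this fits depends on whether the embedded JSON objects should stay as text or be parsed into real objects first.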