Carps - 6 months ago
Python Question

How to decode JSON and faithfully represent backslashes in Python using the json library

I have a config file which I'm trying to pull apart and then reassemble with updated sections. The config file is in JSON format, and I'm trying to extract components from it and update them before inserting them back into another JSON file.

The problem I'm finding is that sections of the JSON file use "\/", which comes out as "/" when decoded using the JSON library for Python. I need to faithfully represent the original JSON once I insert the updated values back into the new JSON file, hence I need the missing "\/".

I suspect the \ is getting interpreted as an escape and being dropped by the JSON decoder.

Below is a sample of my efforts so far:

JSON string example:

{"Markup\/0.xaml":"text\/xml; charset=utf-8; format=xml; clrtype=ESRI.ArcGIS.Client.Graphic","Markup\/1.xaml":"text\/xml; charset=utf-8; format=xml; clrtype=ESRI.ArcGIS.Client.Graphic"}


Python Code:

import json

# full_json_path_old holds the path to the existing config file
with open(full_json_path_old, 'r+') as fo:
    data = json.load(fo)
print "DECODED STRING - ", data


Result of the Print:

u'Markup/0.xaml': u'text/xml; charset=utf-8; format=xml; clrtype=ESRI.ArcGIS.Client.Graphic',
u'Markup/1.xaml': u'text/xml; charset=utf-8; format=xml; clrtype=ESRI.ArcGIS.Client.Graphic'

Answer

Yes, the \ backslash is an escape character, and a proper JSON decoder will honour such a character as an escape. The Python JSON decoder is no exception. See section 7 of RFC 7159:

Any character may be escaped.

and

char = unescaped /
    escape (
        %x22 /          ; "    quotation mark  U+0022
        %x5C /          ; \    reverse solidus U+005C
        %x2F /          ; /    solidus         U+002F
        %x62 /          ; b    backspace       U+0008
        %x66 /          ; f    form feed       U+000C
        %x6E /          ; n    line feed       U+000A
        %x72 /          ; r    carriage return U+000D
        %x74 /          ; t    tab             U+0009
        %x75 4HEXDIG )  ; uXXXX                U+XXXX

escape = %x5C              ; \

(so the \/ sequence is the escape solidus sequence).

Your output is correct: mimetype components like text and xml should be delimited by a forward slash, not by \/.
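As a quick check of the point above, here is a minimal sketch showing that the escaped and unescaped spellings decode to the same Python value. The sample document is a shortened, illustrative version of the one in the question, and the snippet is written to run on Python 3.

```python
import json

# A raw string literal keeps Python itself from touching the backslashes,
# so the JSON decoder sees the two-character sequence \/ exactly as it
# would appear in the file.
escaped_doc = r'{"Markup\/0.xaml": "text\/xml"}'
plain_doc = '{"Markup/0.xaml": "text/xml"}'

decoded_escaped = json.loads(escaped_doc)
decoded_plain = json.loads(plain_doc)

# Both spellings describe the same data once decoded.
print(decoded_escaped == decoded_plain)
print(list(decoded_escaped.keys()))
```

Once decoded, nothing in the resulting dictionary records which spelling the file used, which is why the backslashes cannot be recovered from `data` alone.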

A forward slash (solidus in the standard) does not have to be escaped, however. The same section 7 states which characters must be escaped:

All Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

As such, the Python JSON encoder won't escape a forward slash when producing JSON output.
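If whatever consumes the file really does need the \/ spelling, one possible workaround (a sketch, not something the json module offers as an option) is to post-process the encoder's output. It assumes that every forward slash should be escaped, not only the ones that were escaped in the original file. The sample data and the compact `separators` setting are illustrative choices meant to mirror the layout in the question.

```python
import json

# Illustrative data, shaped like the decoded config from the question.
data = {"Markup/0.xaml": "text/xml; charset=utf-8"}

# The encoder writes forward slashes unescaped.
encoded = json.dumps(data, separators=(",", ":"))

# In serialized JSON a forward slash can only occur inside a string,
# so a plain text replacement cannot damage the document structure.
# Assumption: every slash should become \/ in the output.
escaped = encoded.replace("/", "\\/")

print(encoded)
print(escaped)

# The escaped text is still valid JSON and decodes to the same data.
print(json.loads(escaped) == data)
```

Because \/ and / are equivalent to any conforming JSON parser, this step only matters when the output must match the original text or when the consumer is not a conforming parser.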