Nanan Nanan - 1 month ago 13
JSON Question

Replace single quotes with double quotes inside brackets

I need to modify a JSON file by replacing single quotes with double quotes, but I can't use the following command

sed -i -r "s/'/\"/g" file

because the file contains other single quotes that I don't want to change.

The following is an example of a string:

"categories": [['Clothing, Shoes & Jewelry', 'Girls'], ['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'More Accessories', 'Kids & Baby']]


The desired result is:

"categories": [["Clothing, Shoes & Jewelry", "Girls"], ["Clothing, Shoes & Jewelry", "Novelty, Costumes & More", "Costumes & Accessories", "More Accessories", "Kids & Baby"]]


sample file:

{"categories": [['Movies & TV', 'Movies']], "title": "Understanding Seizures and Epilepsy DVD"},
{"title": "Who on Earth is Tom Baker?", "salesRank": {"Books": 3843450}, "categories": [['Books']]},
{"categories": [['Clothing, Shoes & Jewelry', 'Girls'], ['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'More Accessories', 'Kids & Baby']], "description": "description, "title": "Mog's Kittens", "salesRank": {"Books": 1760368}}},
{"description": "Three Dr. Suess' Puzzles", "brand": "Dr. Seuss", "categories": [['Toys & Games', 'Puzzles', 'Jigsaw Puzzles']]},


I tried a regular expression, but the problem is that I don't know how many elements are inside the brackets. So I would like a way to replace all single quotes inside the brackets. That would be ideal, but I cannot find the solution.

Answer

I found a way to do that using Python.

Note that the JSON stream you provided is not accepted by Python's json module because of the single quotes (there were also some copy/paste problems, such as missing quotes, which I fixed).

My solution relies entirely on the Python standard library. I doubt you can do the same with sed, which is why I'm providing it even though you didn't mention Python.

  • I read the data using ast.literal_eval, since it is a list of dictionaries in valid Python syntax. Single quotes are not a problem for ast.
  • I write the data using json.dump, which writes strings with double quotes.
  • Note that I write to a "fake" file (an io.StringIO object, i.e. a string with a file-like write method, to "fool" the JSON serializer).

Here's a standalone snippet that works:

import ast, io, json

foo = """[{"categories": [['Movies & TV', 'Movies']], "title": "Understanding Seizures and Epilepsy DVD"},
{"title": "Who on Earth is Tom Baker?", "salesRank": {"Books": 3843450}, "categories": [['Books']]},
{"categories": [['Clothing, Shoes & Jewelry', 'Girls'], ['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'More Accessories', 'Kids & Baby']], "description": "description", "title": "Mog's Kittens", "salesRank": {"Books": 1760368}},
{"description": "Three Dr. Suess' Puzzles",
"brand": "Dr. Seuss", "categories": [['Toys & Games', 'Puzzles', 'Jigsaw Puzzles']]}
]"""

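# in-memory text buffer standing in for a real file
fp = io.StringIO()

# parse the Python-literal text, then serialize it as JSON
json_data = ast.literal_eval(foo)
json.dump(json_data, fp)
print(fp.getvalue())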

result:

[{"categories": [["Movies & TV", "Movies"]], "title": "Understanding Seizures and Epilepsy DVD"}, {"salesRank": {"Books": 3843450}, "categories": [["Books"]], "title": "Who on Earth is Tom Baker?"}, {"description": "description", "salesRank": {"Books": 1760368}, "categories": [["Clothing, Shoes & Jewelry", "Girls"], ["Clothing, Shoes & Jewelry", "Novelty, Costumes & More", "Costumes & Accessories", "More Accessories", "Kids & Baby"]], "title": "Mog's Kittens"}, {"brand": "Dr. Seuss", "description": "Three Dr. Suess' Puzzles", "categories": [["Toys & Games", "Puzzles", "Jigsaw Puzzles"]]}]
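As a side note, the in-memory file is not strictly needed when all you want is the string: json.dumps returns the serialized text directly. Here is a minimal sketch of the same round trip on a shortened sample (the variable names raw, data and fixed are just for illustration):

```python
import ast
import json

# Shortened sample: Python-style single quotes inside the brackets,
# plus an apostrophe inside a double-quoted string that must survive.
raw = """[{"categories": [['Books']], "title": "Mog's Kittens"}]"""

data = ast.literal_eval(raw)  # parses Python literal syntax, single quotes included
fixed = json.dumps(data)      # serializes with double quotes and returns a str
print(fixed)
```

The apostrophe in "Mog's Kittens" is left alone because it is part of the string value, not a delimiter.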

Here's a full script that takes two parameters (input file and output file) and performs the conversion. You can call it from your existing bash scripts if you're not comfortable with Python (save it as fix_quotes.py, for instance):

import ast,json,sys

input_file = sys.argv[1]
output_file = sys.argv[2]

with open(input_file,"r") as fr:
    json_data=ast.literal_eval(fr.read())
with open(output_file,"w") as fw:
    json.dump(json_data,fw)
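One caveat: the sample file in the question has no enclosing brackets, just one dictionary per line followed by a comma, and ast.literal_eval will not accept that as a single expression. A possible workaround, assuming the file really is a plain comma-separated sequence of dictionaries, is to wrap the text in brackets before parsing. The helper name load_records is hypothetical, and the sample below is a shortened version of the question's data:

```python
import ast
import json


def load_records(text):
    """Parse text that is either a full list literal or bare comma-separated dicts."""
    stripped = text.strip()
    if not stripped.startswith("["):
        # Assumption: the file is a sequence of dicts separated by commas,
        # so wrapping it in brackets turns it into a valid Python list literal
        # (a trailing comma before the closing bracket is allowed).
        stripped = "[" + stripped + "]"
    return ast.literal_eval(stripped)


sample = """{"title": "Who on Earth is Tom Baker?", "categories": [['Books']]},
{"description": "Three Dr. Suess' Puzzles", "categories": [['Toys & Games', 'Puzzles']]},
"""

records = load_records(sample)
print(json.dumps(records))
```

Keep in mind that ast.literal_eval only understands Python literals, so JSON-only values such as true, false or null in the input would make it raise an error.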