DnP DnP - 14 days ago 5
Python Question

remove a substring from a string from a specific position python

I have a string like the one below (it's actually nested json):

{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}"}

and I am trying to extract a specific portion of the string: the value of the outermost "c" key. Also, "xa" can itself be another nested json object.

So the condition is always the same: I need to extract the part of the string after the 9th occurrence of " (the quote character) up to, but not including, the last occurrence of ".

I have tried this

newstr = '{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}"}'
newstr2=newstr.split('"')[9:]+newstr.rsplit('"')[1:]
newstr3 = ''.join(newstr2)
print(newstr3)


It gives me the substring, but since I am splitting the string by '"', all the '"' characters in the entire string are removed. So the result I get is {a:x,b:1,c:{a:x,b:1,c:xa}}, whereas I need the substring {"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}; otherwise it won't be a valid json object and I cannot use json.loads on the string.

I remember doing this before in other programming languages (VB and even Oracle stored procedures), basically using a combination of the substr and instr functions. Any idea how this can be achieved in Python?

Answer

You have a JSON-like string. I did not say JSON string because your nested objects are wrapped in quotes ("{ and }"), which makes the whole string invalid JSON. To convert it into valid JSON, replace "{ with { and }" with }. Then you can use the json module to get what you want: json.loads() converts a JSON string to a dict/list. Here is an example:

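>>> import json
>>> json_string = '{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}"}'
>>> json_string = json_string.replace('"{', '{').replace('}"', '}')   # strip the quotes wrapping nested objects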
>>> json_data = json.loads(json_string)   # convert JSON string to python object
>>> json_data['c']   # content of `c` key in `json_data` dict
{u'a': u'x', u'c': {u'a': u'x', u'c': u'xa', u'b': 1}, u'b': 1}
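
To see why the replacement step matters, here is a small self-contained sketch (Python 3 is assumed; the names raw, cleaned and raw_is_valid are only illustrative). It checks that json.loads() rejects the string as posted and accepts it once the quotes around the nested objects are stripped:

```python
import json

# The string from the question, exactly as posted (not valid JSON).
raw = '{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}"}'

# json.JSONDecodeError is a subclass of ValueError, so catching
# ValueError covers the parse failure.
try:
    json.loads(raw)
    raw_is_valid = True
except ValueError:
    raw_is_valid = False

# Strip the quotes that wrap the nested objects, then parse again.
cleaned = raw.replace('"{', '{').replace('}"', '}')
data = json.loads(cleaned)

print(raw_is_valid)         # False
print(data['c']['c']['c'])  # xa
```

Keep in mind that the replacement is a plain text substitution, so it would also alter any ordinary string value that happens to contain "{ or }".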

If you want this data back in string format, you can use json.dumps():

>>> json.dumps(json_data['c'])
'{"a": "x", "c": {"a": "x", "c": "xa", "b": 1}, "b": 1}'
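
If you really do want the literal substr/instr-style slice described in the question (everything after the 9th quote and before the last one), a minimal sketch with str.find() and str.rfind() could look like this. The helper name between_nth_and_last is hypothetical, not a built-in:

```python
def between_nth_and_last(text, char, n):
    """Return the part of text after the n-th char and before the last char."""
    pos = -1
    for _ in range(n):
        # Each search starts just past the previous match.
        pos = text.find(char, pos + 1)
        if pos == -1:
            raise ValueError("fewer than %d occurrences of %r" % (n, char))
    end = text.rfind(char)
    if end <= pos:
        raise ValueError("no later occurrence of %r to stop at" % char)
    return text[pos + 1:end]


newstr = '{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}"}'
inner = between_nth_and_last(newstr, '"', 9)
print(inner)  # {"a":"x","b":1,"c":"{"a":"x","b":1,"c":"xa"}"}
```

Slicing keeps every quote character, unlike split(). The result still has a quoted nested object inside it, so it would need the same replace step as above before json.loads() will accept it.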