sebastian - 16 days ago
Python Question

Convert structured string to dictionary

virtualmin domain-list --multiline
returns a structured string which I would like to convert to a dict of dicts.

The string looks like this (there are sometimes missing values):

do.ma.in.1
    key1a: value1a
    key1b: value1b
    key1c:
    ...
do.ma.in.2
    key2a: value2a
    key2b: value2b
    ...
...


(the key: value pairs are indented by 4 spaces in the string)

which I would like to convert into this form:

{do.ma.in.1: {key1a: value1a, key1b: value1b, key1c: None, ...},
 do.ma.in.2: {key2a: value2a, key2b: value2b, ...}, ...}


So far I have split the string with
re.split("\s*(?=^\S)", str)
which got me

[do.ma.in.1\n key1a: value1a\n key1b: value1b\n key1c:\n ...,
do.ma.in.2\n key2a: value2a\n key2b: value2b\n ..., ...


where the list items are just strings (so no actual dictionary items).

Where do I go from there?

Answer

If it were me, I wouldn't use re. I would step through the data, line by line, assigning the values to the appropriate dict as I go:

from pprint import pprint

data = '''
do.ma.in.1
    key1a: value1a
    key1b: value1b
    key1c:
do.ma.in.2
    key2a: value2a
    key2b: value2b
'''

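key = None  # name of the domain currently being filled
result = {}
for line in data.splitlines():
    # Skip blank lines.
    if not line.strip():
        continue
    # Split on the first colon only, so values may contain colons.
    line = line.split(':', 1)
    if len(line) == 1:
        # No colon: this is a domain header, so start a new dict.
        key = line[0].strip()
        result[key] = {}
    elif len(line) == 2:
        # "key: value" pair; a missing value becomes an empty string.
        result[key][line[0].strip()] = line[1].strip()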

pprint(result)

Result:

{'do.ma.in.1': {'key1a': 'value1a', 'key1b': 'value1b', 'key1c': ''},
 'do.ma.in.2': {'key2a': 'value2a', 'key2b': 'value2b'}}
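
The question asked for None where a value is missing, while the loop above stores an empty string. A minimal sketch of a variant that does that is below. It assumes, as the question states, that domain lines are unindented and key: value lines are indented, and it uses the indentation rather than the presence of a colon to tell them apart. The function name parse_multiline is made up for this example.

```python
from pprint import pprint


def parse_multiline(text):
    """Parse unindented domain lines followed by indented 'key: value' lines."""
    result = {}
    current = None  # dict of the domain currently being filled
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            # Unindented line: start (or reuse) the entry for this domain.
            current = result.setdefault(line.strip(), {})
        elif current is not None:
            # Indented line: split on the first colon only, so values
            # that contain ':' themselves stay intact.
            name, _, value = line.partition(':')
            # An empty value is stored as None instead of ''.
            current[name.strip()] = value.strip() or None
    return result


data = '''
do.ma.in.1
    key1a: value1a
    key1b: value1b
    key1c:
do.ma.in.2
    key2a: value2a
    key2b: value2b
'''

pprint(parse_multiline(data))
```

With the sample data, key1c maps to None and every other key maps to its string value. Indented lines that appear before any domain line are silently ignored here; raising an error instead may be preferable if that would indicate corrupt input.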