zezollo zezollo - 1 year ago 48
Python Question

Parsing YAML, get line numbers even in ordered maps

I need to get the line numbers of certain keys of a YAML file.

Please note, this answer does not solve the issue: I do use ruamel.yaml, and the answers do not work with ordered maps.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from ruamel import yaml

data = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")

print(data)


As a result I get this:

CommentedMap([('key1', CommentedOrderedMap([('key2', 'item2'), ('key3', 'item3'), ('key4', CommentedOrderedMap([('key5', 'item5'), ('key6', 'item6')]))]))])


which does not give access to the line numbers, except for the !!omap keys:

print(data['key1'].lc.line) # output: 1
print(data['key1']['key4'].lc.line) # output: 4


but:

print(data['key1']['key2'].lc.line) # output: AttributeError: 'str' object has no attribute 'lc'


Indeed, data['key1']['key2'] is a str.

I've found a workaround:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from ruamel import yaml

DATA = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")


def get_line_nb(data):
    if isinstance(data, dict):
        offset = data.lc.line
        for i, key in enumerate(data):
            if isinstance(data[key], dict):
                get_line_nb(data[key])
            else:
                print('{}|{} found in line {}\n'
                      .format(key, data[key], offset + i + 1))


get_line_nb(DATA)


output:

key2|item2 found in line 2

key3|item3 found in line 3

key5|item5 found in line 5

key6|item6 found in line 6


This works, but it looks a little bit "dirty". Is there a cleaner way of doing it?

Answer Source

The issue is not that you are using !!omap or that it fails to give you line numbers the way "normal" mappings do. That should be clear from the fact that you get 4 from print(data['key1']['key4'].lc.line) (where key4 is a key in the outer !!omap).

As this answer indicates,

you can access the property lc on collection items

The value for data['key1']['key4'] is a collection item (another !!omap), but the value for data['key1']['key2'] is not a collection item. It is a built-in Python string, which has no slot to store the lc attribute.

Getting an .lc attribute on a non-collection like a string is non-trivial: you would have to subclass the RoundTripConstructor to use something like the classes in scalarstring.py (with __slots__ adjusted to accept the lc attribute), and then transfer the line information available in the nodes to that attribute.
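The slot problem can be illustrated without ruamel.yaml at all. The following is a minimal sketch, not the library's implementation: StrWithLC and LineCol are hypothetical names made up for this example, and the line/column values are purely illustrative. It only shows why a plain str rejects the attribute and how a str subclass with an lc slot accepts it; wiring such a class into a RoundTripConstructor subclass is a separate step that is not shown here.

```python
class LineCol:
    """Hypothetical stand-in for a line/column holder (not ruamel.yaml's class)."""

    def __init__(self, line, col):
        self.line = line
        self.col = col


class StrWithLC(str):
    """A str subclass with one extra slot, so instances can carry .lc."""

    __slots__ = ('lc',)


# A built-in str has no per-instance storage for new attributes.
plain = 'item2'
try:
    plain.lc = LineCol(2, 10)
    plain_accepts_lc = True
except AttributeError:
    plain_accepts_lc = False

# The subclass still compares equal to the original string but can hold .lc.
tagged = StrWithLC('item2')
tagged.lc = LineCol(2, 10)  # illustrative position, assigned by hand

print(plain_accepts_lc)
print(tagged == 'item2', tagged.lc.line, tagged.lc.col)
```

Because StrWithLC is still a str, code that compares, hashes, or formats the value keeps working; only code that checks for the exact type str would notice the difference.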
