Araz Araz - 19 days ago 5
YAML Question

Dumping data from YAML file using Python

I have a YAML file like the following:

- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2

- workload:
    name: cloud2
    param:
      p1: v1
      p2: v2


I can parse the file using the following Python script:

#!/usr/bin/env python

import yaml

try:
    for key, value in yaml.load(open('workload.yaml'))['workload'].iteritems():
        print key, value
except yaml.YAMLError as out:
    print(out)


output:

name cloud1
param {'p1': 'v1'}


But what I'm looking for is something like:

workload1 = cloud1
workload1_param_p1 = v1
workload1_param_p2 = v2

workload2 = cloud2
workload2_param_p1 = v1
workload2_param_p2 = v2

Answer

Your output doesn't match your input: the top level of your YAML file is a sequence, which loads as a Python list, so it cannot be indexed with ['workload'].
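To illustrate the mismatch, here is a minimal sketch. It uses a hand-written Python literal standing in for what a YAML loader would return for the file above; the names `loaded`, `indexing_failed` and `first_name` are made up for this illustration.

```python
# Hand-written stand-in for the loaded YAML document: a list of
# single-key mappings (an assumption for illustration, not real loader output).
loaded = [
    {'workload': {'name': 'cloud1', 'param': {'p1': 'v1', 'p2': 'v2'}}},
    {'workload': {'name': 'cloud2', 'param': {'p1': 'v1', 'p2': 'v2'}}},
]

# Indexing the list with a string key fails: lists only take integer indices.
try:
    loaded['workload']
    indexing_failed = False
except TypeError:
    indexing_failed = True

# Each sequence element has to be selected by position first.
first_name = loaded[0]['workload']['name']
print(indexing_failed, first_name)
```

So any loop over the file has to iterate over the list elements before it can look at the workload key.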
It is also not entirely clear where the workload, and especially the 1 in workload1, should come from. In the following I have assumed that workload comes from the key of the mapping that makes up each sequence element, and that the number is the position of that element in the sequence (starting at 1, hence the idx+1). The name is popped from a copy of the values, so that the rest can be dumped recursively:

from __future__ import print_function  # needed on Python 2 for print(..., file=out)

import sys
import ruamel.yaml

yaml_str = """\
- workload:
    name: cloud1
    param:
      p1: v1
      p2: v2

- workload:
    name: cloud2
    param:
      p1: v1
      p2: v2
"""

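# the round-trip loader preserves key order; newer ruamel.yaml releases
# deprecate this function in favor of ruamel.yaml.YAML().load(yaml_str)
data = ruamel.yaml.round_trip_load(yaml_str)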

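def dump(prefix, d, out):
    # recurse into mappings, extending the prefix with each key
    if isinstance(d, dict):
        for k in d:
            dump(prefix + [k], d[k], out)
    else:
        # leaf value: join the collected keys with underscores
        print('_'.join(prefix), '=', d, file=out)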

for idx, workload in enumerate(data):
    for workload_key in workload:
        values = workload[workload_key].copy()
        # alternatively extract from values['name']
        workload_name = '{}{}'.format(workload_key, idx+1)
        print(workload_name, '=', values.pop('name'))
        dump([workload_name], values, sys.stdout)
    print()

gives:

workload1 = cloud1
workload1_param_p1 = v1
workload1_param_p2 = v2

workload2 = cloud2
workload2_param_p1 = v1
workload2_param_p2 = v2

This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. Even if you only have YAML 1.1 documents (as supported by PyYAML), you should still use ruamel.yaml, as its round-trip loader guarantees that workload1_param_p1 is printed before workload1_param_p2 (with PyYAML that is not guaranteed).
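If you would rather collect the lines than print them, the same recursion can be written as a generator. The following is only a sketch: it runs on a hand-written stand-in for the loaded data, the names `flatten` and `workload_lines` are made up here, and it assumes Python 3.7 or later, where plain dicts keep insertion order.

```python
def flatten(prefix, node):
    """Yield 'a_b_c = value' strings for every leaf below node."""
    if isinstance(node, dict):
        for key, value in node.items():
            # Extend the prefix with the current key and recurse.
            yield from flatten(prefix + [key], value)
    else:
        yield '{} = {}'.format('_'.join(prefix), node)


def workload_lines(data):
    """Return the flattened lines for a list of single-key mappings."""
    lines = []
    for idx, entry in enumerate(data, start=1):
        for key, values in entry.items():
            values = dict(values)  # shallow copy, so pop() leaves data intact
            label = '{}{}'.format(key, idx)
            lines.append('{} = {}'.format(label, values.pop('name')))
            lines.extend(flatten([label], values))
    return lines


# Hand-written stand-in for the loaded YAML (assumption for illustration).
data = [
    {'workload': {'name': 'cloud1', 'param': {'p1': 'v1', 'p2': 'v2'}}},
    {'workload': {'name': 'cloud2', 'param': {'p1': 'v1', 'p2': 'v2'}}},
]

lines = workload_lines(data)
print('\n'.join(lines))
```

Returning a list makes the result easy to write to a file or compare in a test, at the cost of losing the blank line between workloads that the printing version emits.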