keisuke - 1 year ago 47

JSON Question

Suppose I have an array:

```
[['a', 10, 1, 0.1],
 ['a', 10, 2, 0.2],
 ['a', 20, 2, 0.3],
 ['b', 10, 1, 0.4],
 ['b', 20, 2, 0.5]]
```

And I want a `dict`:

```
{
    'a': {
        10: {1: 0.1, 2: 0.2},
        20: {2: 0.3}
    },
    'b': {
        10: {1: 0.4},
        20: {2: 0.5}
    }
}
```

Is there a good way, or a library, to do this?

In this example the array has only 4 columns, but my original array is more complicated (7 columns).

Currently I implement this naively:

```
import pandas as pd

df = pd.DataFrame(array)
grouped1 = df.groupby('column1')
for column1 in grouped1.groups:
    group1 = grouped1.get_group(column1)
    grouped2 = group1.groupby('column2')
    for column2 in grouped2.groups:
        group2 = grouped2.get_group(column2)
        ...
```

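And with `defaultdict`:

```
from collections import defaultdict

# default_factory is called with no arguments, so the lambdas take none
d = defaultdict(lambda: defaultdict(lambda: defaultdict ... ))
for row in array:
    d[row[0]][row[1]][row[2]]... = row[-1]
```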

But I don't think either approach is elegant.

Answer Source

Here is a recursive solution. The base case is a list of 2-element lists (or tuples), in which case the `dict` constructor does what we want:

```
>>> dict([(1, 0.1), (2, 0.2)])
{1: 0.1, 2: 0.2}
```

For other cases, we group the rows by their first column, remove that column from each group, and recurse until we reach the base case.

```
from itertools import groupby


def rows2dict(rows):
    if len(rows[0]) == 2:
        # e.g. [(1, 0.1), (2, 0.2)] ==> {1: 0.1, 2: 0.2}
        return dict(rows)
    else:
        dict_object = dict()
        # groupby only groups consecutive rows, so rows must be sorted by their key columns
        for column1, grouped_rows in groupby(rows, lambda x: x[0]):
            rows_without_first_column = [x[1:] for x in grouped_rows]
            dict_object[column1] = rows2dict(rows_without_first_column)
        return dict_object


if __name__ == '__main__':
    rows = [['a', 10, 1, 0.1],
            ['a', 10, 2, 0.2],
            ['a', 20, 2, 0.3],
            ['b', 10, 1, 0.4],
            ['b', 20, 2, 0.5]]
    dict_object = rows2dict(rows)
    print(dict_object)
```

```
{'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
```
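Note that `itertools.groupby` only merges consecutive rows with equal keys, so the output above relies on the input already being ordered by its leading columns, as it is in the example. Here is a minimal sketch of what happens otherwise, with sorting first as one possible remedy. The `shuffled` input is made up for illustration, and `rows2dict` is repeated in condensed form so the snippet runs on its own:

```python
from itertools import groupby


def rows2dict(rows):
    # Condensed copy of the recursive function above.
    if len(rows[0]) == 2:
        return dict(rows)
    result = {}
    for key, group in groupby(rows, key=lambda row: row[0]):
        result[key] = rows2dict([row[1:] for row in group])
    return result


# Hypothetical input: the same data, but with the 'a' and 'b' rows interleaved.
shuffled = [['a', 10, 1, 0.1],
            ['b', 10, 1, 0.4],
            ['a', 10, 2, 0.2],
            ['a', 20, 2, 0.3],
            ['b', 20, 2, 0.5]]

# Without sorting, a later run of 'a' rows overwrites the earlier one.
lossy = rows2dict(shuffled)

# Sorting on every key column (all but the last) makes equal keys adjacent
# at each nesting level.
complete = rows2dict(sorted(shuffled, key=lambda row: row[:-1]))

print(lossy)     # {'a': {10: {2: 0.2}, 20: {2: 0.3}}, 'b': {20: {2: 0.5}}}
print(complete)  # {'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
```

Sorting this way assumes the values within each column are mutually comparable (all strings or all numbers, for instance).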

- We use `itertools.groupby` to simplify grouping of similar rows based on the first column.
- For each group of rows, we remove the first column and recurse down.
- This solution assumes that `rows` has 2 or more columns. The result is unpredictable for rows that have 0 or 1 columns.
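As an alternative to recursion over groups, the `defaultdict` idea from the question can be made to work for any number of columns by using a factory that refers to itself. This is a minimal sketch, assuming Python 3 and rows with at least 2 columns; `nested_dict`, `rows2nested` and `to_plain_dict` are hypothetical helper names, not part of the original post:

```python
from collections import defaultdict


def nested_dict():
    # A defaultdict whose missing values are themselves nested defaultdicts.
    return defaultdict(nested_dict)


def rows2nested(rows):
    root = nested_dict()
    for row in rows:
        # All columns but the last two are outer keys; then the innermost key and the value.
        *outer_keys, last_key, value = row
        node = root
        for key in outer_keys:
            node = node[key]  # creates the intermediate level on first access
        node[last_key] = value
    return root


def to_plain_dict(node):
    # Convert nested defaultdicts back into ordinary dicts, e.g. for printing.
    if isinstance(node, defaultdict):
        return {key: to_plain_dict(value) for key, value in node.items()}
    return node


rows = [['a', 10, 1, 0.1],
        ['a', 10, 2, 0.2],
        ['a', 20, 2, 0.3],
        ['b', 10, 1, 0.4],
        ['b', 20, 2, 0.5]]

result = to_plain_dict(rows2nested(rows))
print(result)
# {'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
```

Because each row is inserted by walking its keys directly, this variant does not depend on the order of the rows.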