Jack Cooper - 11 months ago 74

Python Question

I have a DataFrame column whose values are lists:

`a`

['a', 'b']

['b', 'a']

['a', 'c']

['c', 'a']

I would like to group by this column's unique values (['a', 'b'] and ['a', 'c']). However, this raises an error:

`TypeError: unhashable type: 'list'`

Is there any way around this? Ideally I would like to sort the values in place and create an additional column holding a concatenated string.

Answer Source

You can also sort values by column.

Example:

```
import pandas

x = [['a', 'b'], ['b', 'a'], ['a', 'c'], ['c', 'a']]
df = pandas.DataFrame({'a': x})
# Sorts the rows by comparing the lists; the lists themselves are unchanged
df.a.sort_values()
# 0 [a, b]
# 2 [a, c]
# 1 [b, a]
# 3 [c, a]
```

However, as I understand it, you want to sort `[b, a]` to `[a, b]` and `[c, a]` to `[a, c]`, and then take the `set` of the values so that only `[a, b]` and `[a, c]` remain.

I'd recommend using a `lambda`.

Try:

```
# Sort the rows, then sort the items inside each list
result = df.a.sort_values().apply(lambda x: sorted(x))
result = pandas.DataFrame(result).reset_index(drop=True)
```

It returns:

```
0 [a, b]
1 [a, c]
2 [a, b]
3 [a, c]
```

Then get unique values:

```
# Lists are unhashable, so convert each one to a tuple before taking the set
unique = sorted(set(result['a'].apply(tuple)))
newdf = pandas.DataFrame({'a': unique})
newdf
# a
# 0 (a, b)
# 1 (a, c)
```
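
The question also asks for an additional column holding a concatenated string that can be used for grouping. A minimal sketch of that, assuming every list contains only strings; the column name `key` and the `,` separator are picked for illustration and are not part of the original post:

```python
import pandas as pd

df = pd.DataFrame({'a': [['a', 'b'], ['b', 'a'], ['a', 'c'], ['c', 'a']]})

# Sort each list and join it into a single string. Strings are hashable,
# so the new column can serve as a groupby key where the lists cannot.
# 'key' is a hypothetical column name used only in this example.
df['key'] = df['a'].apply(lambda items: ','.join(sorted(items)))

# Count how many rows fall into each group
counts = df.groupby('key').size()
print(counts)
```

If the items themselves might contain the separator, converting each sorted list to a tuple (`tuple(sorted(items))`) gives a hashable key without that ambiguity.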