Eric Hansen Eric Hansen - 1 year ago 89
Python Question

Finding intersection of values in a column associated with unique values in another column Pandas

If I have a DataFrame like this (very minimal example)

col1 col2
0 a 1
1 a 2
2 b 1
3 b 2
4 b 4
5 c 1
6 c 2
7 c 3

and I want the intersection of all
values when they are related to their unique
values (so in this case, the intersection would be
), how can I do so with Pandas? Another way to word this would be the values in
that exist for every unique value in

My (bad) solution was to get the unique
elements with
, and then build dictionaries from each unique element in
and then take the set intersection of those dictionary values. I feel like there is a mechanism I should be using to relate the columns together however that could make this much easier.

Answer Source

One way is to use pivot_table:

In [11]: cross = df.pivot_table(index="col1", columns="col2", aggfunc='size') == 1

In [12]: cross
col2     1     2      3      4
a     True  True  False  False
b     True  True  False   True
c     True  True   True  False

In [13]: cross.all()
1     True
2     True
3    False
4    False
dtype: bool

In [14]: cross.columns[cross.all()]
Out[14]: Int64Index([1, 2], dtype='int64', name='col2')