S Ringne - 17 days ago 5
Python Question

Total number of combinations of one column with another in a pandas DataFrame

I have a table in a pandas DataFrame:

id_x id_y
a b
b c
c d
d a
b a
and so on (around 1000 rows)


I want to count the combinations of each id_x with id_y.

For example, a has the combinations
a-b and d-a (2 combinations in total).

Similarly, b has
2 combinations in total: b-c, plus a-b, which also counts as a combination for b (a-b = b-a).


I then want to create a DataFrame df2 that contains:

id combinations
a 2
b 2
c 2 #(c-d and b-c)
d 2 #(c-d and d-a)
and so on ..(distinct product_id_'s)


I tried this code:

df.groupby(['id_x']).size().reset_index()


but I am getting the wrong result:

id_x 0
0 a 1
1 b 1
2 c 1
3 d 1


What approach should I follow?
My Python skills are at a beginner level.
Thanks in advance.

Answer

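You can first sort the values within each row with apply and sorted, so that b-a and a-b become the same pair, then remove duplicate pairs with drop_duplicates, reshape the result to a Series with stack, and finally count each id with value_counts: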

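# result_type='expand' keeps the sorted pairs as columns (needed on pandas 0.23+)
counts = df.apply(sorted, axis=1, result_type='expand').drop_duplicates().stack().value_counts()
print (counts)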
d    2
a    2
b    2
c    2
dtype: int64
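
To try the approach end to end, here is a minimal sketch that rebuilds the five sample rows from the question and reshapes the counts into the id / combinations layout asked for. It assumes numpy is available and swaps apply(sorted) for np.sort, which sorts each row in one call; the names pairs, counts, first and second are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Sample rows from the question; the real table has around 1000 rows.
df = pd.DataFrame({
    "id_x": ["a", "b", "c", "d", "b"],
    "id_y": ["b", "c", "d", "a", "a"],
})

# Sort the two values inside each row so that (b, a) and (a, b)
# become the same pair. Column names here are illustrative.
pairs = pd.DataFrame(
    np.sort(df[["id_x", "id_y"]].to_numpy(), axis=1),
    columns=["first", "second"],
)

# Keep each unordered pair once, then count how often each id appears.
counts = pairs.drop_duplicates().stack().value_counts()

# Reshape the Series into a DataFrame with the requested column names.
df2 = (
    counts.rename_axis("id")
    .reset_index(name="combinations")
    .sort_values("id")
    .reset_index(drop=True)
)
print(df2)
```

With these five rows every id ends up with 2 combinations, because b-a is dropped as a duplicate of a-b and d takes part in both c-d and d-a.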