S Ringne - 17 days ago 5

Python Question

I have a table in a pandas DataFrame `df`:

`id_x id_y`

a b

b c

c d

d a

b a

and so on (around 1000 rows).

I want to find the count of combinations for each `id_x` with `id_y`.

For example, `a` has the combinations `a-b` and `d-a` (2 combinations in total).

Similarly, `b` has 2 combinations in total: `b-c`, and also `a-b`, which should count as a combination for `b` (`a-b` = `b-a`).

I then want to create a DataFrame `df2` which has:

`id combinations`

a 2

b 2

c 2 #(c-d and b-c)

d 2 #(c-d and d-a)

and so on (one row per distinct product id).

I tried this code:

`df.groupby(['id_x']).size().reset_index()`

but I get the wrong result:

`id_x 0`

0 a 1

1 b 1

2 c 1

3 d 1

What approach should I follow?

My Python skills are at a beginner level.

Thanks in advance.

Answer

You can first sort each row with `apply` and `sorted`, drop the duplicate pairs with `drop_duplicates`, then create a `Series` with `stack`, and finally count the ids with `value_counts`:

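```
# Sort each row so that a-b and b-a become the same pair, drop repeated
# pairs, then stack both columns into one Series and count each id.
# On recent pandas, result_type='expand' turns the sorted lists back into columns.
df = (df.apply(sorted, axis=1, result_type='expand')
        .drop_duplicates()
        .stack()
        .value_counts())
print(df)
# Output (order of ties may vary):
# d    2
# a    2
# b    2
# c    2
# dtype: int64
```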
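If you want the result as the `df2` table described in the question, here is a minimal, self-contained sketch of the same idea. It rebuilds the five sample rows from the question and uses `numpy.sort` instead of `apply(sorted)`, which is a swapped-in technique, not part of the answer above. The names `pairs`, `first`, `second` and `counts` are only illustrative, and the sketch assumes no row pairs an id with itself (an `a-a` row would be counted twice for `a`).

```python
import numpy as np
import pandas as pd

# Sample rows taken from the question.
df = pd.DataFrame({
    "id_x": ["a", "b", "c", "d", "b"],
    "id_y": ["b", "c", "d", "a", "a"],
})

# Sort the two ids within each row so that b-a is stored as a-b.
pairs = pd.DataFrame(
    np.sort(df[["id_x", "id_y"]].to_numpy(), axis=1),
    columns=["first", "second"],  # illustrative column names
)

# Keep each unordered pair only once.
pairs = pairs.drop_duplicates()

# Put both columns into one Series and count how often each id appears,
# i.e. how many distinct pairs it takes part in.
counts = pd.concat([pairs["first"], pairs["second"]]).value_counts()

# Shape the counts into the requested two-column table.
df2 = (counts.sort_index()
             .rename_axis("id")
             .reset_index(name="combinations"))
print(df2)
```

With these five rows every id ends up with 2 combinations. Grouping on `id_x` alone, as in the question, only counts rows where the id is in the first column, which is why it misses pairs such as `d-a` for `a`.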