tom tom - 5 months ago 27
Python Question

Table function for two variables in Python

I have a data frame like this.

c_name p_name
A X
B Y
B A1
C ZX
D G4
D H9


I want the frequency of each c_name with each p_name.
In R I get the proper output by using

data.frame(table(df1$c_name,df1$p_name))


But in Python, when I apply
pd.crosstab(df1['c_name'],df1['p_name'])
it gives the result, but not in the format I need.

My expectation is:

c_name p_name Freq
A X 1
B X 0
B X 0
C X 0
D X 0
D X 0
A Y 0
B Y 1
B Y 0
C Y 0
D Y 0
D Y 0
... and so on.


Thanks in advance.

Answer
pd.crosstab(df1['c_name'], df1['p_name']).stack().reset_index(name='Freq')

This will give:

   c_name p_name  Freq
0       A     A1     0
1       A     G4     0
2       A     H9     0
3       A      X     1
4       A      Y     0
5       A     ZX     0
6       B     A1     1
7       B     G4     0
8       B     H9     0
9       B      X     0
10      B      Y     1
11      B     ZX     0
12      C     A1     0
13      C     G4     0
14      C     H9     0
15      C      X     0
16      C      Y     0
17      C     ZX     1
18      D     A1     0
19      D     G4     1
20      D     H9     1
21      D      X     0
22      D      Y     0
23      D     ZX     0
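
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question. The names wide, long_df and r_order are just illustrative choices, and the final sort is only one possible way to get the row order that R's data.frame(table(...)) uses (p_name varying slowest), not necessarily the only or best one.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "c_name": ["A", "B", "B", "C", "D", "D"],
    "p_name": ["X", "Y", "A1", "ZX", "G4", "H9"],
})

# Wide contingency table: one row per c_name, one column per p_name.
wide = pd.crosstab(df1["c_name"], df1["p_name"])

# Long format: one row per (c_name, p_name) pair, zero counts included.
long_df = wide.stack().reset_index(name="Freq")

# Optional: reorder so p_name varies slowest and c_name fastest,
# which is the row order R's data.frame(table(...)) produces.
r_order = long_df.sort_values(["p_name", "c_name"]).reset_index(drop=True)

print(long_df)
print(r_order)
```

Note that each (c_name, p_name) combination appears exactly once, so B and D are not repeated per original row as in the expected output sketched in the question.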