Miyashita Hikaru Miyashita Hikaru - 3 months ago 6
Python Question

Complicated refer to another table

I have dataframe shown in below:
column name 'Types'shows each types dified

I would like to add another column named 'number' defined as below.

df=pd.DataFrame({'Sex':['M','F','F','M'],'Age':[30,31,33,32],'Types':['A','C','B','D']})

Out[8]:

Age Sex Types
0 30 M A
1 31 F C
2 33 F B
3 32 M D


and I have another male table below;
each column represents Types!

(It was difficult to create table for me, Are there another easy way to create?)

table_M = pd.DataFrame(np.arange(20).reshape(4,5),index=[30,31,32,33],columns=["A","B","C","D","E"])
table_M.index.name="Age(male)"

A B C D E
Age(male)
30 0 1 2 3 4
31 5 6 7 8 9
32 10 11 12 13 14
33 15 16 17 18 19


and I have female table below;

table_F = pd.DataFrame(np.arange(20,40).reshape(4,5),index=[30,31,32,33],columns=["A","B","C","D","E"])
table_F.index.name="Age(female)"

A B C D E
Age(female)
30 20 21 22 23 24
31 25 26 27 28 29
32 30 31 32 33 34
33 35 36 37 38 39


so I would like to add 'number' column as shown below;

Age Sex Types number
0 30 M A 0
1 31 F C 27
2 33 F B 36
3 32 M D 13


this number column refer to female and male table. for each age , Type, and Sex.
It was too complicated for me.
Can I ask how to add 'number' column?

Answer

I suggest reshaping your male and female tables:

males = (table_M.stack().to_frame('number').assign(Sex='M').reset_index()
                .rename(columns={'Age(male)': 'Age', 'level_1': 'Types'}))

females = (table_F.stack().to_frame('number').assign(Sex='F').reset_index()
                  .rename(columns={'Age(female)': 'Age', 'level_1': 'Types'}))

reshaped = pd.concat([males, females], ignore_index=True)

Then merge:

df.merge(reshaped)
Out: 
   Age Sex Types  number
0   30   M     A       0
1   31   F     C      27
2   33   F     B      36
3   32   M     D      13

What this does is it stacks the columns of Male and Female tables, and assigns an indicator column showing Sex ('M' and 'F'). females.head() looks like this:

females.head()
Out: 
   Age Types  number Sex
0   30     A      20   F
1   30     B      21   F
2   30     C      22   F
3   30     D      23   F
4   30     E      24   F

and males.head():

males.head()
Out: 
   Age Types  number Sex
0   30     A       0   M
1   30     B       1   M
2   30     C       2   M
3   30     D       3   M
4   30     E       4   M

With pd.concat these two are combined into a single DataFrame and merge by default works on the common columns so it looks for the matches in 'Age', 'Sex', 'Types' columns and merge two DataFrames based on that.


One other possibility is to use df.lookup:

df.loc[df['Sex']=='M', 'number'] = table_M.lookup(*df.loc[df['Sex']=='M', ['Age', 'Types']].values.T)
df.loc[df['Sex']=='F', 'number'] = table_F.lookup(*df.loc[df['Sex']=='F', ['Age', 'Types']].values.T)

df
Out: 
   Age Sex Types  number
0   30   M     A     0.0
1   31   F     C    27.0
2   33   F     B    36.0
3   32   M     D    13.0

This looks up the males in table_M, and females in table_F.

Comments