bikuser bikuser - 2 months ago 13
Python Question

Is there a column-match or row-match function in Python?

I have two dataframes, let's say:

dataframe A with a single column 'name':

   name
0     4
1     2
2     1
3     3


and another dataframe B with two columns, 'name' and 'value':

   name  value
0     3      5
1     2      6
2     4      7
3     1      8


I want to rearrange the rows of dataframe B so that they follow the order of the 'name' column in dataframe A.

I expect the final dataframe to look like this:

   name  value
0     4      7
1     2      6
2     1      8
3     3      5

Answer

Here are two options:

dfB.set_index('name').loc[dfA.name].reset_index()
Out: 
   name  value
0     4      7
1     2      6
2     1      8
3     3      5
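
For anyone who wants to run the first option end to end, here is a minimal sketch built on the sample frames from the question. The variable name `result` is only illustrative. The sketch assumes the names in dfB are unique. In recent pandas versions, `.loc` raises a `KeyError` if dfA contains a name that is missing from dfB.

```python
import pandas as pd

# Sample frames copied from the question
dfA = pd.DataFrame({'name': [4, 2, 1, 3]})
dfB = pd.DataFrame({'name': [3, 2, 4, 1], 'value': [5, 6, 7, 8]})

# Index dfB by name, select rows in the order given by dfA['name'],
# then turn the index back into an ordinary column
result = dfB.set_index('name').loc[dfA['name']].reset_index()
print(result)
```

This leaves both dfA and dfB untouched and returns a new frame.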

Or,

dfA['value'] = dfA['name'].map(dfB.set_index('name')['value'])

dfA
Out: 
   name  value
0     4      7
1     2      6
2     1      8
3     3      5
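
The second option can be sketched the same way. This version is an assumption-laden variant rather than the exact line above: it uses `assign` so that dfA is not modified in place, and the names `lookup`, `result`, and `partial` are hypothetical. It also shows that, unlike `.loc`, `map` yields NaN for a name that has no match in dfB instead of raising. The names in dfB are assumed to be unique.

```python
import pandas as pd

# Sample frames copied from the question
dfA = pd.DataFrame({'name': [4, 2, 1, 3]})
dfB = pd.DataFrame({'name': [3, 2, 4, 1], 'value': [5, 6, 7, 8]})

# Build a name -> value lookup Series from dfB
lookup = dfB.set_index('name')['value']

# assign returns a new frame, so dfA itself keeps its single column
result = dfA.assign(value=dfA['name'].map(lookup))
print(result)

# A name that does not occur in dfB (9 here) maps to NaN
partial = pd.Series([4, 9]).map(lookup)
print(partial)
```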

Timings:

import numpy as np
import pandas as pd
prng = np.random.RandomState(0)
names = np.arange(10**7)
prng.shuffle(names)
dfA = pd.DataFrame({'name': names})
prng.shuffle(names)
dfB = pd.DataFrame({'name': names, 'value': prng.randint(0, 100, 10**7)})

%timeit dfB.set_index('name').loc[dfA.name].reset_index()
1 loop, best of 3: 2.27 s per loop

%timeit dfA['value'] = dfA['name'].map(dfB.set_index('name')['value'])
1 loop, best of 3: 1.65 s per loop

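# Note: .ix was deprecated in pandas 0.20 and removed in pandas 1.0,
# so this variant only runs on older versions; use .loc otherwise.
%timeit dfB.set_index('name').ix[dfA.name].reset_index()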
1 loop, best of 3: 1.66 s per loop
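
To re-run the comparison outside IPython, something along these lines should work. It is only a sketch: the size `n` is a hypothetical, much smaller value than the 10**7 used above, `time_it` is a made-up helper around the standard `timeit` module, and the numbers it prints will depend on your machine and pandas version. It also checks that both approaches return the same rows.

```python
import timeit

import numpy as np
import pandas as pd

prng = np.random.RandomState(0)
n = 10**4  # hypothetical size, kept small so the sketch runs quickly

# Two independent shufflings of the same set of names
dfA = pd.DataFrame({'name': prng.permutation(n)})
dfB = pd.DataFrame({'name': prng.permutation(n),
                    'value': prng.randint(0, 100, n)})

def via_loc():
    # Option 1: label-based selection in dfA's order
    return dfB.set_index('name').loc[dfA['name']].reset_index()

def via_map():
    # Option 2: map names to values without modifying dfA
    return dfA.assign(value=dfA['name'].map(dfB.set_index('name')['value']))

# Both options should produce identical rows in identical order
same_names = np.array_equal(via_loc()['name'].to_numpy(),
                            via_map()['name'].to_numpy())
same_values = np.array_equal(via_loc()['value'].to_numpy(),
                             via_map()['value'].to_numpy())

def time_it(func, repeats=3):
    # Hypothetical helper: best wall-clock time of a few single runs
    return min(timeit.repeat(func, number=1, repeat=repeats))

print('loc:', time_it(via_loc), 's')
print('map:', time_it(via_map), 's')
```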