DENDULURI CHAITANYA DENDULURI CHAITANYA - 2 months ago 14
Python Question

Identifying overlapping rows in multiple dataframes

I have two dataframes like

df1

Time accler
19.13.33 24
19.13.34 24
19.13.35 25
19.13.36 27
19.13.37 25
19.13.38 27
19.13.39 25
19.13.40 24


df2

Time accler
19.13.29 24
19.13.30 24
19.13.31 25
19.13.32 27
19.13.33 25
19.13.34 27
19.13.35 25
19.13.36 24


These two data frames overlap over column time from 19.13.33 to 19.13.36. So when ever there is a overlap I wanted only the dataframe which consists of the overlapped rows

expected output

df1

Time accler
19.13.33 24
19.13.34 24
19.13.35 25
19.13.36 27


df2

Time accler
19.13.33 25
19.13.34 27
19.13.35 25
19.13.36 24


or I can also have a
concat
of the dataframes which will be helpful for further processing.

I tried
merge
but did not work as the dataframes are created dynamically depending on the number of csv files. I tried concatenating first all the dataframes and tried to iterate over the rows but did not find a way.

Answer

You can use merge, default parameter how='inner' can be omited:

df = pd.merge(df1, df2, on='Time')
print (df)
       Time  accler_x  accler_y
0  19.13.33        24        25
1  19.13.34        24        27
2  19.13.35        25        25
3  19.13.36        27        24

df1 = df[['Time','accler_x']].rename(columns={'accler_x':'accler'})
print (df1)
       Time  accler
0  19.13.33      24
1  19.13.34      24
2  19.13.35      25
3  19.13.36      27

df2 = df[['Time','accler_y']].rename(columns={'accler_y':'accler'})
print (df2)
       Time  accler
0  19.13.33      25
1  19.13.34      27
2  19.13.35      25
3  19.13.36      24

If you need merge multiple DataFrames use reduce:

#Python 3
import functools

df = functools.reduce(lambda x,y: x.merge(y,on=['Time']), [df1, df2])

#python 2
df = reduce(lambda x,y: x.merge(y,on=['Time']), [df1, df2])
Comments