gabboshow, 3 years ago
Python Question

Intersection of two pandas DataFrames

In my problem I have two DataFrames, mydataframe1 and mydataframe2, as shown below.

mydataframe1
Out[13]:
Start End Remove
50 60 1
61 105 0
106 150 1
151 160 0
161 180 1
181 200 0
201 400 1


mydataframe2
Out[14]:
Start End
55 100
105 140
151 154
155 185
220 240


From mydataframe2 I would like to remove the rows whose Start-End interval overlaps, even partially, any of the Remove = 1 intervals in mydataframe1. In other words, there should not be any intersection between the remaining intervals of mydataframe2 and any of the Remove = 1 intervals in mydataframe1.

In this case mydataframe2 becomes:

mydataframe2
Out[15]:
Start End
151 154

Answer Source

You could use pd.IntervalIndex to test for intersections (df1 and df2 below are mydataframe1 and mydataframe2).

Get the rows of df1 that mark the ranges to be removed

In [313]: dfr = df1.query('Remove == 1')

Construct an IntervalIndex from the ranges to be removed

In [314]: s1 = pd.IntervalIndex.from_arrays(dfr.Start, dfr.End, 'both')

Construct an IntervalIndex from the ranges to be tested

In [315]: s2 = pd.IntervalIndex.from_arrays(df2.Start, df2.End, 'both')
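The third argument, 'both', is the closed parameter: each interval then includes both of its endpoints, which matters here because ranges such as 105-140 and 106-150 meet near their boundaries. A minimal sketch of that behaviour on a single pd.Interval (the variable names are only illustrative):

```python
import pandas as pd

# closed='both' makes the interval include its two endpoints.
iv = pd.Interval(106, 150, closed='both')
left_in = 106 in iv       # left endpoint is inside
right_in = 150 in iv      # right endpoint is inside
outside = 105 in iv       # just below the left endpoint

# With the default closed='right', the left endpoint is excluded.
iv_right = pd.Interval(106, 150)
left_in_right = 106 in iv_right

print(left_in, right_in, outside, left_in_right)
```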

Select the rows of df2 whose intervals (s2) do not fall in any of the s1 ranges

In [316]: df2.loc[[x not in s1 for x in s2]]
Out[316]:
   Start  End
2    151  154
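Note that the `x not in s1` test relies on the behaviour of the pandas version used in this session. As far as I know, in more recent pandas releases `in` on an IntervalIndex only matches an identical interval, so the list comprehension would keep every row. A version-independent sketch, assuming a pandas release that provides IntervalIndex.overlaps, and rebuilding the frames from the question so that it runs on its own:

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({
    "Start": [50, 61, 106, 151, 161, 181, 201],
    "End": [60, 105, 150, 160, 180, 200, 400],
    "Remove": [1, 0, 1, 0, 1, 0, 1],
})
df2 = pd.DataFrame({
    "Start": [55, 105, 151, 155, 220],
    "End": [100, 140, 154, 185, 240],
})

dfr = df1.query("Remove == 1")
s1 = pd.IntervalIndex.from_arrays(dfr.Start, dfr.End, closed="both")
s2 = pd.IntervalIndex.from_arrays(df2.Start, df2.End, closed="both")

# s1.overlaps(x) yields one boolean per interval in s1;
# a row of df2 is kept only if it overlaps none of them.
keep = [not s1.overlaps(x).any() for x in s2]
result = df2.loc[keep]
print(result)
```

The mask `keep` plays the same role as `[x not in s1 for x in s2]` in the session above.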

Details

In [320]: df1
Out[320]:
   Start  End  Remove
0     50   60       1
1     61  105       0
2    106  150       1
3    151  160       0
4    161  180       1
5    181  200       0
6    201  400       1

In [321]: df2
Out[321]:
   Start  End
0     55  100
1    105  140
2    151  154
3    155  185
4    220  240

In [322]: dfr
Out[322]:
   Start  End  Remove
0     50   60       1
2    106  150       1
4    161  180       1
6    201  400       1

IntervalIndex details

In [323]: s1
Out[323]:
IntervalIndex([[50, 60], [106, 150], [161, 180], [201, 400]]
              closed='both',
              dtype='interval[int64]')

In [324]: s2
Out[324]:
IntervalIndex([[55, 100], [105, 140], [151, 154], [155, 185], [220, 240]]
              closed='both',
              dtype='interval[int64]')

In [326]: [x not in s1 for x in s2]
Out[326]: [False, False, True, False, False]
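
The same selection can also be made without IntervalIndex. Two closed intervals [a, b] and [c, d] intersect exactly when a <= d and c <= b, and NumPy broadcasting can evaluate that for every pair at once. This is only a sketch of an alternative, not part of the original answer, and it builds a len(df2) x len(dfr) boolean matrix, so it assumes both frames are small enough for that to fit in memory:

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({
    "Start": [50, 61, 106, 151, 161, 181, 201],
    "End": [60, 105, 150, 160, 180, 200, 400],
    "Remove": [1, 0, 1, 0, 1, 0, 1],
})
df2 = pd.DataFrame({
    "Start": [55, 105, 151, 155, 220],
    "End": [100, 140, 154, 185, 240],
})

dfr = df1[df1["Remove"] == 1]

# Column vectors for df2, row vectors for the ranges to remove.
a = df2["Start"].to_numpy()[:, None]
b = df2["End"].to_numpy()[:, None]
c = dfr["Start"].to_numpy()[None, :]
d = dfr["End"].to_numpy()[None, :]

# overlap[i, j] is True when df2 row i intersects removal range j.
overlap = (a <= d) & (c <= b)
result = df2[~overlap.any(axis=1)]
print(result)
```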