A. ALT A. ALT - 24 days ago 7
Python Question

How to extract numeric ranges from 2 columns containig numeric sequences and print the range from both columns (different increment values)?

I'm curently learnig python and pandas (this question is based on a pevious post but with an additional query); at the moment have the 2 columns containing numeric sequences (ascending and/or descending) as described below:

Col 1: (col1 numeric incrememt and/or decrement = 1)

1
2
3
5
7
8
9


Col 2: (Col2 numeric increment and/or decrement = 4)

113
109
105
90
94
98
102


Need to extract the numeric ranges from both columns and print them according to the sequence break occurance on any of those 2 columns and the result should be as follow:

1,3,105,113
5,5,90,90
7,9,94,102


Already received a very useful way to do it using python's pandas library by @MaxU where it generates the numeric ranges based on the breaks detected on both columns using a criteria of col1 and col2 = increase and/or decreases by 1.

How can I extract numeric ranges from 2 columns and print the range from both columns as tuples?

The unique difference on this case is that the increment/decrement criteria applied for both columns are different for each one of them.

Answer

Try this:

In [42]: df
Out[42]:
   Col1  Col2
0     1   113
1     2   109
2     3   105
3     5    90
4     7    94
5     8    98
6     9   102

In [43]: df.groupby(df.diff().abs().ne([1,4]).any(1).cumsum()).agg(['min','max'])
Out[43]:
  Col1     Col2
   min max  min  max
1    1   3  105  113
2    5   5   90   90
3    7   9   94  102

Explanation: our goal is to group those rows with the increment/decrement [1,4] for Col1, Col2 correspondingly:

In [44]: df.diff().abs()
Out[44]:
   Col1  Col2
0   NaN   NaN
1   1.0   4.0
2   1.0   4.0
3   2.0  15.0
4   2.0   4.0
5   1.0   4.0
6   1.0   4.0

In [45]: df.diff().abs().ne([1,4])
Out[45]:
    Col1   Col2
0   True   True
1  False  False
2  False  False
3   True   True
4   True  False
5  False  False
6  False  False

In [46]: df.diff().abs().ne([1,4]).any(1)
Out[46]:
0     True
1    False
2    False
3     True
4     True
5    False
6    False
dtype: bool

In [47]: df.diff().abs().ne([1,4]).any(1).cumsum()
Out[47]:
0    1
1    1
2    1
3    2
4    3
5    3
6    3
dtype: int32