piRSquared piRSquared - 3 months ago 22
Python Question

roll value within grouped regions in a dataframe

consider the


idx = map('first {}'.format, range(2)) + map('last {}'.format, range(3))
df = pd.DataFrame(np.arange(25).reshape(5, -1), idx, idx)

enter image description here

I want to group the dataframe into four quadrants based on the text in the row and column headers. Meaning that the upper left quadrant consists of columns with
and rows with
. The upper right quadrant consists of columns with
and rows with
and so on.

Then within each group, I want to

  • roll each element one to right if it can

  • otherwise start on next row at the beggining if it can

  • otherwise start at the very beginning

This should help illustrate

enter image description here

The expected output should look like this.

enter image description here


Using a nested groupby-apply pattern and np.roll. Perform a groupby on the columns, followed by a groupby on the index to get the desired subgroups to roll. Then use np.roll to perform the roll, wrapping the output in a DataFrame since np.roll only returns an array.

def roll_frame(df, shift):
    return pd.DataFrame(np.roll(df, shift), index=df.index, columns=df.columns)

# Groupers for the index and the columns.
idx_groups = df.index.map(lambda x: x.split()[0])
col_groups = df.columns.map(lambda x: x.split()[0])

# Nested groupby, then perform the roll..
df = df.groupby(col_groups, axis=1) \
       .apply(lambda grp: grp.groupby(idx_groups).apply(roll_frame, 1))

Kind of gross, but gets the job done. The order in which you perform the nested groupby doesn't really matter.

The resulting output:

         first 0  first 1  last 0  last 1  last 2
first 0        6        0       9       2       3
first 1        1        5       4       7       8
last 0        21       10      24      12      13
last 1        11       15      14      17      18
last 2        16       20      19      22      23