piRSquared - 2 years ago 86
Python Question

# generate all quadratic combinations of any 2 columns

I have a `DataFrame` `df` with columns `C1`, `C2`, `C3`, `C4`. I want a new `DataFrame` in which every product of one column with another column (or with itself) is represented. With 4 starting columns, that gives `4 + 3 + 2 + 1 = 10` columns. Furthermore, the columns should be labeled with a `MultiIndex` in which each level identifies one of the original columns being multiplied.
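As a quick check of that count, here is a minimal sketch using only the standard library. The names `columns` and `pairs` are illustrative and not part of the original question.

```python
from itertools import combinations_with_replacement

columns = ['C1', 'C2', 'C3', 'C4']

# Unordered pairs of columns, where a column may be paired with itself
pairs = list(combinations_with_replacement(columns, 2))

n = len(columns)
# n + (n - 1) + ... + 1 equals n * (n + 1) / 2
assert len(pairs) == n * (n + 1) // 2
print(len(pairs))  # 10
```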

So if

``````
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(2, 4) * 10, columns=['C1', 'C2', 'C3', 'C4']).astype(int)

print(df)

   C1  C2  C3  C4
0   8   0   5   6
1   4   5   3   5
``````

I expect `df_quad` to look like:

``````
   C1              C2          C3      C4
   C1  C2  C3  C4  C2  C3  C4  C3  C4  C4
0  64   0  40  48   0   0   0  25  30  36
1  16  20  12  20  25  15  25   9  15  25
``````

Try this:

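``````
from io import StringIO
from itertools import combinations_with_replacement

import pandas as pd

data = """\
C1  C2  C3  C4
0   8   0   5   6
1   4   5   3   5
"""

# Rebuild the sample frame; the unlabeled first field becomes the index
df = pd.read_csv(StringIO(data), sep=r'\s+')

# All unordered column pairs, including each column paired with itself
combs = list(combinations_with_replacement(df.columns.tolist(), 2))

# One product column per pair, named like 'C1_C2'
df_quad = pd.DataFrame(index=df.index)
for tup in combs:
    df_quad['{}_{}'.format(*tup)] = df[tup[0]] * df[tup[1]]
``````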

Test:

``````
In [77]: df_quad
Out[77]:
   C1_C1  C1_C2  C1_C3  C1_C4  C2_C2  C2_C3  C2_C4  C3_C3  C3_C4  C4_C4
0     64      0     40     48      0      0      0     25     30     36
1     16     20     12     20     25     15     25      9     15     25

In [78]: combs
Out[78]:
[('C1', 'C1'),
 ('C1', 'C2'),
 ('C1', 'C3'),
 ('C1', 'C4'),
 ('C2', 'C2'),
 ('C2', 'C3'),
 ('C2', 'C4'),
 ('C3', 'C3'),
 ('C3', 'C4'),
 ('C4', 'C4')]
``````
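The output above uses flat names such as `C1_C2`, while the question asks for `MultiIndex` column labels. One possible way to get there, as a sketch rather than the answer's own method, is to build the product columns first and then assign `pd.MultiIndex.from_tuples` to the columns. The names `pairs`, `products` and `df_multi` are illustrative, and the frame is hard-coded with the sample values from the question.

```python
from itertools import combinations_with_replacement

import pandas as pd

# Sample values copied from the question (hard-coded instead of random)
df = pd.DataFrame({'C1': [8, 4], 'C2': [0, 5], 'C3': [5, 3], 'C4': [6, 5]})

# Unordered column pairs, including each column paired with itself
pairs = list(combinations_with_replacement(df.columns.tolist(), 2))

# One product Series per pair, placed side by side in pair order
products = [df[a] * df[b] for a, b in pairs]
df_multi = pd.concat(products, axis=1)

# Label each column with the pair of source columns as a two-level MultiIndex
df_multi.columns = pd.MultiIndex.from_tuples(pairs)

print(df_multi)
```

With this layout, `df_multi['C1']` selects the four products that involve `C1` as the first factor, and `df_multi[('C2', 'C3')]` selects a single product column.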