piRSquared - 5 months ago 10x
Python Question

# generate all quadratic combinations of any 2 columns

I have a `DataFrame` `df` with columns `C1`, `C2`, `C3`, `C4`. I want a new `DataFrame` in which every pairwise product of columns is represented, including each column multiplied by itself. With 4 columns to start with, that gives `4 + 3 + 2 + 1 = 10` columns. Furthermore, the columns should be labeled with a `MultiIndex` in which each level identifies one of the original columns being multiplied.
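As a quick sanity check on the column count, here is a minimal sketch that enumerates the unordered pairs with `itertools.combinations_with_replacement` and compares the count with the closed form `n * (n + 1) / 2`. The names `cols`, `pairs`, and `expected` are only illustrative.

```python
from itertools import combinations_with_replacement

cols = ['C1', 'C2', 'C3', 'C4']
n = len(cols)

# Unordered pairs of columns, a column paired with itself included
pairs = list(combinations_with_replacement(cols, 2))

# n + (n - 1) + ... + 1 equals n * (n + 1) / 2
expected = n * (n + 1) // 2

print(len(pairs), expected)
```

For 4 columns both numbers come out as 10, matching the count stated above.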

So if

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(2, 4) * 10, columns=['C1', 'C2', 'C3', 'C4']).astype(int)

print(df)

   C1  C2  C3  C4
0   8   0   5   6
1   4   5   3   5

I expect the result to look like:

   C1              C2          C3      C4
   C1  C2  C3  C4  C2  C3  C4  C3  C4  C4
0  64   0  40  48   0   0   0  25  30  36
1  16  20  12  20  25  15  25   9  15  25

Try this:

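from io import StringIO
from itertools import combinations_with_replacement

import pandas as pd

data = """\
C1  C2  C3  C4
0   8   0   5   6
1   4   5   3   5
"""

# Read the sample data; the unlabeled first column becomes the index
df = pd.read_csv(StringIO(data), sep=r'\s+')

# All unordered column pairs, each column paired with itself included
combs = list(combinations_with_replacement(df.columns.tolist(), 2))

# Loop body reconstructed from the output below; `result` is an assumed name
result = pd.DataFrame(index=df.index)
for tup in combs:
    result['{0}_{1}'.format(*tup)] = df[tup[0]] * df[tup[1]]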
Test:

Out[77]:
   C1_C1  C1_C2  C1_C3  C1_C4  C2_C2  C2_C3  C2_C4  C3_C3  C3_C4  C4_C4
0     64      0     40     48      0      0      0     25     30     36
1     16     20     12     20     25     15     25      9     15     25

In [78]: combs
Out[78]:
[('C1', 'C1'),
 ('C1', 'C2'),
 ('C1', 'C3'),
 ('C1', 'C4'),
 ('C2', 'C2'),
 ('C2', 'C3'),
 ('C2', 'C4'),
 ('C3', 'C3'),
 ('C3', 'C4'),
 ('C4', 'C4')]
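The question asked for `MultiIndex` column labels rather than flat names like `C1_C2`. One possible way to get there, sketched here with the same pair tuples, is to build the product columns and then assign `pd.MultiIndex.from_tuples` as the column index. The names `pairs` and `result` are illustrative, and the sample values are copied from the question.

```python
from itertools import combinations_with_replacement

import pandas as pd

# Sample values copied from the question
df = pd.DataFrame({'C1': [8, 4], 'C2': [0, 5], 'C3': [5, 3], 'C4': [6, 5]})

# Unordered column pairs, each column paired with itself included
pairs = list(combinations_with_replacement(df.columns.tolist(), 2))

# One product Series per pair, placed side by side
result = pd.concat([df[a] * df[b] for a, b in pairs], axis=1)

# Relabel the columns so each level names one of the two factors
result.columns = pd.MultiIndex.from_tuples(pairs)

print(result)
```

With this layout, `result['C1']` selects the four products whose first factor is `C1`, and `result[('C1', 'C3')]` selects a single product column.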