Arnold Klein - 1 year ago
Python Question

Pair-wise testing statistical significance on pandas data frame

I have a pandas DataFrame (100x10) in which each column represents some quantity, and I would like to run a paired t-test on every pair of columns. Instead of looping over the columns myself:

stats.ttest_rel(df.iloc[:,i], df.iloc[:,j])


where i != j, is there a cleaner way to do it? Something similar to correlations:

df.corr()


where it computes all pair-wise correlations.

Answer Source

No need to write a double for-loop yourself. You can use itertools.combinations to visit each pair of columns exactly once:

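import itertools
import pandas as pd
from scipy import stats
results = pd.DataFrame(columns=df.columns, index=df.columns, dtype=float)
for (label1, column1), (label2, column2) in itertools.combinations(df.items(), 2):
    results.loc[label1, label2] = results.loc[label2, label1] = stats.ttest_rel(column1, column2).pvalue  # two-sided p-value, symmetric in the pair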
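
As a rough sketch, the loop above could be wrapped in a reusable helper that keeps the t statistic and the p-value in separate frames. The helper name `pairwise_ttest_rel` and the random `demo` data are made up for illustration and are not part of the original answer. The last line shows what is probably the closest analogue to df.corr(): passing a callable as `method`. In that case pandas sets the diagonal to 1 and mirrors the result across it, so it suits the symmetric p-value but not the signed t statistic.

```python
import itertools

import numpy as np
import pandas as pd
from scipy import stats


def pairwise_ttest_rel(df):
    """Return (t_stats, p_values) frames for paired t-tests on every column pair.

    Hypothetical helper for illustration; the diagonal is left as NaN.
    """
    t_stats = pd.DataFrame(np.nan, index=df.columns, columns=df.columns)
    p_values = pd.DataFrame(np.nan, index=df.columns, columns=df.columns)
    for a, b in itertools.combinations(df.columns, 2):
        result = stats.ttest_rel(df[a], df[b])
        # The t statistic flips sign when the two columns are swapped.
        t_stats.loc[a, b] = result.statistic
        t_stats.loc[b, a] = -result.statistic
        # The two-sided p-value is the same in both directions.
        p_values.loc[a, b] = p_values.loc[b, a] = result.pvalue
    return t_stats, p_values


# Made-up demo data with the same shape as in the question (100 rows, 10 columns).
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(100, 10)), columns=list("abcdefghij"))
t_stats, p_values = pairwise_ttest_rel(demo)

# Closest analogue to df.corr(): pass a callable as `method`.
# pandas fixes the diagonal at 1 and mirrors the result across it.
p_via_corr = demo.corr(method=lambda x, y: stats.ttest_rel(x, y).pvalue)
```

Rows and columns of both frames are labelled with the original column names, so p_values.loc["a", "b"] reads the p-value for that pair directly.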