Arnold Klein - 2 years ago
Python Question

Pair-wise testing statistical significance on pandas data frame

I have a pandas DataFrame (100x10) in which each column represents some quantity, and I would like to run a paired t-test on every pair of columns. Instead of looping over the columns:

stats.ttest_rel(df.iloc[:,i], df.iloc[:,j])

is there a cleaner way to do it? Something similar to correlations:

df.corr()

which computes all pair-wise correlations.

Answer Source

There is no need to write a double for-loop yourself. You can use itertools.combinations, which yields each pair of columns exactly once:

import itertools
import pandas as pd
from scipy import stats
results = pd.DataFrame(columns=df.columns, index=df.columns)  # one cell per pair of columns
for (label1, column1), (label2, column2) in itertools.combinations(df.items(), 2):
    results.loc[label1, label2] = results.loc[label2, label1] = stats.ttest_rel(column1, column2)
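For reference, here is a minimal self-contained sketch of the same idea that stores only the p-value of each test as a float, since writing a whole result object into a single cell may not behave the same way across pandas versions. The function name pairwise_ttest_pvalues and the random example data are assumptions made for illustration; they are not part of the original question or answer.

```python
import itertools

import numpy as np
import pandas as pd
from scipy import stats


def pairwise_ttest_pvalues(df):
    """Return a symmetric DataFrame of paired t-test p-values for all column pairs."""
    # Start from an all-NaN float matrix; the diagonal is left as NaN.
    pvalues = pd.DataFrame(np.nan, index=df.columns, columns=df.columns, dtype=float)
    # combinations() visits each unordered pair of column labels exactly once.
    for label1, label2 in itertools.combinations(df.columns, 2):
        result = stats.ttest_rel(df[label1], df[label2])
        pvalues.loc[label1, label2] = result.pvalue
        pvalues.loc[label2, label1] = result.pvalue
    return pvalues


# Made-up example data with the same shape as in the question (100 rows, 10 columns).
rng = np.random.default_rng(0)
example = pd.DataFrame(rng.normal(size=(100, 10)), columns=list("abcdefghij"))
pvalues = pairwise_ttest_pvalues(example)
```

The result can be read like the output of df.corr(): pvalues.loc["a", "b"] is the p-value of the paired t-test between columns a and b. If the test statistics are needed as well, a second DataFrame can be filled from result.statistic in the same loop.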