DTML DTML - 1 month ago 19
Python Question

Center of mass for pandas dataframe in Python

I want to find the center of mass of a set of points in N-dimensional space in Python. I have a dataframe with k columns (some contain text and some contain numbers):

{X1...Xk}
...
{Z1..Zk}

k > 10000

I need to calculate center of mass for all numerical values in the dataframe.

What is the best way to do it?

Answer

Assuming every row carries equal weight, the center of mass is simply the mean of the values along each dimension. You only want to compute it on the non-object columns, so:

df.loc[:, df.dtypes != 'O'].mean()

EDIT: although the OP only mentioned "text" and "numbers", the following alternative is more general, since it also leaves out non-numeric columns that are not stored as object dtype, such as datetimes (thanks MaxU):

df.select_dtypes(include=['number']).mean()
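For illustration, here is a minimal sketch of the approach on a toy dataframe. The column names, the values, and the `masses` array are all hypothetical and not from the question. The weighted variant is an assumption about what you might need if the rows do not all count equally; the original answer only covers the equal-weight case.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two numeric coordinates plus one text column.
df = pd.DataFrame({
    "x": [0.0, 2.0, 4.0],
    "y": [1.0, 1.0, 4.0],
    "label": ["a", "b", "c"],
})

# Keep only the numeric columns; the text column is dropped.
numeric = df.select_dtypes(include=["number"])

# Equal weights: the center of mass is the column-wise mean.
center = numeric.mean()

# Unequal weights (hypothetical masses, one per row): use a weighted average.
masses = np.array([1.0, 1.0, 2.0])
weighted_center = pd.Series(
    np.average(numeric.to_numpy(), axis=0, weights=masses),
    index=numeric.columns,
)

print(center)
print(weighted_center)
```

Both results are Series indexed by column name, so the center can be read per dimension, e.g. `center["x"]`.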