Trexion Kameha - 1 year ago 54

# Python - Summary Statistics using date and name

In Python, I have time series data keyed by date and name, with four attributes: A, B, C, and D.

I need to do some summary data analysis on this dataset:

1) For each name, average of A, B, C and D

2) For each name, standard deviation of A, B, C, and D

3) For each name, the number of NaNs as a percentage of the total number of rows, for each of A, B, C, and D

I am familiar with R but not Python. If you can point me in the right direction, that would be more than enough! Thank you.

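```python
import numpy as np
import pandas as pd

# One row per calendar day, 2015-12-31 through 2016-01-30
asof_dt = pd.date_range('20151231', '20160130')
# Three frames of random data, each tagged with its own name
df1 = pd.DataFrame(np.random.randn(len(asof_dt), 4), index=asof_dt, columns=('A', 'B', 'C', 'D'))
df1['name'] = 'alpha'
df2 = pd.DataFrame(np.random.randn(len(asof_dt), 4), index=asof_dt, columns=('A', 'B', 'C', 'D'))
df2['name'] = 'beta'
df3 = pd.DataFrame(np.random.randn(len(asof_dt), 4), index=asof_dt, columns=('A', 'B', 'C', 'D'))
df3['name'] = 'gama'
# Stack the frames so that date (index) and name together identify a row
df_total = pd.concat([df1, df2, df3])
df_total[['name', 'A', 'B', 'C']]
```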
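
For reference, here is a minimal sketch of how the three summaries listed above could be computed with pandas `groupby`. It is one possible approach, not the only one. The `sample` frame, the random seed, and the single injected NaN are assumptions made purely for illustration; they mimic the shape of `df_total` but are not the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like df_total (seed and values are assumptions).
rng = np.random.default_rng(0)
asof_dt = pd.date_range('20151231', '20160130')
frames = []
for name in ('alpha', 'beta', 'gama'):
    df = pd.DataFrame(rng.standard_normal((len(asof_dt), 4)),
                      index=asof_dt, columns=list('ABCD'))
    df['name'] = name
    frames.append(df)
sample = pd.concat(frames)

# Inject one NaN (first 'alpha' row, column A) so summary 3 is non-trivial.
sample.iloc[0, 0] = np.nan

cols = ['A', 'B', 'C', 'D']
grouped = sample.groupby('name')[cols]

# 1) Per-name average; NaN values are skipped by default.
means = grouped.mean()

# 2) Per-name sample standard deviation (ddof=1 by default).
stds = grouped.std()

# 3) Per-name share of NaN values, as a percentage of rows in each group.
counts = grouped.count()                 # non-NaN observations per column
sizes = sample.groupby('name').size()    # total rows per name
nan_pct = (1 - counts.div(sizes, axis=0)) * 100

print(means, stds, nan_pct, sep='\n\n')
```

Each result is a DataFrame with one row per name and one column per attribute, roughly what `aggregate(. ~ name, ...)` would give in R. If a single table is preferred, `grouped.agg(['mean', 'std'])` returns the first two summaries side by side.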

``````import pandas as pd