Afflatus - 5 months ago

Python Question

I have the following data which is recorded in seconds: http://pastebin.com/wBSJWYn2

I want to compute various summary statistics (mean, variance, etc.) on it over 1-minute intervals. I call the functions on the rolling object below and run into two problems:

`sensor_data.rolling(window=1,freq="1MIN")`

- No output for incomplete minutes -- It doesn't give an output for minutes that don't have all 60 seconds. This is the case for
`mean(), quantile(), sum()`

- No output at all -- For certain functions, listed below, I don't get any values at all. I really can't understand why this would be the case, given that it was able to calculate the mean.
`var(), std(), kurt(), skew()`

Other functions seem to work without a problem:

`max(), median(), min()`

`sensor_data.head()`

| index | x_acceleration | y_acceleration | z_acceleration | heart_rate | electrodermal_activity | temperature |
| --- | --- | --- | --- | --- | --- | --- |
| 2016-05-16 06:58:44 | -33.25000 | -43.03125 | 33.09375 | NaN | 0.297099 | 33.33 |
| 2016-05-16 06:58:45 | -28.15625 | -52.90625 | 24.12500 | NaN | 0.219612 | 33.33 |
| 2016-05-16 06:58:46 | -25.87500 | -55.96875 | 21.18750 | NaN | 0.222648 | 33.33 |
| 2016-05-16 06:58:47 | -24.00000 | -57.46875 | 19.40625 | NaN | 0.217335 | 33.33 |
| 2016-05-16 06:58:48 | -22.84375 | -56.25000 | 23.40625 | NaN | 0.214300 | 33.33 |

Example output of the 1st case -- no output for incomplete minute:

`sensor_data.rolling(window=1,freq="1MIN").mean().head()`

| index | x_acceleration | y_acceleration | z_acceleration | heart_rate | electrodermal_activity | temperature |
| --- | --- | --- | --- | --- | --- | --- |
| 2016-05-16 06:58:00 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2016-05-16 06:59:00 | -24.84375 | -59.46875 | 9.03125 | 68.57 | 0.208988 | 33.75 |
| 2016-05-16 07:00:00 | 6.31250 | -62.78125 | 6.46875 | 79.40 | 0.224924 | 33.84 |
| 2016-05-16 07:01:00 | -21.18750 | -57.00000 | 22.50000 | 92.00 | 0.224165 | 34.13 |
| 2016-05-16 07:02:00 | -17.46875 | -58.87500 | 21.84375 | 81.10 | 0.224165 | 34.25 |

Example output of the 2nd case -- no output:

`sensor_data.rolling(window=1,freq="1MIN").var().head()`

| index | x_acceleration | y_acceleration | z_acceleration | heart_rate | electrodermal_activity | temperature |
| --- | --- | --- | --- | --- | --- | --- |
| 2016-05-16 06:58:00 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2016-05-16 06:59:00 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2016-05-16 07:00:00 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2016-05-16 07:01:00 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2016-05-16 07:02:00 | NaN | NaN | NaN | NaN | NaN | NaN |

Answer

For starters, grouping the rows by minute and calling `describe()` will get you going:

```
import pandas as pd
sensor_data.groupby(pd.Grouper(level=0, freq='Min')).describe()
```
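To check that this grouping copes with incomplete minutes, here is a minimal sketch on synthetic data. The `demo` frame and its values are made up for illustration (only the `temperature` column name is borrowed from the question), and the first minute deliberately contains just 16 readings.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for sensor_data: one reading per second, starting
# mid-minute so that the first minute is incomplete (16 samples).
index = pd.date_range("2016-05-16 06:58:44", periods=136,
                      freq=pd.Timedelta(seconds=1))
rng = np.random.default_rng(0)
demo = pd.DataFrame({"temperature": 33.0 + rng.random(len(index))},
                    index=index)

# Group the per-second rows into calendar minutes and summarise each group.
per_minute = demo.groupby(pd.Grouper(level=0, freq="min")).describe()

# Number of samples that went into each minute's statistics.
counts = per_minute[("temperature", "count")]
print(counts)
```

The incomplete first minute still gets a row of statistics; its `count` simply reflects the smaller number of samples.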

You can also build a custom function that adds the statistics `describe()` leaves out:

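```
def stats(df):
    # describe() omits variance, skewness and kurtosis, so compute each
    # one as a single-row frame labelled with the statistic's name
    kurt = pd.DataFrame(df.kurt(), columns=['kurt']).T
    skew = pd.DataFrame(df.skew(), columns=['skew']).T
    var = pd.DataFrame(df.var(), columns=['var']).T
    # stack the extra rows underneath the describe() table
    return pd.concat([df.describe(), var, skew, kurt])
```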

Then apply it to each one-minute group:

```
sensor_data.groupby(pd.Grouper(level=0, freq='Min')).apply(stats)
```
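As for why `var()`, `std()`, `kurt()` and `skew()` returned only NaN: one likely explanation, which is an assumption on my part and not confirmed anywhere in this thread, is that `freq="1MIN"` first reduces the data to one reading per minute, so a `window=1` rolling window only ever sees a single value. The mean of one value is defined, but the sample variance (with `ddof=1`) is not. The sketch below imitates that reduction with `reindex` on made-up data, since as far as I know newer pandas versions no longer accept `freq=` in `rolling`.

```python
import numpy as np
import pandas as pd

# Synthetic per-second readings (hypothetical stand-in for one column).
index = pd.date_range("2016-05-16 06:58:44", periods=136,
                      freq=pd.Timedelta(seconds=1))
demo = pd.Series(np.arange(len(index), dtype=float), index=index,
                 name="temperature")

# Assumption: imitate freq="1MIN" by keeping only the reading that falls
# exactly on each minute boundary (06:58:00 has no reading, so it is NaN).
minute_marks = pd.date_range("2016-05-16 06:58:00", periods=3, freq="min")
sampled = demo.reindex(minute_marks)

window = sampled.rolling(window=1)
rolling_mean = window.mean()  # one value in, the same value out
rolling_var = window.var()    # sample variance of a single value is NaN

# Grouping the raw readings by minute gives each group many samples,
# so the variance is defined even for the incomplete first minute.
grouped_var = demo.groupby(pd.Grouper(freq="min")).var()

print(rolling_mean)
print(rolling_var)
print(grouped_var)
```

If this is what happened, it would also explain why the first row is NaN even for `mean()`: the data starts at 06:58:44, so there is no reading at exactly 06:58:00.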

Source (Stackoverflow)
