Til.N - 1 month ago
Python Question

Why does the output differ between Linux and Windows?

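import pandas as pd

# Load the data and parse the Timestamp column as datetimes
df = pd.read_csv('mydataset.csv', parse_dates=['Timestamp'])

# Keep only the rows from the first ten minutes after midnight
mask = (df['Timestamp'].dt.minute < 10) & (df['Timestamp'].dt.hour == 0)
df1 = df[mask]
print(df1)

# Index by timestamp, then compute the daily mean
df1 = df1.set_index('Timestamp')
df1 = df1.resample('D').mean()
print(df1)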


This is my code for computing the daily average.

Output on Windows:

Timestamp Temperature1 Temperature2
2016-09-01 53.80 45.80
2016-09-02 32.00 56.60
2016-09-03 30.80 58.30
2016-09-04 31.00 55.60
2016-09-05 31.10 55.60
2016-09-06 31.20 55.50
2016-09-07 30.80 54.90
2016-09-08 30.80 54.60
2016-09-09 31.40 55.10
2016-09-10 30.70 54.80
2016-09-11 31.00 54.60
2016-09-12 31.70 54.90
2016-09-13 31.10 54.70
2016-09-14 NaN NaN
2016-09-15 NaN NaN
2016-09-16 30.30 54.90
2016-09-17 NaN NaN
2016-09-18 31.00 64.60
2016-09-19 NaN NaN
2016-09-20 30.50 56.65
2016-09-21 30.10 56.40
2016-09-22 30.00 55.60
2016-09-23 30.30 56.30
2016-09-24 49.25 44.00
2016-09-25 51.50 47.10
2016-09-26 50.10 45.35
2016-09-27 50.25 48.00
2016-09-28 49.70 45.90
2016-09-29 51.05 48.15
2016-09-30 50.50 48.50


This is actually the output I want, but even here some of the dates come out as NaN. I don't understand why this happens, because my data looks fine for those dates.

On the Linux machine the output looks like this:

Temperature1 35.779053
Temperature2 53.593647


This gives a single combined average per column instead of one average per date.

I want the average for each date.
Please help me with this.
I am using:
python: 2.7.12

pandas: 0.17.1

Answer

For Pandas 0.17.1 you can do it this way:

df1.resample('D', how='mean')

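PS: the resample API changed in pandas 0.18.0. From that version on, resample('D') returns a deferred Resampler object (much like groupby), and you call .mean() on it to get the daily means. In 0.17.1, resample('D') already returns a DataFrame of daily means, so the extra .mean() in your code averages each column over all days. That is the combined output you see on Linux; presumably your Windows machine has a newer pandas installed.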
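
One way to sidestep the API difference is to group by a daily frequency instead of calling resample, which should give the same result on old and new pandas versions. A minimal sketch, assuming a small made-up frame in place of mydataset.csv (the sample values and the helper name daily_mean are illustrative, not taken from the question):

```python
import pandas as pd

# Made-up sample standing in for mydataset.csv (values are illustrative only).
df = pd.DataFrame({
    'Timestamp': pd.to_datetime([
        '2016-09-01 00:01', '2016-09-01 00:05',
        '2016-09-02 00:03', '2016-09-02 12:00',
    ]),
    'Temperature1': [50.0, 52.0, 30.0, 99.0],
    'Temperature2': [45.0, 47.0, 56.0, 99.0],
})


def daily_mean(frame):
    """Average the rows from the first ten minutes after midnight, per day."""
    ts = frame['Timestamp']
    mask = (ts.dt.hour == 0) & (ts.dt.minute < 10)
    selected = frame[mask].set_index('Timestamp')
    # pd.Grouper(freq='D') buckets rows by calendar day. This goes through
    # groupby rather than resample(), whose API changed in pandas 0.18.
    return selected.groupby(pd.Grouper(freq='D')).mean()


result = daily_mean(df)
print(result)
```

Calling .mean() on this result a second time would collapse it to one value per column, which is the shape of the Linux output in the question.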

Regarding the NaNs: you can check how many entries per day you have. A day with no entries left after filtering has nothing to average, so its mean is NaN:

df1.groupby(pd.TimeGrouper(freq='1D')).size()
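
To illustrate how an empty day turns into NaN, here is a small sketch with made-up data (the sample values are assumptions, not the asker's data). It uses pd.Grouper because pd.TimeGrouper was removed in later pandas releases:

```python
import pandas as pd

# Made-up readings; 2016-09-02 has no row in the first ten minutes after midnight.
df = pd.DataFrame({
    'Timestamp': pd.to_datetime([
        '2016-09-01 00:02', '2016-09-02 00:15', '2016-09-03 00:04',
    ]),
    'Temperature1': [31.0, 32.0, 30.0],
})

ts = df['Timestamp']
df1 = df[(ts.dt.hour == 0) & (ts.dt.minute < 10)].set_index('Timestamp')

by_day = df1.groupby(pd.Grouper(freq='D'))
counts = by_day.size()   # rows per day; 0 marks an empty day
means = by_day.mean()    # empty days show up as NaN here

print(counts)
print(means)
```

Days where the count is 0 line up with the NaN rows in the daily means, so the counts show which dates have no readings inside the filtered time window.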