MeteoMtl - 3 months ago
Python Question

Count unique dates in pandas dataframe

I have a dataframe of surface weather observations (fzraHrObs) organized by a station identifier code and date. fzraHrObs has several columns of weather data. The station code and date (datetime objects) look like:

usaf    dat
716270  2014-11-23 12:00:00
        2015-12-20 08:00:00
        2015-12-20 09:00:00
        2015-12-21 04:00:00
        2015-12-28 03:00:00
716280  2015-12-19 08:00:00
        2015-12-19 08:00:00


I would like to get a count of the number of unique dates (days) per year for each station - i.e. the number of days of obs per year at each station. In my example above this would give me:

usaf    Year  Count
716270  2014  1
        2015  3
716280  2014  0
        2015  1


I've tried using groupby and grouping by station, year, and date:
grouped = fzraHrObs['dat'].groupby([fzraHrObs['usaf'], fzraHrObs.dat.dt.year, fzraHrObs.dat.dt.date])


Calling count, size, nunique, etc. on this just gives me the number of obs on each date, not the number of distinct dates per year. Any suggestions on getting what I want here?

Answer

One way to do it: reduce each timestamp to its calendar date, group those dates by usaf and year only (not by date), and then count the number of unique values in each group:

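import pandas as pd
# df is the observations dataframe (fzraHrObs in the question).
# Reduce each timestamp to its date, group by station and year, count distinct dates.
df.dat.dt.date.groupby([df.usaf, df.dat.dt.year]).nunique()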

#   usaf   dat 
# 716270  2014    1
#         2015    3
# 716280  2015    1
# Name: dat, dtype: int64
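
Note that the result above has no row for station 716280 in 2014, while the desired output in the question lists it with a count of 0. A minimal, self-contained sketch of one way to get those zero rows follows. It rebuilds the sample data from the question and reindexes the counts over every station/year combination. The variable names (year, counts, full_index, counts_with_zeros) are illustrative only, and the "Year" and "Count" labels are borrowed from the question's desired output:

```python
import pandas as pd

# Sample observations copied from the question (station code and timestamp).
fzraHrObs = pd.DataFrame({
    "usaf": [716270, 716270, 716270, 716270, 716270, 716280, 716280],
    "dat": pd.to_datetime([
        "2014-11-23 12:00:00",
        "2015-12-20 08:00:00",
        "2015-12-20 09:00:00",
        "2015-12-21 04:00:00",
        "2015-12-28 03:00:00",
        "2015-12-19 08:00:00",
        "2015-12-19 08:00:00",
    ]),
})

# Group the calendar dates by station and year, then count distinct dates.
days = fzraHrObs["dat"].dt.date
year = fzraHrObs["dat"].dt.year.rename("Year")
counts = days.groupby([fzraHrObs["usaf"], year]).nunique().rename("Count")

# Assumption: every station should get a row for every year seen in the data.
# Build all station/year pairs and fill the missing ones with 0.
full_index = pd.MultiIndex.from_product(
    [counts.index.levels[0], counts.index.levels[1]], names=["usaf", "Year"]
)
counts_with_zeros = counts.reindex(full_index, fill_value=0)
print(counts_with_zeros)
```

With the sample data this should give 1 and 3 for station 716270 (2014, 2015) and 0 and 1 for station 716280. If you only care about station/year pairs that actually have observations, the plain groupby result is enough and the reindex step can be dropped.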