Anastasia - 1 month ago
Python Question

Plotting unique values in pandas

I have a data frame where the first column is time and the second is a letter:

Time Letter
2016-10-05 20:46:12 'A'
2016-10-05 20:47:12 'A'
2016-10-05 20:50:12 'B'
2016-10-06 00:46:12 'A'
2016-10-06 01:46:12 'B'
2016-10-06 01:47:12 'C'
2016-10-06 02:46:12 'D'


I need to group the data by hour and count the number of unique letters per hour:

Time Unique_values
2016-10-05 20 2
2016-10-06 00 1
2016-10-06 01 2
2016-10-06 02 1

df.groupby([df.index.date,df.index.hour]).Letter.nunique().plot(kind = 'bar', rot =0)


This produces a plot with tick labels like (2016-10-05, 7), (2016-10-05, 8)...

Is there any way to remove the brackets and instead of 7, 8 etc. use 07:00:00, 08:00:00?

Answer

You can either use pd.Grouper:

df.groupby(pd.Grouper(key='Time', freq='H'))['Letter'].nunique()

Or set the time column as index and resample:

df.set_index('Time').resample('H')['Letter'].nunique()
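
To try both approaches, here is a minimal sketch that rebuilds the sample data from the question. It assumes the Time column is parsed as datetimes (not strings), and it uses the lowercase 'h' alias, which recent pandas versions prefer over 'H'. The variable names by_grouper and by_resample are only for illustration.

```python
import pandas as pd

# Sample data from the question; Time must be a datetime column,
# otherwise pd.Grouper and resample cannot bin it by hour.
df = pd.DataFrame({
    'Time': pd.to_datetime([
        '2016-10-05 20:46:12', '2016-10-05 20:47:12', '2016-10-05 20:50:12',
        '2016-10-06 00:46:12', '2016-10-06 01:46:12', '2016-10-06 01:47:12',
        '2016-10-06 02:46:12',
    ]),
    'Letter': ['A', 'A', 'B', 'A', 'B', 'C', 'D'],
})

# Approach 1: group on the Time column in hourly bins
by_grouper = df.groupby(pd.Grouper(key='Time', freq='h'))['Letter'].nunique()

# Approach 2: make Time the index, then resample hourly
by_resample = df.set_index('Time').resample('h')['Letter'].nunique()

print(by_grouper)
print(by_resample)
```

Both results are indexed by the start of each hour, so the index holds real timestamps rather than (date, hour) tuples.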

Both return a row for every hour in the range, with a count of 0 for hours that have no data. Since you are plotting, you probably want that. If not, assign the resulting Series to a variable and filter out the zeros:

ser = df.groupby(pd.Grouper(key='Time', freq='H'))['Letter'].nunique()
ser = ser[ser>0]
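
To get tick labels in the form asked for in the question, one option is to convert the index to formatted strings before plotting. This is a sketch rather than the only way to do it: the format string, the Agg backend line, and the 'h' alias (the newer spelling of 'H') are my assumptions, and the sample data is copied from the question.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; remove this line to display the plot
import pandas as pd

# Sample data from the question, with Time parsed as datetimes
df = pd.DataFrame({
    'Time': pd.to_datetime([
        '2016-10-05 20:46:12', '2016-10-05 20:47:12', '2016-10-05 20:50:12',
        '2016-10-06 00:46:12', '2016-10-06 01:46:12', '2016-10-06 01:47:12',
        '2016-10-06 02:46:12',
    ]),
    'Letter': ['A', 'A', 'B', 'A', 'B', 'C', 'D'],
})

ser = df.groupby(pd.Grouper(key='Time', freq='h'))['Letter'].nunique()
ser = ser[ser > 0]  # optional: drop the hours with no data

# Turn each timestamp into a plain string; these strings become the bar labels
ser.index = ser.index.strftime('%Y-%m-%d %H:%M:%S')

ax = ser.plot(kind='bar', rot=0)
ax.set_ylabel('Unique letters')
```

Changing the format string, for example to '%H:%M:%S', shows only the time part, although hours from different days then share the same label.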