How can I create a boxplot for a pandas time-series where I have a box for each day?
Sample dataset of hourly data where one box should consist of 24 values:
import pandas as pd
n = 480
ts = pd.Series(randn(n),
If it's an option for you, I would recommend using Seaborn, which is a wrapper for Matplotlib. You could do it yourself by looping over the groups from your time series, but that's much more work.
import numpy as np
import pandas as pd
import seaborn
import matplotlib.pyplot as plt

n = 480
ts = pd.Series(np.random.randn(n), index=pd.date_range(start="2014-02-01", periods=n, freq="H"))

fig, ax = plt.subplots(figsize=(12, 5))
# x is the grouper (one box per day of year), y holds the values
seaborn.boxplot(x=ts.index.dayofyear, y=ts, ax=ax)
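For comparison, the do-it-yourself route mentioned above could look roughly like the following. This is only a sketch using plain Matplotlib: the `labels` and `data` names are illustrative, the `Agg` backend line is there just so it runs without a display, and `pd.offsets.Hour()` is used as the hourly frequency.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

n = 480
ts = pd.Series(np.random.randn(n),
               index=pd.date_range(start="2014-02-01", periods=n,
                                   freq=pd.offsets.Hour()))

# Group by calendar date; each group holds the hourly values of one day.
labels, data = [], []
for day, values in ts.groupby(ts.index.date):
    labels.append(str(day))
    data.append(values.to_numpy())

fig, ax = plt.subplots(figsize=(12, 5))
ax.boxplot(data)  # one box per entry in `data`, at positions 1..len(data)
ax.set_xticks(range(1, len(labels) + 1))
ax.set_xticklabels(labels, rotation=90)
```

Grouping on `ts.index.date` keeps days from different years apart, at the cost of building the tick labels by hand.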
Note that I'm passing the day of year as the grouper to seaborn. If your data spans multiple years this won't work, because the same day of year from different years would end up in the same box. You could then consider something like:
ts.index.to_series().apply(lambda x: x.strftime('%Y%m%d'))
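To see why the year needs to be part of the key, here is a small sketch with two made-up days exactly one year apart. It uses the vectorized `DatetimeIndex.strftime`, which should produce the same `'%Y%m%d'` strings as the `apply` version; the names `by_doy` and `by_date` are just for illustration.

```python
import pandas as pd

# Two days of hourly data, one year apart (illustrative data only).
hourly = pd.offsets.Hour()
idx = pd.date_range("2014-02-01", periods=24, freq=hourly).append(
    pd.date_range("2015-02-01", periods=24, freq=hourly))
ts = pd.Series(range(48), index=idx, dtype=float)

# Day of year is 32 for both dates, so both days fall into one group.
by_doy = ts.groupby(ts.index.dayofyear).size()

# A year-month-day string keeps the two days apart.
by_date = ts.groupby(ts.index.strftime("%Y%m%d")).size()

print(by_doy)
print(by_date)
```

The string keys also sort chronologically, so the boxes stay in date order.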
Edit: for 3-hourly boxes you could use this as a grouper, but it only works if there are no minutes or smaller units defined:
import datetime
[(dt - datetime.timedelta(hours=int(dt.hour % 3))).strftime('%Y%m%d%H') for dt in ts.index]
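As a possible alternative, `DatetimeIndex.floor` rounds every timestamp down to the start of its 3-hour bin, so minutes and seconds are not a concern. A minimal sketch, with sample data deliberately offset by 30 minutes and `bins`, `keys` and `sizes` as illustrative names:

```python
import pandas as pd

# Hourly timestamps at half past the hour (illustrative data only).
idx = pd.date_range("2014-02-01 00:30", periods=12, freq=pd.offsets.Hour())
ts = pd.Series(range(12), index=idx, dtype=float)

# Round each timestamp down to the start of its 3-hour bin.
bins = ts.index.floor(pd.offsets.Hour(3))

# Same key format as the list comprehension: year, month, day, hour.
keys = bins.strftime("%Y%m%d%H")
sizes = ts.groupby(keys).size()
print(sizes)
```

`keys` can then be passed to seaborn as the grouper in place of the day of year.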