bgbg bgbg - 6 months ago 12
Python Question

How to group pandas DataFrame entries by date in a non-unique column

A pandas DataFrame contains a column named "date" that holds non-unique datetime values. I can group the rows in this frame using:

data.groupby(data['date'])


However, this splits the data by the full datetime values. I would like to group the data by the year stored in the "date" column. This page shows how to group by year in cases where the time stamp is used as an index, which is not true in my case.

How do I achieve this grouping?

Answer

ecatmur's solution will work fine. This will perform better on large datasets, though:

data.groupby(data['date'].map(lambda x: x.year))
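
For anyone who wants to try this end to end, here is a minimal sketch. The sample rows and the "value" column are made up for illustration; only the "date" column name comes from the question. It also shows the .dt.year accessor as a possible alternative, assuming the column already has a datetime64 dtype (that variant is not part of the original answer).

```python
import pandas as pd

# Hypothetical sample data; only the "date" column name comes from the question.
data = pd.DataFrame({
    "date": pd.to_datetime(
        ["2012-01-15", "2012-06-30", "2013-03-01", "2013-03-01"]
    ),
    "value": [1, 2, 3, 4],
})

# Approach from the answer: pull the year out of each timestamp with map,
# then group by the resulting Series of years.
by_year_map = data.groupby(data["date"].map(lambda x: x.year))["value"].sum()

# Alternative (an assumption, not from the answer): the .dt accessor
# extracts the year for the whole column without a Python-level lambda.
by_year_dt = data.groupby(data["date"].dt.year)["value"].sum()

print(by_year_map)
print(by_year_dt)
```

Both groupings should give the same per-year totals. If the "date" column holds strings rather than datetimes, it would need to be converted first, for example with pd.to_datetime.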