sarthak sahni sarthak sahni - 3 days ago 5
Python Question

Filter database on basis of datetime value

I need to filter a pandas dataframe such that the posted_at (datetime) column value is within the last 200 days and the owner column value is a specific value that I give. I need another dataframe as the result of this query. How to achieve this?

enter image description here

Answer

You can use boolean indexing with date substract by Timedelta:

start = pd.to_datetime('2016-11-23 15:00:10')
rng = pd.date_range(start, periods=10)

df = pd.DataFrame({'posted_at': rng, 'owner': ['a'] * 5 + ['b'] * 5})  
print (df)
  owner           posted_at
0     a 2016-11-23 15:00:10
1     a 2016-11-24 15:00:10
2     a 2016-11-25 15:00:10
3     a 2016-11-26 15:00:10
4     a 2016-11-27 15:00:10
5     b 2016-11-28 15:00:10
6     b 2016-11-29 15:00:10
7     b 2016-11-30 15:00:10
8     b 2016-12-01 15:00:10
9     b 2016-12-02 15:00:10
now = pd.datetime.now().date()
print (now)
2016-11-29

#in real data change 5 to 200
last5 =  now - pd.Timedelta('5D')

#another solution
#last5 =  now - pd.offsets.Day(5)
print (last5)
2016-11-24

mask = (df.owner == 'a') & (df.posted_at > last5) & (df.posted_at < now)
print (mask)
0    False
1     True
2     True
3     True
4     True
5    False
6    False
7    False
8    False
9    False
dtype: bool

print (df[mask])
  owner           posted_at
1     a 2016-11-24 15:00:10
2     a 2016-11-25 15:00:10
3     a 2016-11-26 15:00:10
4     a 2016-11-27 15:00:10
Comments