suitcase88 - 5 months ago
Python Question

Python - extract dates from text relative to a reference date other than the current date

I have some fuzzy text that contains information about dates, for example 'Concert this Saturday'. I want to extract the date that corresponds to "this Saturday" by passing a reference date as a parameter.
For example, suppose this is the subject of an email sent on 2016-04-13; I want to work out that the "this Saturday" the email refers to is 2016-04-16. Do you know of any package that can do that?

P.S. I have used dateutil.parser, but as far as I can tell it doesn't take a reference date as a parameter, so it gives me the first Saturday after the date on which I run the code.


dateutil.parser.parse accepts a default parameter, which you can use to specify a reference date. Any date fields that the text does not supply are taken from that default datetime:

import datetime as DT
import dateutil.parser as DP

# Reference date: the day the email was sent, not the day the code runs.
today = DT.datetime(2016, 4, 13)
for text in ('today', 'tomorrow', 'this Sunday', 'Wednesday next week',
             'next week Wednesday',
             'next thursday', 'next tuesday in June', '11/28',
             'Concert this Saturday',
             "lunch with Andrew @ Mon Mar 7, 2016",
             'meeting on Tuesday, 3/29'):
    # fuzzy=True skips tokens the parser does not recognize;
    # default=today supplies any date fields missing from the text.
    dp_date = DP.parse(text, default=today, fuzzy=True)
    print('{:35} --> {}'.format(text, dp_date))


today                               --> 2016-04-13 00:00:00
tomorrow                            --> 2016-04-13 00:00:00  should be 2016-04-14
this Sunday                         --> 2016-04-17 00:00:00
Wednesday next week                 --> 2016-04-13 00:00:00
next week Wednesday                 --> 2016-04-13 00:00:00
next thursday                       --> 2016-04-14 00:00:00
next tuesday in June                --> 2016-06-14 00:00:00  should be 2016-06-07
11/28                               --> 2016-11-28 00:00:00
Concert this Saturday               --> 2016-04-16 00:00:00
lunch with Andrew @ Mon Mar 7, 2016 --> 2016-03-07 00:00:00
meeting on Tuesday, 3/29            --> 2016-03-29 00:00:00

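Note, however, that not all phrases are parsed correctly. With fuzzy=True, words the parser does not recognize (such as "tomorrow" or "next week") appear to be skipped, so 'tomorrow' and 'Wednesday next week' both come back as the reference date itself, and a bare weekday name resolves to its first occurrence on or after the reference date.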
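For the handful of phrases where the parser's answer is wrong, one option is to resolve them yourself before falling back to dateutil. The following is only a minimal sketch using the standard library: the helper name resolve_relative_date is hypothetical, it only understands "today", "tomorrow", and "this/next <weekday>", and it assumes that both "this" and "next" mean the first such weekday strictly after the reference date, which may not match every writer's intent.

```python
import datetime as DT
import re

WEEKDAYS = ('monday', 'tuesday', 'wednesday', 'thursday',
            'friday', 'saturday', 'sunday')


def resolve_relative_date(text, reference):
    """Hypothetical helper: resolve a few relative phrases against `reference`.

    Returns a datetime.date, or None if no supported phrase is found.
    """
    lowered = text.lower()
    if re.search(r'\btomorrow\b', lowered):
        return reference + DT.timedelta(days=1)
    if re.search(r'\btoday\b', lowered):
        return reference
    match = re.search(r'\b(this|next)\s+(%s)\b' % '|'.join(WEEKDAYS), lowered)
    if match is None:
        return None
    target = WEEKDAYS.index(match.group(2))
    # Assumption: pick the first matching weekday strictly after the
    # reference date, so the result is always 1 to 7 days ahead.
    days_ahead = (target - reference.weekday() - 1) % 7 + 1
    return reference + DT.timedelta(days=days_ahead)


print(resolve_relative_date('Concert this Saturday', DT.date(2016, 4, 13)))
# prints 2016-04-16
```

When the helper returns None, the text can still be passed to DP.parse with default= and fuzzy=True as shown above.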