Alistair - 4 months ago
Python Question

Parsing different date formats from feedparser in python?

I'm trying to get the dates from entries in two different RSS feeds through feedparser.

Here is what I'm doing:

import feedparser as fp

# Fetch and parse both feeds
reddit = fp.parse("http://www.reddit.com/.rss")
cc = fp.parse("http://contentconsumer.com/feed")

# Print the raw date string of the first entry in each feed
print(reddit.entries[0].date)
print(cc.entries[0].date)


And here's how they come out:

2008-10-21T22:23:28.033841+00:00

Wed, 15 Oct 2008 10:06:10 +0000


I want to get to the point where I can find out which is newer easily.

I've tried using the datetime module of Python and searching through the feedparser documentation, but I can't get past this problem. Any help would be much appreciated.

Answer

Parsing dates from RSS feeds in the wild is a pain, and that's where feedparser can be a big help.

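If you use the *_parsed properties (such as updated_parsed), feedparser has already done the parsing and returns the date as a standard Python 9-tuple (a time.struct_time) in UTC. Two such tuples can be compared directly to find out which entry is newer.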

See http://packages.python.org/feedparser/date-parsing.html for more gory details.
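As a rough sketch of the comparison, assuming both entries expose updated_parsed: the two struct_time values below are hand-built stand-ins for reddit.entries[0].updated_parsed and cc.entries[0].updated_parsed, using the dates from the question, so the snippet runs without fetching anything. The helper name to_datetime is made up for this example and is not part of feedparser.

```python
import calendar
import time
from datetime import datetime, timezone


def to_datetime(parsed):
    """Convert a UTC struct_time (the shape of a *_parsed value) to an aware datetime."""
    # calendar.timegm treats the tuple as UTC, unlike time.mktime (local time).
    return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)


# Stand-ins for entry.updated_parsed, built from the two dates in the question.
# Fields: year, month, day, hour, minute, second, weekday, day of year, DST flag.
reddit_parsed = time.struct_time((2008, 10, 21, 22, 23, 28, 1, 295, 0))
cc_parsed = time.struct_time((2008, 10, 15, 10, 6, 10, 2, 289, 0))

# The tuples compare field by field, so this alone answers "which is newer".
reddit_is_newer = reddit_parsed > cc_parsed

# Converting to datetime is optional, but handy for arithmetic and formatting.
reddit_dt = to_datetime(reddit_parsed)
cc_dt = to_datetime(cc_parsed)
newest = max(reddit_dt, cc_dt)
age_gap = reddit_dt - cc_dt

print(reddit_is_newer)
print(newest.isoformat())
print(age_gap)
```

With real feeds you would pass reddit.entries[0].updated_parsed and cc.entries[0].updated_parsed instead of the stand-ins. It is probably worth checking that the value is present and not None first, since a feed may omit the date or use a format feedparser cannot parse.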