hTech hTech - 3 months ago 34
Python Question

Python filter a DBF by a range of date (between two dates)

I'm using the dbf library with python3.5.
The DBF table has a column with only dates without time and another just with time. Want to retrieve records from the last five minutes.

I'm new to this module and currently see just two approaches to get a portion of the data stored in a DBF:

First, with the sympathetic SQL like query:

records = table.query("SELECT * WHERE (SA03 BETWEEN " + beforedfilter + " AND " + nowdfilter + ") AND (SA04 BETWEEN " + beforetfilter + " AND " + nowtfilter + ")")


This would be a familiar approach but the records returned are the first records from the file and not between the given range of time. Probably it is because the sql querying is not well supported by the module? Or just I'm mistaking something in my query? And another odd is that after a few records are printed I'll get an exception:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 3: ordinal not in range(128)
. To my knowledge there are no non-ascii characters in the table.

The other approach is using the module's default way of narrowing down records.. Got stuck with the filtering, as I could use it if I would want to find one specific date and time but for a range, I have no clues how to proceed.

index = table.create_index(lambda rec: rec.SA03)
records = index.search(match=(?!))

Answer

The simplest way is to have a filter function that only tracks matching records:

# untested, but the idea is sound ;)
def last_five_minutes(record, date_field, time_field):
    now = dbf.DateTime.now()
    if (
        record[date_field] is not None
        and record[date_field] == now.date()
        and record[time_field] is not None
        and now.time()-record[time_field] <= datetime.timedelta(seconds=300)
       ):
            return record
    else:
        return dbf.DoNotIndex

and then use it:

index = table.create_index(
        lambda rec: last_five_minutes(rec, 'date_field', 'time_field'))