vrjr vrjr - 1 month ago 13
MySQL Question

How to filter a datetime field by +/- an interval using SQLAlchemy

I have a MySQL table representing editorial articles and their metadata, such as title, author, and datecreated.

I have another table representing metrics (such as view counts) about those articles computed at different time points. Each row is a recording of these metrics for a particular article at a particular moment in time.

I want to retrieve every row of the metrics table whose timestamp falls within the two-hour window that begins one hour after the related article's datecreated value (that is, between one and three hours after creation). I'd like to do this using SQLAlchemy.
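
To pin down the intended window, here is a minimal plain-Python sketch of the condition for a single article/metric pair. The function name in_window and the sample dates are made up for illustration; both bounds are inclusive, matching SQL's BETWEEN.

```python
from datetime import datetime, timedelta


def in_window(created, tstamp,
              start=timedelta(hours=1), end=timedelta(hours=3)):
    """Return True if tstamp lies in [created + start, created + end]."""
    return created + start <= tstamp <= created + end


# Hypothetical article creation time, used only for this example.
created = datetime(2016, 1, 1, 10, 0)

# 90 minutes after creation: inside the window.
print(in_window(created, datetime(2016, 1, 1, 11, 30)))
# 30 minutes after creation: too early.
print(in_window(created, datetime(2016, 1, 1, 10, 30)))
```

The query has to apply this same test to every metric row, with created taken from the joined article row.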

My current SQLAlchemy query looks like this:

from datetime import timedelta

import sqlalchemy as sa
from sqlalchemy import func

import models as m

s = session()
q = (s.query(m.Article.fb_shares, func.avg(m.ArticlesMetric.views))
     .join(m.ArticlesMetric)
     # tstamp must lie between 1 and 3 hours after the article's creation
     .filter(sa.between(m.ArticlesMetric.tstamp,
                        m.Article.created + timedelta(hours=1),
                        m.Article.created + timedelta(hours=3)))
     .group_by(m.Article.id))

result = q.all()
s.close()


However, this results in the following warning:

Warning: (1292, u"Truncated incorrect DOUBLE value: '1970-01-01 05:30:00'")


When a comparison is attempted between values of different types, MySQL internally casts them to doubles before comparing. I believe this warning is somehow a result of using the timedelta, but I'm not sure how else I can achieve what I'm trying to do. Any suggestions are very welcome.

Answer

Actually, this is harder than it looks. If you had done this in MySQL directly, this is what you would have written:

SELECT ...
FROM ...
JOIN ...
WHERE tstamp BETWEEN DATE_ADD(created, INTERVAL 1 HOUR) AND DATE_ADD(created, INTERVAL 3 HOUR)
GROUP BY ...

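And you have to do more or less the same thing with SQLAlchemy, simply because m.Article.created is a column, not a constant: the bounds differ for each row, so they have to be computed by the database rather than in Python.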
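
The per-row idea can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, so the interval syntax is SQLite's datetime(created, '+1 hour') rather than MySQL's DATE_ADD. The table and column names (articles, articles_metrics, article_id) and the sample rows are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumed names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE articles_metrics (article_id INTEGER, tstamp TEXT, views INTEGER);
    INSERT INTO articles VALUES (1, '2016-01-01 10:00:00');
    INSERT INTO articles_metrics VALUES
        (1, '2016-01-01 10:30:00', 5),   -- before the window
        (1, '2016-01-01 11:30:00', 10),  -- inside the window
        (1, '2016-01-01 12:30:00', 20),  -- inside the window
        (1, '2016-01-01 13:30:00', 40);  -- after the window
""")

# The bounds are computed by the database from each row's own `created` value.
rows = conn.execute("""
    SELECT a.id, AVG(m.views)
    FROM articles AS a
    JOIN articles_metrics AS m ON m.article_id = a.id
    WHERE m.tstamp BETWEEN datetime(a.created, '+1 hour')
                       AND datetime(a.created, '+3 hours')
    GROUP BY a.id
""").fetchall()
conn.close()

print(rows)
```

Only the two rows recorded between one and three hours after creation contribute to the average.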

If you enable query logging, you can see the MySQL query generated by your code and see that it is not what you would expect: each timedelta is sent as a datetime parameter rather than as an interval:

INFO:sqlalchemy.engine.base.Engine:SELECT test.id AS test_id, test.dt AS test_dt, test.tp AS test_tp 
FROM test 
WHERE test.tp BETWEEN test.dt + %(dt_1)s AND test.dt + %(dt_2)s
INFO:sqlalchemy.engine.base.Engine:{'dt_1': datetime.datetime(1970, 1, 1, 1, 0), 'dt_2': datetime.datetime(1970, 1, 1, 3, 0)}
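
The bound parameters in that log look like each timedelta added to the Unix epoch, which is how SQLAlchemy's Interval type is documented to behave on backends without a native interval type. The following standard-library sketch reproduces the two logged values under that assumption; the helper name as_bound_parameter is hypothetical and is not a SQLAlchemy function.

```python
from datetime import datetime, timedelta

# Assumed reference point: the Unix epoch.
EPOCH = datetime(1970, 1, 1)


def as_bound_parameter(delta):
    """Mimic the conversion seen in the log: epoch + timedelta."""
    return EPOCH + delta


dt_1 = as_bound_parameter(timedelta(hours=1))
dt_2 = as_bound_parameter(timedelta(hours=3))

# Compare these with the dt_1 and dt_2 values in the logged parameters.
print(dt_1, dt_2)
```

MySQL therefore ends up evaluating test.dt + '1970-01-01 01:00:00', which is not interval arithmetic, and that is most likely where the truncation warning comes from.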

I managed to find a way to do what you want. Here is the code:

import sqlalchemy as sa
from sqlalchemy.sql import func
from sqlalchemy.sql.expression import text

...
.filter(sa.between(m.ArticlesMetric.tstamp,
                   # DATE_ADD(created, INTERVAL n HOUR) is evaluated by MySQL for each row
                   func.date_add(m.Article.created, text('INTERVAL 1 HOUR')),
                   func.date_add(m.Article.created, text('INTERVAL 3 HOUR'))))