J.D. J.D. - 1 month ago 17
Python Question

Redshift: SqlAlchemy error on DateDiff call

I keep getting an error on a Redshift

DATEDIFF
call. I've succeeded with SQL Workbench when connected to the same Redshift server. Unfortunately, I'm not doing so hot using SQLAlchemy, and I was hoping I could get some help on here. Please let me know if you need more details.




With the Class Definition of
MyTable
,

class MyTable(Base):
__tablename__ = 'my_table'
__table_args__ = {
'schema': 'elm'
}
tstamp = Column(TIMESTAMP(timezone=True), nullable=False)


I'm trying to find the last twelve hours of tstamps on my_table where the value is older than 12 hours than the latest tstamp value. (
datediff(hour, max(tstamp), tstamp) > 12
)

# Get the most recent timestamp
max = session.execute(select([func.max(MyTable.tstamp)])).first()[0]

# Label the calculated column that will determine the difference
l = func.datediff('hour', max, MyTable.tstamp).label('z') # have also tried `text('hour')`

# Get the query
z = session.query(MyTable).filter(
MyTable.tstamp.in_(
select(
[distinct(MyTable.tstamp), l]
).where(
l >= 36
)
)
)


However, I keep getting this error:

sqlalchemy.exc.ProgrammingError:
(psycopg2.ProgrammingError)
function pg_catalog.date_diff("unknown", timestamp with time zone, timestamp with time zone) does not exist
HINT: No function matches the given name and argument types.
You may need to add explicit type casts.


Here's what my SQL looks like when I print to console. Using
SQL Workbench
for the
Redshift
server gives me the results I expect.

[SQL: '
SELECT elm.my_table.tstamp AS elm_my_table_tstamp
FROM elm.my_table
WHERE elm.my_table.tstamp IN (
SELECT DISTINCT elm.my_table.tstamp,
datediff(%(datediff_1)s, %(datediff_2)s, elm.my_table.tstamp) AS z
FROM elm.my_table
WHERE datediff(%(datediff_1)s, %(datediff_2)s, elm.my_table.tstamp) >= %(param_1)s)
']
[
parameters: {
'datediff_2': datetime.datetime(2016, 11, 2, 19, 15, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)),
'param_1': 12,
'datediff_1': 'hour'
}
]





Imports:

from my_package import MyTable
from sqlalchemy import select, func, distinct, extract, text





RESULT:

max = session.execute(select([func.max(MyTable.tstamp)])).first()[0]
session.execute(MyTable.__table__.delete().where(
MyTable.tstamp.in_(
select([distinct(MyTable.tstamp)]).where(
func.datediff('hour',
cast(MyTable.tstamp, TIMESTAMP(timezone=False)),
cast(max, TIMESTAMP(timezone=False)))
>= 12
)
)
))


(I wanted to delete the rows I found)

Answer

DATEDIFF does not support the timestamptz type. The timestamptz type is a recent addition to Redshift. You need to cast it to a timestamp instead, e.g. (not tested)

func.datediff("max", max.replace(tzinfo=None), cast(MyTable.tstamp, TIMESTAMP(timezone=False)))