Edamame Edamame - 3 months ago 9
Python Question

Impala/SQL: select a sub-table and join

I am trying to use the following code to find the last month data in table_1, then left join it with table_2:

import pandas as pd

query = 'select * from table_1 where table_1.ts > "2016-07-12 00:00:00" as recent_table left join table_2 on table_1.t2__fk=table_2.id'

cursor = impala_con.cursor()
cursor.execute('USE my_db')
cursor.execute(query)
df_result = as_pandas(cursor)
df_result


but got the error below:

HiveServer2Error: AnalysisException: Syntax error in line 1:
...s > "2016-07-10 00:00:00" as recent_table left join table_2...
^
Encountered: AS
Expected: AND, BETWEEN, DIV, GROUP, HAVING, ILIKE, IN, IREGEXP, IS, LIKE, LIMIT, NOT, OFFSET, OR, ORDER, REGEXP, RLIKE, UNION

CAUSED BY: Exception: Syntax error


Does anyone know what I missed here? And what's the proper way to achieve this goal. Thanks!

Answer

That's cause your query syntax is wrong. You can't use alias for conditional statement as pointed below. Aliases are used only for table name and column name.

where table_1.ts > "2016-07-12 00:00:00" as recent_table

The correct query would be

select t1.* 
from table_1 t1
left join table_2 t2 on t1.t2__fk = t2.id
where t1.ts > "2016-07-12 00:00:00";