I am trying to use the following code to find the last month data in table_1, then left join it with table_2:
import pandas as pd
query = 'select * from table_1 where table_1.ts > "2016-07-12 00:00:00" as recent_table left join table_2 on table_1.t2__fk=table_2.id'
cursor = impala_con.cursor()
df_result = as_pandas(cursor)
HiveServer2Error: AnalysisException: Syntax error in line 1:
...s > "2016-07-10 00:00:00" as recent_table left join table_2...
Expected: AND, BETWEEN, DIV, GROUP, HAVING, ILIKE, IN, IREGEXP, IS, LIKE, LIMIT, NOT, OFFSET, OR, ORDER, REGEXP, RLIKE, UNION
CAUSED BY: Exception: Syntax error
That's cause your query syntax is wrong. You can't use alias for conditional statement as pointed below. Aliases are used only for table name and column name.
where table_1.ts > "2016-07-12 00:00:00" as recent_table
The correct query would be
select t1.* from table_1 t1 left join table_2 t2 on t1.t2__fk = t2.id where t1.ts > "2016-07-12 00:00:00";