javadba - 3 months ago
Scala Question

Why does using the rank() window function break the parser?

The online documentation for window functions in Spark SQL includes the following example:

https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html

SELECT
  product,
  category,
  revenue
FROM (
  SELECT
    product,
    category,
    revenue,
    dense_rank() OVER (PARTITION BY category ORDER BY revenue DESC) as rank
  FROM productRevenue) tmp
WHERE
  rank <= 2


I have written a query with what seems to be a similar structure, but it does not work:

select id, r from (
  select id, name,
    rank() over (partition by name order by name) as r
  from tt) v
where v.r >= 7 and v.r <= 12


Here is the error:

Exception in thread "main" java.lang.RuntimeException: [3.25]
failure: ``)'' expected but `(' found

rank() over (partition by fp order by fp) as myrank
^


Can anyone see where they differ structurally? I am on Spark 1.6.0-SNAPSHOT from 11/18/15.

Answer

I checked the source code, and it appears that rank() requires Hive support. I am rebuilding Spark with the following profiles:

 -Phive -Phive-thriftserver

I have confirmed that the query works when using a HiveContext.
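For anyone who wants to try this, here is a minimal sketch of running the query from the question through a HiveContext. It assumes a Spark 1.x build with the Hive profiles enabled (so that spark-hive is on the classpath) and a local master. The object name, the app name, and the sample rows registered as tt are made up for illustration; only the query text comes from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical driver object; the name is illustrative only.
object RankWithHiveContext {
  // The query from the question, unchanged apart from layout.
  val query: String =
    """select id, r from (
      |  select id, name,
      |    rank() over (partition by name order by name) as r
      |  from tt) v
      |where v.r >= 7 and v.r <= 12""".stripMargin

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("RankWithHiveContext") // assumed app name
      .setMaster("local[2]")             // assumed local run
    val sc = new SparkContext(conf)

    // Use HiveContext instead of the plain SQLContext so that the
    // HiveQL parser, which accepts OVER (...) clauses, handles the query.
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Made-up rows standing in for the real table tt.
    val tt = sc.parallelize(Seq((1, "a"), (2, "a"), (3, "b"))).toDF("id", "name")
    tt.registerTempTable("tt")

    // The point here is only that the statement parses and executes.
    sqlContext.sql(query).show()

    sc.stop()
  }
}
```

HiveContext extends SQLContext, so the rest of the application code should not need to change when swapping one for the other.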
