lte__ - 1 year ago
Java Question

Spark - Group by HAVING with dataframe syntax?

What's the syntax for using a groupby-having in Spark without an sql/hiveContext? I know I can do

DataFrame df = some_df;
DataFrame df1 = sqlContext.sql("SELECT * FROM df GROUP BY col1 HAVING some stuff");

but how do I do it with a syntax like

df.select(df.col("*")).groupBy(df.col("col1")).having("some stuff")

This .having() does not seem to exist.

Answer Source

Correct, there is no having method in the DataFrame API. You express the same logic with agg followed by where: aggregate per group first, then filter on the aggregated column.
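As a minimal sketch of that agg-then-where pattern, the example below is roughly equivalent to SELECT col1, COUNT(*) AS n FROM df GROUP BY col1 HAVING COUNT(*) > 1. It assumes Spark 2.x or later, where the Java type is Dataset<Row> rather than the DataFrame class used in the question. The class name, the sample rows, the col2 column, the alias n, and the threshold of 1 are all made up for illustration.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class GroupByHavingSketch {

    // Hypothetical sample data: two rows for "a", one row for "b".
    static Dataset<Row> sampleData(SparkSession spark) {
        StructType schema = new StructType()
                .add("col1", DataTypes.StringType)
                .add("col2", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("a", 1),
                RowFactory.create("a", 2),
                RowFactory.create("b", 3));
        return spark.createDataFrame(rows, schema);
    }

    // GROUP BY col1 HAVING COUNT(*) > 1, written as groupBy -> agg -> where.
    static Dataset<Row> groupHaving(Dataset<Row> df) {
        return df.groupBy(col("col1"))
                .agg(count("*").alias("n"))   // the aggregate the HAVING clause would test
                .where(col("n").gt(1));       // the HAVING predicate, applied after aggregation
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("groupby-having-sketch")
                .master("local[*]")           // assumption: local run for demonstration only
                .getOrCreate();

        groupHaving(sampleData(spark)).show();
        spark.stop();
    }
}
```

With these sample rows only the group "a" (two rows) passes the filter. The where call plays the role of HAVING only because it comes after agg; placed before groupBy, it would act like a SQL WHERE on the raw rows instead.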
