Boris Boris - 9 months ago 280
Scala Question

How to use DataFrame filter with isin in Spark Java?

I'm trying to filter a Spark DataFrame using a list in Java.

java.util.List<Long> selected = ....;
DataFrame result = df.filter(df.col("something").isin(????));


The problem is that the isin(...) method accepts a Scala Seq or Scala varargs.

Passing in JavaConversions.asScalaBuffer(selected) doesn't work either.

Any ideas?

Answer Source

You can use something like this:

df.filter(col("something").isin("value1", "value2"))

OR

val list = List("value1", "value2")
df.filter(col("something").isin(list: _*))

In Java:

df.filter(col("something").isin(list.stream().toArray(String[]::new)));
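
The Java version works because a Java array is spread across a varargs parameter, while a List passed directly counts as one single argument. Here is a minimal sketch of that behaviour without Spark on the classpath. The isin method below is a hypothetical stand-in, assuming the Java-facing signature of Column.isin is isin(Object... list); it only reports how many values it received.

```java
import java.util.Arrays;
import java.util.List;

public class IsinVarargsSketch {

    // Hypothetical stand-in for Column.isin(Object... list), not Spark code.
    // It returns the number of values the varargs parameter received.
    public static int isin(Object... values) {
        return values.length;
    }

    public static void main(String[] args) {
        List<Long> selected = Arrays.asList(1L, 2L, 3L);

        // The List itself is one Object, so it becomes a single vararg element.
        System.out.println(isin(selected));

        // toArray() returns Object[], which is passed as the varargs array itself.
        System.out.println(isin(selected.toArray()));
    }
}
```

Under the same assumption, the List<Long> from the question could be passed as df.col("something").isin(selected.toArray()), with no Scala conversion needed.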