Boris Boris - 28 days ago 34
Scala Question

How to use DataFrame filter with isin in Spark Java?

I'm trying to filter a Spark DataFrame using a list in Java.

java.util.List<Long> selected = ....;
DataFrame result = df.filter(df.col("something").isin(????));


The problem is that the isin(...) method accepts a Scala Seq or Scala varargs.

Passing in JavaConversions.asScalaBuffer(selected) doesn't work either.

Any ideas?

Answer

You can use something like this:

df.filter(col("something").isin("value1", "value2"))

OR

val list = List("value1", "value2")
df.filter(col("something").isin(list: _*))

In Java:

df.filter(col("something").isin(list.stream().toArray(String[]::new)));
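
The Java line above relies on Java seeing isin as a varargs method (Object...), so an array is spread into individual values, while a java.util.List passed directly counts as one single value. Here is a minimal, Spark-free sketch of that behavior. Note that countValues is a hypothetical stand-in for a varargs method such as isin and is not part of Spark.

```java
import java.util.Arrays;
import java.util.List;

class VarargsSpreadDemo {
    // Hypothetical stand-in for a varargs method such as Column.isin(Object...).
    // Returns how many separate values the method actually received.
    static int countValues(Object... values) {
        return values.length;
    }

    public static void main(String[] args) {
        List<Long> selected = Arrays.asList(1L, 2L, 3L);

        // Passing the List itself: it becomes one element of the varargs array.
        System.out.println(countValues(selected));

        // Passing an Object[]: the array itself is used as the varargs array.
        System.out.println(countValues(selected.toArray()));

        // A typed array (Long[]) is also accepted where Object... is expected.
        System.out.println(countValues(selected.toArray(new Long[0])));
    }
}
```

Under the same assumption, the analogous call for the List<Long> in the question would be df.filter(df.col("something").isin(selected.toArray())).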