Skoky Skoky - 1 month ago 32
Scala Question

Spark Scala DataFrame isin

I have a Spark DataFrame that contains an Array[Byte] column. Can I use isin to match data against my Array[Byte] values? If I try to use it like this:


clientIp.isin((whitelist:_*))


it does not match, because whitelist:_* does not format the byte array to IN(...) properly. Any idea how to fix this?

Answer

You can convert the Array[Byte] values to a Java String and then match them with isin(whitelist:_*), provided your whitelist is a list of strings (List<String>).

According to the documentation, the isin method accepts either java.lang.Object varargs or a Seq<java.lang.Object>:

https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/Column.html#isin(scala.collection.Seq)
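
A minimal sketch of the string-conversion approach, under some assumptions that are not in the original post: the column is named clientIp, the whitelist is a Seq[Array[Byte]], and a SparkSession (Spark 2.x or later) is available. Instead of decoding the bytes as text, which can be lossy for arbitrary binary data, it hex-encodes both sides: Spark's built-in hex function on the column and a small hypothetical helper, toHex, on the whitelist. The sample rows are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, hex}

object IsinBinaryExample {
  // Hypothetical helper: hex-encode a byte array as uppercase, two
  // characters per byte, to match what Spark's hex() emits for binary data.
  def toHex(bytes: Array[Byte]): String =
    bytes.map(b => "%02X".format(b & 0xff)).mkString

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isin-binary")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: IPv4 addresses stored as raw bytes.
    val df = Seq(
      ("alice", Array[Byte](10, 0, 0, 1)),
      ("bob", Array[Byte](192.toByte, 168.toByte, 0, 1))
    ).toDF("user", "clientIp")

    // Assumed shape of the whitelist from the question.
    val whitelist: Seq[Array[Byte]] = Seq(Array[Byte](10, 0, 0, 1))

    // Convert the whitelist once on the driver, then compare strings.
    val whitelistHex: Seq[String] = whitelist.map(toHex)

    // hex(clientIp) yields a string column, so isin compares like with like.
    df.filter(hex(col("clientIp")).isin(whitelistHex: _*)).show()

    spark.stop()
  }
}
```

Any encoding works as long as it is applied identically to the column and to the whitelist; base64 (also available in org.apache.spark.sql.functions) would be an alternative to hex.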
