Mnemosyne - 1 year ago
Scala Question

How to query the presence of an element inside a Spark Dataframe Column that contains a set?

I have a spark dataframe where one column has the type

This column contains a set of strings, for example
How do I filter the contents of the whole dataframe so that
I only get those rows that (for example) contain the value
in the set?

I'm looking for something similar to


The example shown above only works when the content of column list is a string, not a Set. What alternatives would fit my case?

Edit: My question is not a duplicate. The user in that question has a set of values and wants to know which ones are located inside a specific column. I have a column that contains a set, and I want to know if a specific value is part of the set. My approach is the opposite of that.

Answer Source


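import org.apache.spark.sql.functions.array_contains

// The $"..." column syntax requires `import spark.implicits._` to be in scope.
// Keep only the rows whose `list` column contains "eenie".
dataframe.where(array_contains($"list", "eenie"))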
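
For anyone who wants to try this end to end, here is a minimal sketch that can be pasted into spark-shell or run as a Scala script. It assumes the set is stored as an array<string> column (which is what collect_set produces, for instance). The sample rows, the id column, and the local session settings are made up for illustration and are not taken from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array_contains, col}

// Local session for illustration only; in spark-shell, getOrCreate() reuses the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("array-contains-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data: a Seq[String] is stored as an array<string> column.
val dataframe = Seq(
  (1, Seq("eenie", "meenie")),
  (2, Seq("miny", "moe"))
).toDF("id", "list")

// Keep only the rows whose `list` column contains "eenie".
val matches = dataframe.where(array_contains(col("list"), "eenie"))

matches.show()
```

With this sample data only the row with id 1 remains. Using col("list") instead of $"list" is just a choice that avoids depending on the implicits for the column reference; both forms should behave the same here.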