
Postgres bool_or equivalent in Spark

I'm trying to convert a Postgres query to Spark:

select
  bool_or(case when col_1 is null then false else true end),
  bool_or(col_2)
from fct_table


Below is my attempt in Spark, on a DataFrame that has col_1 and col_2:

val df = spark.table("fct_table")
df.agg(
  bool_or(when(col("col_1").isNull, false).otherwise(true)),
  bool_or(col("col_2"))
)


I'm doing this in Scala, and bool_or is not available as an aggregate function in the DataFrame API.
Any help is appreciated.

Answer

With sum:

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, count, lit, sum, when}

// true casts to 1 and false to 0, so the sum is positive iff any value is true
def bool_or(expr: Column): Column = sum(expr.cast("integer")) > 0
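
Applied to the aggregation from the question (a quick sketch; fct_table and the column names come from the original post, and the result aliases are just illustrative):

val df = spark.table("fct_table")
df.agg(
  // equivalent to bool_or(case when col_1 is null then false else true end);
  // col("col_1").isNotNull would express the same condition more directly
  bool_or(when(col("col_1").isNull, false).otherwise(true)).as("any_col_1_present"),
  bool_or(col("col_2")).as("any_col_2")
)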

With count:

// count only the rows where expr is true; when(...) is null otherwise, and count skips nulls
def bool_or(expr: Column): Column = count(when(expr, lit(1))) > 0

One caveat: count never returns null, so this variant yields false when every input is null, whereas Postgres's bool_or (and the sum variant above) returns null in that case.
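
Also worth noting: on recent Spark versions you may not need a helper at all. If I recall correctly, bool_or has been available as a SQL aggregate since Spark 3.0, so it can be reached through expr (a minimal sketch, assuming Spark 3.0+):

import org.apache.spark.sql.functions.expr

df.agg(
  expr("bool_or(col_1 is not null)"),
  expr("bool_or(col_2)")
)

And I believe Spark 3.5+ exposes bool_or directly in org.apache.spark.sql.functions, in which case the code from the question works almost as written.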