lte__ - 4 months ago
Java Question

Apache Spark - Adding two columns

Is there a way to add two columns, where the first is a date and the second contains the number of days to add to it? I'm trying:

date_add(res.col("date"), res.col("days"));


But this doesn't work, since date_add() requires a Column and an int, while I have two columns.

Thank you!

Answer

This is a limitation of the DataFrame DSL, not of the engine itself. It is not optimal, but you can replace the function call with expr, which parses a SQL expression in which both arguments of date_add can be columns:

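import org.apache.spark.sql.functions.{expr, col}
// Needed for toDF outside spark-shell; assumes a SparkSession named spark
import spark.implicits._

// One-row example: a date string and the number of days to add
val df = Seq(("2012-04-05", 6))
  .toDF("date", "days")
  .withColumn("date", col("date").cast("date"))

// expr parses the SQL expression, so both arguments can be columns
df.select(expr("date_add(date, days)"))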
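To sanity-check what the expression should return for the example row, the per-row arithmetic can be sketched in plain Java with java.time. This is only an illustration of the expected result, not Spark code: the class DateAddSketch and its dateAdd helper are hypothetical names, and the sketch assumes the date column holds ISO-8601 strings (yyyy-MM-dd) as in the example above.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of Spark: mirrors the per-row arithmetic
// of date_add(date, days) for a single (date, days) pair.
class DateAddSketch {
    static LocalDate dateAdd(String isoDate, int days) {
        // Parse an ISO-8601 date (yyyy-MM-dd) and shift it by whole days.
        return LocalDate.parse(isoDate).plusDays(days);
    }

    public static void main(String[] args) {
        // Same values as the single row in the example DataFrame.
        System.out.println(dateAdd("2012-04-05", 6)); // prints 2012-04-11
    }
}
```

If the Spark query returns something other than 2012-04-11 for that row, the date column was probably not cast to a date before date_add was applied.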