JiriS - 7 months ago
Java Question

Spark DataFrame and renaming multiple columns (Java)

Is there a nicer way to prefix or rename all or multiple columns of a given SparkSQL DataFrame at the same time than calling dataFrame.withColumnRenamed() multiple times?

An example would be detecting changes with a full outer join, where I am left with two DataFrames that have the same structure.

Answer

I suggest using the select() method for this. In fact, withColumnRenamed() itself uses select() internally. Here is an example of how to rename multiple columns:

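import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

val someDataframe: DataFrame = ...

// Columns to rename; select() keeps only the columns listed here
val initialColumnNames = Seq("a", "b", "c")
// Build one aliased Column per name, e.g. "a" becomes "renamed_a"
val renamedColumns = initialColumnNames.map(name => col(name).as(s"renamed_$name"))
// Expand the Seq into varargs; returns a new DataFrame, the original is unchanged
someDataframe.select(renamedColumns: _*)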
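Since the question asks about Java, here is a minimal sketch of the same idea for the case where every column gets the same prefix. The class ColumnPrefixer and the method prefixAll are hypothetical names made up for this illustration, and the array {"a", "b", "c"} stands in for what dataFrame.columns() would return. The name-building part is plain Java and runs without Spark; the actual rename is shown only as a comment and assumes the toDF(String...) overload, which replaces all column names positionally.

```java
import java.util.Arrays;

public class ColumnPrefixer {

    // Returns a new array with the prefix prepended to every column name,
    // preserving the original order (toDF assigns names by position).
    public static String[] prefixAll(String[] columns, String prefix) {
        return Arrays.stream(columns)
                .map(name -> prefix + name)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Stand-in for dataFrame.columns(); assumed here for illustration.
        String[] columns = {"a", "b", "c"};
        String[] renamed = prefixAll(columns, "renamed_");
        System.out.println(Arrays.toString(renamed)); // prints [renamed_a, renamed_b, renamed_c]

        // With Spark on the classpath, the rename itself would be a single call
        // (DataFrame in Spark 1.x, Dataset<Row> in Spark 2.x and later):
        //   dataFrame.toDF(prefixAll(dataFrame.columns(), "renamed_"));
    }
}
```

For the full outer join case, calling this once per side with different prefixes (for example "left_" and "right_") should give two DataFrames whose column names no longer collide. To rename only a subset of columns, the select() approach above is the better fit, because toDF needs a name for every column.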