Himaprasoon - 10 months ago
Scala Question

Scala Spark DataFrame : dataFrame.select multiple columns given a Sequence of column names

val columnName=Seq("col1","col2",....."coln");

Is there a way to use dataframe.select to get a DataFrame containing only the specified columns?
I know I can do
dataframe.select("col1", "col2", ...)
but columnName is generated at runtime.
I could call dataframe.select() repeatedly for each column name in a loop. Would that have any performance overhead? Is there a simpler way to accomplish this?

Answer Source
val columnNames = Seq("col1","col2",....."coln")

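import org.apache.spark.sql.functions.col

// using the string column names (select takes a first name plus varargs):
val result = dataframe.select(columnNames.head, columnNames.tail: _*)

// or, equivalently, using Column objects (named differently so both lines compile in one scope):
val resultFromColumns = dataframe.select(columnNames.map(c => col(c)): _*)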
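For anyone who wants to try this end to end, here is a minimal sketch of both variants on a small local DataFrame. It assumes Spark 2.x or later (for SparkSession). The object name, the helper names selectByNames and selectByColumns, the sample data, and the local master setting are all made up for illustration and are not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectColumnsExample {

  // Hypothetical helper: uses select(col: String, cols: String*),
  // so the sequence is split into its head and the remaining names.
  def selectByNames(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.head, names.tail: _*)

  // Hypothetical helper: uses select(cols: Column*),
  // so the whole sequence can be expanded as varargs at once.
  def selectByColumns(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.map(c => col(c)): _*)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .appName("select-columns-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample data with placeholder column names.
    val dataframe = Seq((1, "a", true), (2, "b", false)).toDF("col1", "col2", "col3")

    // Column names that would normally be computed at runtime.
    val columnNames = Seq("col1", "col3")

    selectByNames(dataframe, columnNames).show()
    selectByColumns(dataframe, columnNames).show()

    spark.stop()
  }
}
```

One difference worth keeping in mind: the string variant calls columnNames.head, which throws on an empty sequence, so the Column variant may be the safer choice when the list of names can be empty.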