JKC - 28 days ago
Scala Question

How to enclose the List items within double quotes in Apache Spark

I have a String variable containing a few column names separated by commas. For example:

val temp = "Col2, Col3, Col4"

I have a DataFrame that I want to group by several columns, including the columns stored in the temp variable. For example, my groupBy statement should behave like the following statement:

DF.groupBy("Col1", "Col2", "Col3", "Col4")

The temp variable may hold any column names, so I want to build a groupBy statement that picks up the value of temp dynamically, along with the column names I enter manually.

I tried the following statement, but to no avail:
DF.groupBy("Col1", temp)

Then I split the value of temp on the comma, stored the result in another variable, and tried to pass that to the groupBy statement. That fails as well.

val temp1 = temp.split(",")

DF.groupBy("Col1", temp1)

Any ideas how I can enclose the values of a List variable in double quotes and pass them to a groupBy statement?

Answer Source

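Use varargs. groupBy has an overload that takes one column name followed by a variable number of further names, so the array can be expanded into separate arguments with : _*

df.groupBy("Col1", temp1: _*)

or, passing Column objects instead of names:

import org.apache.spark.sql.functions.col

df.groupBy(("Col1" +: temp1).map(col): _*)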
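
To see what `: _*` does without a Spark session, here is a minimal plain-Scala sketch. `groupByNames` and `VarargsSketch` are hypothetical stand-ins, not part of Spark; `groupByNames` only mimics the parameter shape of `groupBy(col1: String, cols: String*)`. The sketch also trims each piece after splitting, on the assumption that names such as " Col3" (with the leading space left over from "Col2, Col3, Col4") would not match the real column names.

```scala
object VarargsSketch {
  // Hypothetical stand-in with the same parameter shape as
  // groupBy(col1: String, cols: String*); it simply returns the names it received.
  def groupByNames(col1: String, cols: String*): Seq[String] = col1 +: cols

  val temp = "Col2, Col3, Col4"

  // Split on commas and trim each piece, so " Col3" becomes "Col3".
  val temp1: Array[String] = temp.split(",").map(_.trim)

  // `: _*` expands the array into the repeated (varargs) parameter,
  // as if each element had been written out as its own argument.
  val grouped: Seq[String] = groupByNames("Col1", temp1: _*)

  def main(args: Array[String]): Unit =
    println(grouped.mkString(", ")) // prints: Col1, Col2, Col3, Col4
}
```

With a real DataFrame, the same trimmed temp1 can be passed as df.groupBy("Col1", temp1: _*). No quoting is needed, because the elements are already Strings; the double quotes in the hand-written call are just literal syntax.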