Yousuf Zaman - 2 years ago
Scala Question

Spark Scala Dynamic column selection from DataFrame

I have a DataFrame with columns of different types. From those columns, I need to select a specific subset.
A hard-coded DataFrame select statement looks like this:

val logRegrDF = df.select(col("LEBEL_COLUMN").as("label"),
  col("FEATURE_COL1"), col("FEATURE_COL2"), col("FEATURE_COL3"), col("FEATURE_COL4"))

Here LEBEL_COLUMN and the FEATURE_COL columns will be dynamic.
I have an Array or Seq of those feature columns, like this:

val FEATURE_COL_ARR = Array("FEATURE_COL1", "FEATURE_COL2", "FEATURE_COL3", "FEATURE_COL4")

I need to use this array of columns as the second part of the select statement.
In the select, the first column is fixed (LEBEL_COLUMN) and the rest come from the dynamic list.

Can you please help me get this select statement working in Scala?

The sample code below works, but I need to add the column array as the second part of the select:

val colNames = FEATURE_COL_ARR.map(name => col(name))
val logRegrDF = df.select(colNames: _*) // works, but it is not the requirement (no label column)

I think the code for the second part should look like this, but it does not work:

val logRegrDF = df.select(col("LEBEL_COLUMN").as("label"), colNames: _*)

Answer

If I understand your question correctly, this should be what you are looking for:

val allColumnsArr = "LEBEL_COLUMN" +: FEATURE_COL_ARR
// select(col: String, cols: String*) takes the first name separately
df.select(allColumnsArr.head, allColumnsArr.tail: _*)
  .withColumnRenamed("LEBEL_COLUMN", "label")
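
If you would rather stay with Column objects, as in your attempt, here is a minimal sketch. As far as I can tell, the attempt fails to compile because Scala only allows `: _*` when it is the sole argument for a varargs parameter, and select(cols: Column*) has just that one parameter, so a leading column cannot be mixed with an expanded sequence. Prepending the label column to the sequence first avoids that. The helper name labelAndFeatures, the toy data, and the local SparkSession settings are my own assumptions for illustration, not part of the question.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.col

object DynamicSelectSketch {
  // Hypothetical helper: one Seq[Column] with the renamed label column first,
  // followed by the dynamic feature columns.
  def labelAndFeatures(labelCol: String, featureCols: Seq[String]): Seq[Column] =
    col(labelCol).as("label") +: featureCols.map(col)

  def main(args: Array[String]): Unit = {
    // Local session only for this sketch; a real job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("dynamic-select-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy row standing in for the real DataFrame (assumed schema).
    val df = Seq((1.0, 0.1, 0.2, 0.3, 0.4, "extra"))
      .toDF("LEBEL_COLUMN", "FEATURE_COL1", "FEATURE_COL2",
        "FEATURE_COL3", "FEATURE_COL4", "OTHER_COL")

    val FEATURE_COL_ARR = Array("FEATURE_COL1", "FEATURE_COL2", "FEATURE_COL3", "FEATURE_COL4")

    // The whole sequence is expanded as the only argument to select(cols: Column*).
    val logRegrDF = df.select(labelAndFeatures("LEBEL_COLUMN", FEATURE_COL_ARR.toSeq): _*)

    logRegrDF.printSchema()
    spark.stop()
  }
}
```

The alias is applied inside the select here, so no separate withColumnRenamed call is needed.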

Hope this helps!
