
spark dataframe to pairedRDD in Scala

I am new to Spark and I want to convert a DataFrame to a paired RDD. My DataFrame looks like this:

tagname,value,Minute
tag1,13.87,5
tag2,32.50,10
tag3,35.00,5
tag1,10.98,2
tag5,11.0,5


I want a paired RDD of (tagname, value). I tried:

val byKey:Map[String,Long] = winowFiveRDD.map({case (tagname,value) => (tagname)->value})


I am getting the following error:

error: constructor cannot be instantiated to expected type


Help is much appreciated. Thanks in advance.

Answer

The map in the question fails because the elements of a DataFrame are Rows, not tuples, so the (tagname, value) pattern cannot be matched against them. I'd use Dataset.as instead:

import org.apache.spark.rdd.RDD
import spark.implicits._  // spark is the active SparkSession; its implicits provide toDF, $ and the tuple Encoder

val df = Seq(
  ("tag1", "13.87", "5"), ("tag2", "32.50", "10"), ("tag3", "35.00", "5"),
  ("tag1", "10.98", "2"), ("tag5", "11.0", "5")
).toDF("tagname", "value", "minute")

// Cast value to Double, switch to a typed Dataset of pairs, then drop down to its RDD
val pairedRDD: RDD[(String, Double)] = df
  .select($"tagname", $"value".cast("double"))
  .as[(String, Double)].rdd
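
If you prefer to stay at the RDD level, the same pairs can be built by pattern-matching on Row explicitly. A minimal sketch of the equivalent, assuming the three string-typed columns above (pairedFromRows is just an illustrative name):

import org.apache.spark.sql.Row

// Match each Row positionally and convert the value column by hand
val pairedFromRows: RDD[(String, Double)] = df.rdd.map {
  case Row(tagname: String, value: String, _) => (tagname, value.toDouble)
}

Either way, the usual pair-RDD operations are then available, for example summing the values per tag:

// One (tagname, total) pair per distinct tag, e.g. tag1 -> 13.87 + 10.98
val totalsByTag: RDD[(String, Double)] = pairedRDD.reduceByKey(_ + _)
totalsByTag.collect().foreach(println)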