user568109 - 3 months ago
Scala Question

How to convert rdd object to dataframe in spark

How can I convert an RDD (org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]) to a DataFrame (org.apache.spark.sql.DataFrame)? I converted a DataFrame to an RDD using .rdd. After processing it, I want it back as a DataFrame. How can I do this?

Answer

SQLContext has a number of createDataFrame methods that create a DataFrame from an RDD (in Spark 2.0 and later, SparkSession also provides createDataFrame). One of these should work for your case.

For example:

def createDataFrame(rowRDD: RDD[Row], schema: StructType): DataFrame

Creates a DataFrame from an RDD containing Rows using the given schema.
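Since the RDD in the question came from an existing DataFrame, one way to get the schema is to reuse the original one. Here is a minimal sketch of that round trip, assuming Spark 2.x or later with a SparkSession, and assuming the processing step keeps the same columns and types. The names RddToDataFrame and roundTrip, the sample data, and the column names are made up for illustration:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object RddToDataFrame {
  // Hypothetical helper: drops to an RDD[Row], applies f to every row,
  // and rebuilds a DataFrame using the schema of the original DataFrame.
  // Assumption: f returns rows that still match that schema.
  def roundTrip(spark: SparkSession, df: DataFrame)(f: Row => Row): DataFrame = {
    val schema = df.schema                    // StructType of the source DataFrame
    val processed: RDD[Row] = df.rdd.map(f)   // the "processing" step on the RDD
    spark.createDataFrame(processed, schema)  // createDataFrame(rowRDD, schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-to-dataframe")
      .master("local[*]")                     // local mode, for illustration only
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with columns (id: Int, label: String).
    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")

    // Example processing: increment id, keep label unchanged.
    val result = roundTrip(spark, df) { row =>
      Row(row.getInt(0) + 1, row.getString(1))
    }

    result.printSchema()
    result.show()
    spark.stop()
  }
}
```

With the older SQLContext API the call has the same shape: sqlContext.createDataFrame(processed, schema). If the processing changes the columns, the schema would have to be built by hand as a StructType instead of being reused.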