mongolol - 1 month ago
Scala Question

Storing Result of SQL Query in RDD

Afternoon All,

I am attempting to run some Spark SQL on a SchemaRDD and then store the result in an RDD. The line below produces the expected values, so I know the SQL is generating the correct table. Now I just need to store it.

sqlContext.sql("select encounter.Member_ID AS patientID, encounter.Encounter_DateTime AS date, diag.code from encounter join diag on encounter.Encounter_ID = diag.Encounter_ID").show(1)

Answer

sqlContext.sql returns a DataFrame; you can call .rdd on it (no parentheses in Scala) to get an RDD[Row].

You can try this:

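 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.Row

 // Keep the DataFrame returned by the query instead of printing it
 val queryResult = sqlContext.sql("select encounter.Member_ID AS patientID, encounter.Encounter_DateTime AS date, diag.code from encounter join diag on encounter.Encounter_ID = diag.Encounter_ID")

 // Convert the DataFrame to an RDD of generic Row objects
 val rdd: RDD[Row] = queryResult.rdd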

Remove the show call on the DataFrame: show only prints the DataFrame's contents to stdout and returns Unit, so there is nothing to store.
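If you then want typed values rather than generic Row objects, you can map over the RDD[Row]. Here is a minimal sketch of the whole flow on a local Spark 1.x-style setup. The case classes, the sample rows, and the assumption that all three selected columns are strings are hypothetical, since the question does not show the real schemas; adjust the getters to your actual column types.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}

// Hypothetical schemas standing in for the real encounter and diag tables.
case class Encounter(Encounter_ID: String, Member_ID: String, Encounter_DateTime: String)
case class Diag(Encounter_ID: String, code: String)

object QueryToRdd {
  // Assumes every selected column is a string; reads columns by position.
  def toTuple(row: Row): (String, String, String) =
    (row.getString(0), row.getString(1), row.getString(2))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("query-to-rdd").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Register made-up sample data under the table names used in the query.
    sc.parallelize(Seq(Encounter("e1", "m1", "2015-01-01 10:00:00")))
      .toDF().registerTempTable("encounter")
    sc.parallelize(Seq(Diag("e1", "250.00")))
      .toDF().registerTempTable("diag")

    val queryResult = sqlContext.sql(
      "select encounter.Member_ID AS patientID, " +
      "encounter.Encounter_DateTime AS date, diag.code " +
      "from encounter join diag on encounter.Encounter_ID = diag.Encounter_ID")

    // DataFrame -> RDD[Row] -> RDD of typed tuples
    val rows: RDD[Row] = queryResult.rdd
    val typed: RDD[(String, String, String)] = rows.map(toTuple)

    typed.collect().foreach(println)
    sc.stop()
  }
}
```

Reading columns by position keeps the sketch independent of the alias names; row.getAs[String]("patientID") is an alternative if you prefer to look fields up by name.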