Gavin Niu - 22 days ago
Scala Question

Return Temporary Spark SQL Table in Scala

First I convert a CSV file to a Spark DataFrame using

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/usr/people.csv")


After that, when I type df and press Return, I see
res30: org.apache.spark.sql.DataFrame = [name: string, age: string, gender: string, deptID: string, salary: string]


Then I use
df.registerTempTable("people")
to convert df to a Spark SQL table.

But after that when I do
people
Instead of getting a table, I got
<console>:33: error: not found: value people
Is it because people is a temporary table? Thanks

Answer

When you register a temp table using the registerTempTable command you used, it becomes available by name inside your SQLContext. It does not create a Scala value called people.

This means that the following is incorrect and will give you the error you are getting:

scala> people.show
<console>:33: error: not found: value people

To use the temp table, you'll need to query it through your sqlContext. Example:

scala> sqlContext.sql("select * from people")
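A slightly fuller sketch of the same idea, assuming a Spark 1.x spark-shell where sqlContext is already defined. The two-row in-memory DataFrame is a hypothetical stand-in for the CSV file from the question, and the val names are only illustrative:

```scala
// Assumption: running inside a Spark 1.x spark-shell, so `sqlContext` already exists.
import sqlContext.implicits._

// Hypothetical in-memory data standing in for /usr/people.csv.
val df = Seq(("Alice", "30"), ("Bob", "25")).toDF("name", "age")

// Registers the name "people" in the SQLContext catalog.
// No Scala value called `people` is created by this call.
df.registerTempTable("people")

// Look the table up by name to get a DataFrame back.
val people = sqlContext.table("people")
people.show()

// The same table queried through SQL; the result is also a DataFrame.
val alice = sqlContext.sql("SELECT name, age FROM people WHERE name = 'Alice'")
alice.show()

// List the table names currently registered in the catalog.
sqlContext.tableNames().foreach(println)
```

If you want to write people.show, bind the result of sqlContext.table("people") to a val first, as above.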

Note: df.registerTempTable("df") registers a temporary table named df that corresponds to the DataFrame df you call the method on.

So calling persist on df persists the DataFrame, not the table, even though the SQLContext will be using that DataFrame.
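If you would rather manage caching by table name, SQLContext also has cacheTable, isCached and uncacheTable. A minimal sketch, again assuming a Spark 1.x spark-shell with sqlContext predefined and using hypothetical in-memory data:

```scala
// Assumption: running inside a Spark 1.x spark-shell, so `sqlContext` already exists.
import sqlContext.implicits._

// Hypothetical data; in the question this would be the CSV-backed DataFrame.
val df = Seq(("Alice", "30"), ("Bob", "25")).toDF("name", "age")
df.registerTempTable("people")

// Mark the table for in-memory caching, addressed by its registered name.
sqlContext.cacheTable("people")

// Ask the SQLContext whether that table name is currently marked as cached.
val cachedNow = sqlContext.isCached("people")
println(cachedNow)

// Release the cached data when it is no longer needed.
sqlContext.uncacheTable("people")
```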