Mohit Bansal - 1 year ago
R Question

SparkR reading and writing dataframe issue

I have a Spark DataFrame that I want to write to disk. I used the following code:


It completed, and I can see a new folder created with a file in it.

Now when I try to read from the same file, using the following code:


I get the following error:

ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils
failed Error in invokeJava(isStatic = TRUE, className, methodName,
...) : org.apache.spark.sql.AnalysisException: Unable to infer
schema for ParquetFormat at dataframe.csv. It must be specified

I have even tried using repartition


Any help?

Answer

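You also have to specify the source as "csv" when reading. Without it, read.df falls back to the default data source, Parquet, which is why the error mentions ParquetFormat: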

dataframe2<-read.df("dataframe_temp.csv", source="csv")
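For context, here is a minimal round-trip sketch, assuming a local Spark 2.x installation with SparkR available. The local master and the built-in `faithful` dataset are stand-ins for illustration and are not from the original question; only `source="csv"` and the path come from the answer above.

```r
library(SparkR)

# Start a local Spark session (assumption: Spark 2.x, run locally).
sparkR.session(master = "local[1]")

# Stand-in data: R's built-in `faithful` dataset as a Spark DataFrame.
df <- createDataFrame(faithful)

# Write as CSV. The path becomes a directory containing part files.
write.df(df, path = "dataframe_temp.csv", source = "csv", mode = "overwrite")

# Read it back, naming the source explicitly so Spark does not
# fall back to the default Parquet reader.
dataframe2 <- read.df("dataframe_temp.csv", source = "csv")

head(dataframe2)
```

Since no header is involved here, the columns read back will have generated names rather than the original ones.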

Regarding the header argument:

Currently there is also a bug in SparkR for Spark 2.0: the variable arguments of the write.df function aren't passed to the options parameter. That's why the header is not written to the CSV even if you specify header="true" in write.df.
