Boern Boern - 1 year ago 230
Scala Question

Spark & Scala: Read in .CSV as DataFrame

Coming from the [...] world, I want to import a .csv into Spark (v1.6.1) using the Scala shell (./spark-shell).

My .csv has a header and looks like



Answer Source

Spark 2.0+

Since databricks/spark-csv has been integrated into Spark, reading .csv files is straightforward:

val df = spark.read.option("header", true).csv(path)
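A slightly fuller sketch of the same Spark 2.0+ read may help. It assumes a local session and a small made-up sample file (neither comes from the question), and it also turns on `inferSchema` so column types are guessed from the data instead of all being read as strings. The names `ReadCsvExample` and `readCsvWithHeader` are hypothetical. In spark-shell the `spark` session already exists, so only the helper body is needed there.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadCsvExample {
  // Reads a CSV whose first line is a header and lets Spark infer column types.
  def readCsvWithHeader(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")      // first line holds the column names
      .option("inferSchema", "true") // extra pass over the data to guess types
      .csv(path)

  def main(args: Array[String]): Unit = {
    // Built here only so the sketch runs standalone; spark-shell provides `spark`.
    val spark = SparkSession.builder()
      .appName("read-csv-example")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical sample data, not the file from the question.
    val sample = Files.createTempFile("sample", ".csv")
    Files.write(sample, "col1,col2,col3\n1.5,abc,3\n2.5,def,4\n".getBytes(StandardCharsets.UTF_8))

    val df = readCsvWithHeader(spark, sample.toString)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

Without `inferSchema`, every column comes back as a string, which is cheaper because the file is scanned only once.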

Older versions

After restarting my spark-shell I figured it out myself. It may be of help to others:

After installing as described here and starting the spark-shell with ./spark-shell --packages com.databricks:spark-csv_2.11:1.4.0:

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
scala> val df = sqlContext.read.format("com.databricks.spark.csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load(path) // path to the .csv file(s)
scala> df.printSchema()
root
 |-- col1: double (nullable = true)
 |-- col2: string (nullable = true)
 |-- col3: integer (nullable = true)
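If inference is slow on large files or guesses a type wrongly, the spark-csv reader also accepts an explicit schema through `.schema(...)`. The following is a minimal sketch under that assumption: the column names and types are copied from the `printSchema()` output above, while the object name `ExplicitSchemaExample` and the helper `readCsv` are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.types.{DoubleType, IntegerType, StringType, StructField, StructType}

object ExplicitSchemaExample {
  // Mirrors the inferred schema shown above: double, string, integer.
  val schema: StructType = StructType(Seq(
    StructField("col1", DoubleType, nullable = true),
    StructField("col2", StringType, nullable = true),
    StructField("col3", IntegerType, nullable = true)
  ))

  // Reads a headered CSV with the fixed schema instead of inferring one.
  def readCsv(sqlContext: SQLContext, path: String): DataFrame =
    sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true") // skip the header line; names come from `schema`
      .schema(schema)
      .load(path)
}
```

In the shell this would be called as `ExplicitSchemaExample.readCsv(sqlContext, path)`, reusing the `sqlContext` created earlier in the session.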