Stanley Stanley - 1 year ago 99
Scala Question

Spark Shell Import Fine, But Throws Error When Referencing Classes

I am a beginner in Apache Spark, so please excuse me if this is quite trivial.

Basically, I was running the following imports in the Spark shell:


import org.apache.spark.sql.{DataFrame, Row, SQLContext, DataFrameReader}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql._
import org.apa‌​;

val rdd = sc.hadoopFile(path,

The import statements before the OrcInputFormat one work fine, but the last one fails with:

error: object apa‌​che is not a member of package org
import org.apa‌​;

This does not make sense, given that the preceding import statements go through without any issue.

In addition, when referencing OrcInputFormat, I was told:

error: type OrcInputFor‌​mat is not a member of package

It seems strange for the import to work (I assume it works, since no error is thrown) and then for the above error message to turn up. Basically, I am trying to read ORC files from S3.

I would also like to understand what I have done wrong, and why this happens.

What I have done:

  1. I have tried running
    with the
    option, and tried importing
    (My current version of Spark is 1.6.1, compiled with Hadoop 2.6)

  2. val df ="orc").load(PathToS3)
    , as suggested in (Read ORC files directly from Spark shell). I have tried the S3, S3n, and S3a variants, without any success.

Answer Source

You have two non-printing characters between org.apa and che in the last import, almost certainly due to a copy-paste:

import org.apa‌​;

Just retype the last import statement by hand and it will work. Also, you don't need the semicolons.
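
If you want to check a pasted line before retyping it, something along these lines can be run in the Scala REPL. This is only a minimal sketch: hiddenChars, isHidden, pasted and cleaned are hypothetical names made up for the illustration, and the sample string is a stand-in for your import, with the two invisible characters written as escapes so they can be seen.

```scala
// Hypothetical helpers (not part of Spark): find characters that are
// invisible in most editors, i.e. Unicode "format" or control characters.
def isHidden(c: Char): Boolean =
  Character.getType(c) == Character.FORMAT || Character.isISOControl(c)

// Returns the position and code point of every hidden character in the line.
def hiddenChars(s: String): Seq[(Int, String)] =
  s.zipWithIndex.collect {
    case (c, i) if isHidden(c) => (i, f"U+${c.toInt}%04X")
  }

// Stand-in for a badly pasted import: a zero-width non-joiner and a
// zero-width space sit between "apa" and "che".
val pasted = "import org.apa\u200C\u200Bche.spark.sql._"

// Positions and code points of the offending characters.
val found = hiddenChars(pasted)

// The same line with the hidden characters stripped out.
val cleaned = pasted.filterNot(isHidden)
```

For the sample string, found holds the two entries (14, U+200C) and (15, U+200B), and cleaned is the plain import line that the compiler accepts.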

You have the same problem with OrcInputFormat:

error: type OrcInputFor‌​mat is not member of package

Funnily enough, in the mobile version of Stack Overflow those non-printing characters are clearly visible.
