indian_authority - 4 months ago
Scala Question

Initialising HiveContext in Spark CLI

When initialising Spark in the command-line interface, by default SparkContext is initialised as sc and SQLContext as sqlContext.

But I need HiveContext because I am using the function collect_list, which is not supported by SQLContext but is supported by HiveContext. Since HiveContext is a subclass of SQLContext, it should have worked, but it doesn't.

HOW DO I INITIALISE HiveContext in Scala using Spark CLI?


In spark-shell, sqlContext is an instance of HiveContext by default. You can read about that in my previous answer here.
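To check this in your own shell, or to build the context by hand, something like the following sketch should work. It assumes spark-shell on Spark 1.x with Hive support compiled in, and it relies on the `sc` and `sqlContext` values the shell predefines. The name `hiveContext` is just an illustrative choice.

```scala
// Paste into spark-shell (Spark 1.x); `sc` and `sqlContext` are predefined by the shell.
import org.apache.spark.sql.hive.HiveContext

// Print the runtime class of the shell's default SQL context,
// to see whether it is already a HiveContext.
println(sqlContext.getClass.getName)

// Create a HiveContext explicitly from the existing SparkContext.
// Assumption: this Spark build includes the Hive classes.
val hiveContext = new HiveContext(sc)
```

If the printed class is already org.apache.spark.sql.hive.HiveContext, the explicit construction is not needed.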

Nevertheless, collect_list isn't available in Spark 1.5.2. It was introduced in Spark 1.6, so it's normal that you can't find it.
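Once you are on Spark 1.6 or later, a call along these lines should work. This is only a sketch: it assumes spark-shell with `sqlContext` being a HiveContext, and the DataFrame `df` with its column names `key` and `value` is made-up sample data.

```scala
// Assumes spark-shell on Spark 1.6+, where `sqlContext` is a HiveContext.
// The imports are redundant in the shell but make the snippet explicit.
import org.apache.spark.sql.functions.collect_list
import sqlContext.implicits._

// Hypothetical sample data: (key, value) pairs.
val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")

// Gather all values for each key into a single array column.
val grouped = df.groupBy("key").agg(collect_list("value").as("values"))
grouped.show()
```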

Also, you don't need to import org.apache.spark.sql.functions._ in the shell; it is imported by default.