Devi Devi - 1 year ago 155
Scala Question

Getting a connection error while reading data from Elasticsearch using Apache Spark & Scala

I am using the following code:

import org.elasticsearch.spark._ // provides esRDD on SparkContext

val conf = new org.apache.spark.SparkConf()
  .set("es.nodes", "")

val sc = new org.apache.spark.SparkContext(conf)
val resource = "index/data"
val count = sc.esRDD(resource).count()


Elasticsearch version = 1.5.2
Spark version = 1.5.2
Scala version = 2.10.4

and I declared the library dependency as follows:

libraryDependencies += "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.1.3"

I get the following error when running the program:

Exception in thread "main" Connection error (check network and/or proxy settings)- all nodes failed

How can I read data from Elasticsearch using Spark and Scala?

Answer Source

Please look at the option "es.nodes.wan.only". By default this key is set to "false"; when I set it to "true", the exception went away. The option is described in the elasticsearch-hadoop configuration documentation. With it set, the configuration becomes:

val conf = new org.apache.spark.SparkConf()
 .set("es.nodes", "")
 .set("es.nodes.wan.only", "true")

Note that the documentation recommends setting this value to "true" for cloud or restricted environments such as AWS, but I hit this exception simply when pointing Spark at a VM running Elasticsearch.
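
For reference, the pieces above can be put together roughly as follows. This is only a sketch under assumptions, not a verified fix for every setup: the host name "localhost", the app name, and the local master are hypothetical placeholders that do not come from the question, and the port is the usual Elasticsearch HTTP default. With "es.nodes.wan.only" enabled, the connector talks only to the nodes listed in "es.nodes" instead of trying to discover and connect to the other cluster nodes, which is why it can help when those internal addresses are not reachable from the Spark machine. It is worth checking that the connector version you depend on documents this option.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // brings esRDD into scope on SparkContext

object ReadFromEs {
  def main(args: Array[String]): Unit = {
    // Hypothetical placeholder: replace with the address of your Elasticsearch node.
    val esHost = "localhost"

    val conf = new SparkConf()
      .setAppName("read-from-es")        // assumed name, for illustration only
      .setMaster("local[*]")             // assumed local run; omit when using spark-submit
      .set("es.nodes", esHost)
      .set("es.port", "9200")            // default Elasticsearch HTTP port
      .set("es.nodes.wan.only", "true")  // connect only through es.nodes, no node discovery

    val sc = new SparkContext(conf)
    try {
      // "index/data" is the resource from the question (index name / type name).
      val count = sc.esRDD("index/data").count()
      println(s"Documents in index/data: $count")
    } finally {
      sc.stop() // release Spark resources even if the read fails
    }
  }
}
```

If the same connection error still appears with this setting, the node in "es.nodes" is probably not reachable at all from the driver or executors, so the host, port, and any firewall or proxy rules are the next things to check.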
