abi_pat - 1 year ago
Java Question

Task fails in Spark with ClassNotFoundException

I am trying to write a simple Java program that reads data from Cassandra via Spark. This is a proof of concept (POC). My code looks like this:

// Excerpt from PerformerClass. javaFunctions and mapColumnTo are static imports
// from com.datastax.spark.connector.japi.CassandraJavaUtil.
String keyspace = "newkspace1";
String tablename = "newtable5";

public static void main(String[] args) {
    SparkConf conf = new SparkConf();
    conf.setAppName("Cassandra Demo");
    conf.set("spark.cassandra.connection.host", "");
    conf.set("spark.cassandra.connection.native.port", "9041");
    conf.set("spark.cassandra.connection.rpc.port", "9160");
    PerformerClass app = new PerformerClass(conf);
    app.run();
}

private void run() {
    // conf is the SparkConf passed to the constructor
    JavaSparkContext sc = new JavaSparkContext(conf);
    showResults(sc);
}

private void showResults(JavaSparkContext sc) {
    // Read the two selected columns as (key, value) pairs
    CassandraJavaPairRDD<Integer, Integer> rdd1 = javaFunctions(sc)
        .cassandraTable(keyspace, tablename, mapColumnTo(Integer.class), mapColumnTo(Integer.class))
        .select("keyval", "rangefield");

    List<Integer> lst = rdd1.keys().toArray();
    for (Integer l : lst) {
        System.out.println(l);
    }
}

When I ran the code above, I got the following exception (stack trace below):

15/01/15 19:22:41 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, ct-0094): java.lang.ClassNotFoundException: com.datastax.spark.connector.rdd.partitioner.CassandraPartition
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:340)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)

What am I missing?

Answer

I solved it myself by going through some blogs.
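For context, the exception in the question is raised when a JVM is asked to deserialize a task that refers to a class its class loader cannot find. The sketch below reproduces that failure mode with the JDK alone, so it needs neither Spark nor Cassandra. The names MissingClassDemo and canLoad are made up for illustration, and the empty class loader is only a stand-in, under that assumption, for an executor that never received the application jar.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class MissingClassDemo {
    // Class named in the stack trace from the question.
    static final String CONNECTOR_CLASS =
        "com.datastax.spark.connector.rdd.partitioner.CassandraPartition";

    /** Returns true if the given loader can resolve the class name. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // A loader with no jars and no application parent (hypothetical stand-in
        // for an executor that was never sent the application jar).
        try (URLClassLoader bare = new URLClassLoader(new URL[0], null)) {
            System.out.println("JDK class found:       " + canLoad(bare, "java.lang.String"));
            System.out.println("connector class found: " + canLoad(bare, CONNECTOR_CLASS));
        }
    }
}
```

The JDK class resolves because it comes from the bootstrap loader, while the connector class does not, because no jar containing it was ever given to the loader.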

The jar of the program must be added to the Spark context from within the program itself (for example with addJar), right after the context is created:

JavaSparkContext sc = new JavaSparkContext(conf);

This solves the problem, but every time before running the code you have to run a Maven install (that is, build the jar) and then run it. I am still looking for a better approach that avoids these steps.
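The jar still has to be built, but its path does not have to be hard-coded. As a rough sketch, the JVM can report where a class was loaded from, and that location can then be handed to the Spark context. The names JarLocator and locationOf are hypothetical, and the sketch assumes the fix is a call such as JavaSparkContext.addJar(String), which appears here only in a comment so the code runs without Spark. Spark's Java API also has JavaSparkContext.jarOfClass for the same purpose.

```java
import java.security.CodeSource;
import java.util.Optional;

public class JarLocator {
    /**
     * Returns the jar file or classes directory the given class was loaded from,
     * or empty if the JVM does not report one (as for core JDK classes).
     * The path comes straight from the URL, so special characters stay URL-encoded.
     */
    static Optional<String> locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return Optional.empty();
        }
        return Optional.of(source.getLocation().getPath());
    }

    public static void main(String[] args) {
        Optional<String> location = locationOf(JarLocator.class);
        System.out.println("Loaded from: " + location.orElse("(unknown)"));
        // Assumption: with Spark on the classpath and sc being the JavaSparkContext
        // from the question, a location ending in .jar could be registered with
        //   location.ifPresent(sc::addJar);
    }
}
```

When the program is started from an IDE, the reported location is usually a classes directory rather than a jar, so this only helps when the application is launched from the packaged jar.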
