Knows Not Much - 1 year ago
Scala Question

join CassandraTableScanRDD[CassandraRow] with RDD[String]

I am writing a program in which I have an RDD[String] and a CassandraTableScanRDD, and I want to do a left join between them.

Is this possible? From what I saw online, the joins shown only happen between one CassandraTableScanRDD and another.

Answer Source

The join functions are available for PairRDD objects.

A PairRDD object is an RDD of key-value pairs, for example: RDD[(Int, String)]

Typically you create a PairRDD object from a regular RDD using the keyBy function, which allows you to specify which key to use. Then when you run join, it joins elements whose keys are equal.
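
As a rough illustration of the keyBy-then-join approach described above, here is a minimal sketch that runs on plain Spark in local mode. Since the question asks for a left join, it uses leftOuterJoin, which keeps every element of the left RDD and wraps the right-hand match in an Option. The object name, the sample data, and the (id, name) row shape are made up for the example; the Cassandra side is replaced by an ordinary RDD so the snippet does not need a cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object KeyByJoinSketch {

  // Left-joins an RDD[String] with an RDD of (id, name) rows.
  // The (id, name) tuple is a hypothetical stand-in for a CassandraRow.
  def leftJoin(
      ids: RDD[String],
      rows: RDD[(String, String)]
  ): RDD[(String, (String, Option[(String, String)]))] = {
    val idsByKey  = ids.keyBy(identity) // key each string by itself
    val rowsByKey = rows.keyBy(_._1)    // key each row by its id field
    // Every id is kept; ids with no matching row get None on the right.
    idsByKey.leftOuterJoin(rowsByKey)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("keyby-join-sketch").setMaster("local[*]")
    val sc   = new SparkContext(conf)
    try {
      val ids  = sc.parallelize(Seq("a", "b", "c"))
      val rows = sc.parallelize(Seq(("a", "Alice"), ("b", "Bob")))
      leftJoin(ids, rows).collect().foreach(println)

      // With the Cassandra connector on the classpath, the right-hand side
      // could be built along these lines (keyspace, table, and column names
      // are assumptions, not taken from the question):
      //   import com.datastax.spark.connector._
      //   val rowsByKey = sc.cassandraTable("my_keyspace", "my_table")
      //     .map(row => (row.getString("id"), row))
    } finally {
      sc.stop()
    }
  }
}
```

Here the element "c" has no matching row, so it still appears in the result with None on the right-hand side, which is the behaviour you would expect from a left join.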
