Sotos Sotos - 4 months ago 12
Scala Question

Apply function to Cartesian RDDs

I am trying to apply a function to the Cartesian product of two RDDs. The function is taken from here, and I have no idea how to make it work on the resulting RDD of pairs.

val combined = rdd_valid.cartesian(rdd1)
combined.collect().foreach(a => println(a))

(afghr, decsvt)

My first thought was to do

val newRDD =

But it doesn't work.


Assuming combined has the type RDD[(String, String)], and Levenshtein.distance has this signature:

def distance(s1: String, s2: String)

You can apply it as follows:

val newRDD = combined.map { case (s1, s2) => Levenshtein.distance(s1, s2) }

Or, alternatively:

val newRDD = combined.map(t => Levenshtein.distance(t._1, t._2))
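
For reference, here is a minimal end-to-end sketch of the same idea running in local mode. The Levenshtein object below is a hypothetical stand-in, since the implementation linked in the question is not shown; I am assuming it returns an Int. The input values, the app name and the local[*] master are also just illustrative assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for the Levenshtein object from the question.
// The real implementation is not shown there; the Int return type is assumed.
object Levenshtein {
  def distance(s1: String, s2: String): Int = {
    // prev(j) = edit distance between the prefix of s1 handled so far and s2.take(j)
    var prev = Array.tabulate(s2.length + 1)(identity)
    for (i <- 1 to s1.length) {
      val curr = new Array[Int](s2.length + 1)
      curr(0) = i
      for (j <- 1 to s2.length) {
        val cost = if (s1(i - 1) == s2(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(s2.length)
  }
}

object CartesianDistance {
  def main(args: Array[String]): Unit = {
    // Local-mode context, for illustration only
    val conf = new SparkConf().setAppName("cartesian-distance").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd_valid = sc.parallelize(Seq("afghr"))
      val rdd1 = sc.parallelize(Seq("decsvt"))

      // RDD[(String, String)]: every element of rdd_valid paired with every element of rdd1
      val combined = rdd_valid.cartesian(rdd1)

      // map receives ONE argument (the pair), so destructure it before calling distance
      val withDistance = combined.map { case (s1, s2) => (s1, s2, Levenshtein.distance(s1, s2)) }

      withDistance.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The key point is that map on an RDD[(String, String)] passes each pair as a single tuple, while distance takes two separate parameters, so the tuple has to be unpacked with either a case pattern or t._1 / t._2. Keeping the original strings in the result, as done here, is optional; it just makes the output easier to read.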