Haji Akhundov - 1 month ago
Scala Question

Mapping in Spark Scala

I am new to Spark and Scala and to this kind of programming in general.

What I want to accomplish is the following:

I have an RDD of type org.apache.spark.rdd.RDD[(Double, Iterable[String])].

So the possible content could be:

<1, (A,B,C)>
<42, (A)>
<0, (C,D)>


I need to transform this into a new RDD with one (key, value) pair per element, so that the output looks like this:

<1, A>
<1, B>
<1, C>
<42, A>
<0, C>
<0, D>


This has to be very simple, but I tried so many different ways and couldn't get it right.

Answer

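You can use flatMapValues. It applies a function to each value and emits one (key, element) pair for every element of the result, leaving the keys unchanged. Since the values here are already collections, the identity function is enough: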

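import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on Spark versions before 1.3
import org.apache.spark.rdd.RDD

val r: RDD[(Double, Iterable[String])] = ...
// Emit one (key, element) pair per element of each value collection
val flat: RDD[(Double, String)] = r.flatMapValues(x => x)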
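
To see what this does without a Spark cluster, here is a minimal sketch of the same transformation on plain Scala collections. The names FlatMapValuesSketch and flattenValues are hypothetical and exist only for this illustration; they are not part of Spark. The flatMap expression inside is the same shape you could write on an RDD as an alternative to flatMapValues.

```scala
object FlatMapValuesSketch {
  // Hypothetical helper: plain-collection stand-in for flatMapValues(x => x).
  // Produces one (key, element) pair per element of each value collection.
  def flattenValues[K, V](pairs: Seq[(K, Iterable[V])]): Seq[(K, V)] =
    pairs.flatMap { case (key, values) => values.map(value => (key, value)) }

  def main(args: Array[String]): Unit = {
    // Sample data taken from the question
    val input: Seq[(Double, Iterable[String])] = Seq(
      (1.0, Seq("A", "B", "C")),
      (42.0, Seq("A")),
      (0.0, Seq("C", "D"))
    )
    // Print each flattened pair on its own line
    flattenValues(input).foreach { case (k, v) => println(s"<$k, $v>") }
  }
}
```

On an actual RDD, r.flatMap { case (k, vs) => vs.map(v => (k, v)) } should give the same pairs, but flatMapValues is shorter and states the intent directly.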