Ethan Xu - 5 days ago
Scala Question

Cartesian product of values for each key

Given a pair RDD, how do I generate another RDD with the same key set, whose values are the Cartesian product of the original values for each key?

Here is what I mean:

(K1, V1)
(K1, V2)
(K2, W1)
(K2, W2)

should become:

(K1, (V1, V1))
(K1, (V1, V2))
(K1, (V2, V2))
(K2, (W1, W1))
(K2, (W1, W2))
(K2, (W2, W2))
Note: (V2, V1) and (W2, W1) are not required, but it is fine if they appear in the result.

Being new to Scala and Spark, I don't see an easy solution using the built-in transformations. Am I missing some magic function? Thanks a lot.


Just join the RDD with itself:
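
A minimal sketch of that self-join. It assumes Spark 1.3 or later, where the pair-RDD operations are picked up implicitly on any `RDD[(K, V)]` (older versions also need `import org.apache.spark.SparkContext._`). The names `SelfJoinSketch`, `selfJoin`, and `unorderedPairs` are made up for illustration. The extra filter is my own addition for the case where you do not want both `(V1, V2)` and `(V2, V1)`; it assumes the values have an `Ordering` and are distinct within each key.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object SelfJoinSketch {
  // Joining an RDD with itself pairs every value with every value that
  // shares its key, so the result holds (v, v), (a, b) and (b, a).
  def selfJoin[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]): RDD[(K, (V, V))] =
    rdd.join(rdd)

  // Optional step (not part of the plain self-join): keep only pairs with
  // a <= b, which drops the mirrored (b, a) entries.
  def unorderedPairs[K: ClassTag, V: ClassTag: Ordering](rdd: RDD[(K, V)]): RDD[(K, (V, V))] = {
    val ord = implicitly[Ordering[V]]
    selfJoin(rdd).filter { case (_, (a, b)) => ord.lteq(a, b) }
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholders for a quick test run.
    val conf = new SparkConf().setMaster("local[2]").setAppName("self-join-sketch")
    val sc = new SparkContext(conf)
    try {
      val pairs = sc.parallelize(Seq(("K1", "V1"), ("K1", "V2"), ("K2", "W1"), ("K2", "W2")))
      // Sorted only to make the printed order stable.
      unorderedPairs(pairs).collect().sorted.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the mirrored pairs are acceptable, `selfJoin` alone is enough. Keep in mind that the join produces n² pairs for a key with n values, so a few very large keys can dominate the run time.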