Sotos - 6 months ago
Scala Question

Get unique RDD strings

I have created the following sample RDD,

val rdd = sc.parallelize(List((""),

//I used the following to split,

val rdd1 = rdd.map(_.split("@")) // RDD[Array[String]]

What I am trying to do now is to get a new RDD with distinct domains, i.e.

val finalrdd = sc.parallelize(List(("domainA"),

I found this post but I couldn't get it to work.


Try:

rdd.map(_.split("@")).flatMap { case Array(_, d) => d.split("\\.").headOption }.distinct
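
To see what that chain does without a Spark context, here is a minimal sketch that runs the same steps on a plain Scala List, which shares map, flatMap and distinct with RDD. The sample addresses and the names emails and domains are made up for illustration, since the original list contents are not shown in the question. The extra catch-all case is also an addition: without it, a string that does not split into exactly two parts would raise a MatchError.

```scala
// Hypothetical sample data; the addresses in the question are not shown.
val emails = List("alice@domainA.com", "bob@domainB.org", "carol@domainA.com")

// Same steps as the suggested RDD chain, applied to a local List.
val domains = emails
  .map(_.split("@"))                               // Array(user, host) per address
  .flatMap {
    case Array(_, d) => d.split("\\.").headOption  // part of the host before the first dot
    case _           => None                       // skip strings that do not split into two parts
  }
  .distinct                                        // drop repeated domains

println(domains)
```

On an actual RDD the same expression should give an RDD[String], and calling collect() on it brings the values back to the driver for inspection. Unlike List.distinct, distinct on an RDD does not guarantee any particular order.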