Sotos - 1 year ago
Scala Question

Get unique RDD strings

I have created the following sample RDD,

val rdd = sc.parallelize(List((""),

//I used the following to split,

val rdd1 = rdd.map(_.split("@")) // RDD[Array[String]]

What I am trying to do now is to get a new RDD with distinct domains, i.e.

val finalrdd = sc.parallelize(List(("domainA"),

I found this post but I couldn't get it to work.

Answer Source

Try:

rdd.map(_.split("@")).flatMap { case Array(_, d) => d.split("\\.").headOption }.distinct
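
To see what that chain does step by step, here is a minimal sketch that runs on a plain Scala List instead of an RDD, so it needs no Spark context. The sample addresses, the object name DistinctDomains and the helper distinctDomains are made up for illustration, since the question's original list did not survive. The extra `case _ => None` branch is also an addition: it skips strings that do not split into exactly two parts rather than failing with a MatchError.

```scala
object DistinctDomains {
  // Hypothetical helper: returns the first label of each address's domain, without duplicates.
  def distinctDomains(addresses: List[String]): List[String] =
    addresses
      .map(_.split("@"))                               // "user@host.tld" -> Array("user", "host.tld")
      .flatMap {
        case Array(_, d) => d.split("\\.").headOption  // keep the part before the first dot
        case _           => None                       // assumption: silently drop malformed input
      }
      .distinct                                        // keeps the first occurrence of each domain

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data, not the list from the question.
    val emails = List(
      "alice@domainA.com",
      "bob@domainB.org",
      "carol@domainA.com",
      "not-an-email"
    )
    println(distinctDomains(emails)) // List(domainA, domainB)
  }
}
```

The same three operators (map, flatMap, distinct) exist on RDD, so the body should carry over to `rdd` unchanged. Note that distinct on an RDD shuffles data and does not guarantee any particular order in the result.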