Christian Zichichi - 25 days ago
Scala Question

Manipulating Vectors and Lists in RDDs

I'm new to Spark and Scala and I really need some help with the following RDD transformation:

INPUT
(macAddress,Vector(List(ts1,ts2),List(ts2,ts3),List.....)

(c8:3a:bv:b1:3a:e0,Vector(List(1472820071, 1472821088), List(1472821088, 1472821429), List(1472821429, 1472824217)))


DESIRED OUTPUT (macAddress,Vector(intvalue,intvalue,...))

(c8:3a:bv:b1:3a:e0,Vector(1472821088-1472820071, 1472821429-1472821088,1472824217-1472821429))


In short, I have an RDD already grouped by key (macAddress) whose values are paired Lists. I need to transform the Vector of Lists into a Vector containing the pairwise differences computed from each List (secondElement - firstElement). The number of paired Lists in the Vector varies across the RDD (it depends on the macAddress).

I don't know which transformation I have to use in this case.

Thanks

Answer

Adjust the types below to match your data:

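  // Maps one (macAddress, pairs) record to (macAddress, formatted pairs).
  // The MAC address is assumed to be a String; adjust if your key type differs.
  def flattenRDDElements(x: (String, Vector[List[Int]])): (String, Vector[String]) =
    x match {
      case (s, y) => (s, y.map(switchListElements))
    }

  // Formats a two-element list as the string "second-first".
  // To get the numeric difference instead, return b - a and change the result type to Int.
  def switchListElements(x: List[Int]): String = x match {
    case a :: b :: Nil => s"$b-$a"
    case other => throw new IllegalArgumentException(s"expected exactly two elements, got: $other")
  }

  rdd.map(r => flattenRDDElements(r))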
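
The question asks for numeric differences, whereas the helper above builds strings of the form "second-first". A minimal sketch of the numeric variant follows. It rests on several assumptions: the key is a `String`, the timestamps are `Long`, and lists that do not contain exactly two elements are silently skipped. The names `PairDiffs`, `pairDifferences`, `sample` and `result` are made up for illustration, and a plain Scala `Seq` stands in for the RDD so the snippet runs without a Spark dependency.

```scala
object PairDiffs {
  // Difference (second - first) for each two-element list.
  // Lists without exactly two elements are dropped; this is an assumption,
  // since the question does not say how malformed pairs should be handled.
  def pairDifferences(pairs: Vector[List[Long]]): Vector[Long] =
    pairs.collect { case first :: second :: Nil => second - first }

  // Plain Scala collection standing in for the grouped RDD in this sketch.
  val sample: Seq[(String, Vector[List[Long]])] = Seq(
    ("c8:3a:bv:b1:3a:e0",
      Vector(
        List(1472820071L, 1472821088L),
        List(1472821088L, 1472821429L),
        List(1472821429L, 1472824217L)))
  )

  // Keep the key and transform only the value, as mapValues would on a pair RDD.
  val result: Seq[(String, Vector[Long])] =
    sample.map { case (mac, pairs) => (mac, pairDifferences(pairs)) }
}
```

For the sample record, `PairDiffs.result` holds `(c8:3a:bv:b1:3a:e0, Vector(1017, 341, 2788))`. On an actual `RDD[(String, Vector[List[Long]])]`, the same function could likely be applied with `rdd.mapValues(PairDiffs.pairDifferences)`, which leaves the keys untouched and so preserves any existing partitioning.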