Lobsterrrr - 3 months ago
Scala Question

Difference between fields of tail and head sub-lists of the list: java.lang.IndexOutOfBoundsException: 0

I have an RDD[(String,Map[String,List[Product with Serializable]])], such as:

(1566,Map(data1 -> List(List(1469785000, 111, 1, 3, null, 0),List(1469785022, 111, 1, 3, null, 0)), data2 -> List((4,88,1469775603,1,3370,f,537490800,661.09))))


I want to create a new RDD that contains the time difference between the tail and head sub-lists of the data1 list (converted to minutes).

For example, in the data sample above this refers to 1469785022 - 1469785000.

I wrote the following code, but it fails with java.lang.IndexOutOfBoundsException: 0. It seems that tail and head do not work as expected. How can I solve this issue?

val newRDD = currentRDD.map({
  line => Map(("id", line._1),
    ("duration", (line._2.get("data1").get.tail.productElement(0).toString.toLong -
      line._2.get("data1").get.head.productElement(0).toString.toLong) / 60)
  )
})

Answer

The problem is a misunderstanding of what head and tail return.

A List in Scala has a head (its first element) and a tail (the list of all remaining elements). For a list like this:

scala> val list = List(10, 20, 30)

head and tail evaluate as follows:

scala> list.head
res0: Int = 10
scala> list.tail
res1: List[Int] = List(20, 30)
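
To make the distinction concrete, here is a small sketch in plain Scala (no Spark). The object name HeadTailDemo and the values are illustrative only. It also shows one plausible source of the reported exception: in Scala versions where List is itself a Product (which the question's code compiling suggests), productElement(0) can be called directly on tail, and when tail is empty there is no element 0 to return. Whether that is what happens in the question depends on the data, for example a data1 list with a single entry.

```scala
import scala.util.Try

object HeadTailDemo {
  val list = List(10, 20, 30)

  val first: Int = list.head        // the first element
  val rest: List[Int] = list.tail   // a list: everything after the first element
  val second: Int = list.tail.head  // the first element of that remaining list
  val lastElem: Int = list.last     // the final element

  // A single-element list has an empty tail (Nil).
  val single = List(10)
  val emptyTail: List[Int] = single.tail

  // Nil is a case object and therefore a Product, but it has no elements,
  // so asking it for element 0 fails with an IndexOutOfBoundsException.
  val failure: Try[Any] = Try((Nil: Product).productElement(0))
}
```

Note that tail.head and last agree only when the list has exactly two elements.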

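So when your code calls tail, it gets a list (everything after the first element), not a single sub-list. You can fix this by taking the first element of that list with tail.head, which is the last sub-list when data1 has exactly two entries: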

line._2.get("data1").get.tail.head.productElement(0).toString.toLong
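
As a hedged sketch of the whole computation, the following works on plain Scala collections rather than an RDD. It assumes each data1 entry is a Product whose element 0 is an epoch timestamp in seconds, as in the sample. The names DurationSketch and durationMinutes are hypothetical, and tuples stand in for the sub-lists. It uses last instead of tail.head so that lists with more than two entries still compare the final entry with the first, and it returns None for lists with fewer than two entries instead of throwing.

```scala
object DurationSketch {
  // Hypothetical helper: whole minutes between the first and last entries.
  // Assumes element 0 of every entry is an epoch timestamp in seconds.
  def durationMinutes(entries: List[Product]): Option[Long] =
    if (entries.size < 2) None // nothing to compare against
    else {
      val start = entries.head.productElement(0).toString.toLong
      val end = entries.last.productElement(0).toString.toLong
      Some((end - start) / 60) // integer division: fractions of a minute are dropped
    }

  // Sample shaped like the question's data1 entries.
  val data1: List[Product] = List(
    (1469785000L, 111, 1, 3, null, 0),
    (1469785022L, 111, 1, 3, null, 0)
  )
}
```

For the sample above the two timestamps are 22 seconds apart, so the integer division by 60 yields 0 minutes. Dividing by 60.0 instead would keep the fractional part.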