rp3105 - 29 days ago
Scala Question

Split a line in Scala, keeping the first element of the line common to every output row

I am trying to split each line of my data file in the following way. The input looks like this:

1 1#1097#2321#2018
2 12#312#123#1211


So I want the resulting RDD to be:

1 1
1 1097
1 2321
1 2018
2 12
2 312
2 123
2 1211

Answer

Assuming you already have your lines as an RDD, and that the input contains no malformed lines (I wouldn't count on that, so you may want to add some pre-validation or filtering first):

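lines.flatMap { line =>
  // Split "1 1#1097#2321#2018" into the key ("1") and the '#'-separated values.
  val Array(head, other) = line.split(" ")
  // Emit one (key, value) pair per value, repeating the key each time.
  other.split('#').map(o => head -> o)
}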
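
If the input cannot be trusted, one possible way to fold the validation into the same step is to pattern match on the result of the split and skip any line that does not have exactly two space-separated fields. This is only a sketch: the helper name `splitLine` is made up for illustration, and a plain `Seq` stands in for the RDD so that the snippet runs without Spark. An RDD's `flatMap` should accept the same function.

```scala
// Hypothetical helper (not part of the original answer): returns the
// (key, value) pairs for one line, or nothing if the line is malformed.
def splitLine(line: String): Seq[(String, String)] =
  line.trim.split(" ") match {
    // Exactly two fields: the key and the '#'-separated values.
    case Array(head, other) => other.split('#').toSeq.map(o => head -> o)
    // Anything else (empty line, extra spaces, missing values) is skipped.
    case _ => Seq.empty
  }

// Assumption: a local Seq is used here in place of the RDD.
val lines = Seq("1 1#1097#2321#2018", "2 12#312#123#1211", "malformed")
val pairs = lines.flatMap(splitLine)
```

Skipping bad lines silently is a design choice; if you need to know how many were dropped, you could return an `Either` or count them in a separate pass instead.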