Alexey Grigorev Alexey Grigorev - 2 months ago 25
Scala Question

zipWithIndex on Apache Flink

I'd like to assign each row of my input an

id
- which should be a number from
0
to
N - 1
, where
N
is the number of rows in the input.

Roughly, I'd like to be able to do something like the following :

val data = sc.textFile(textFilePath, numPartitions)
val rdd = data.map(line => process(line))
val rddMatrixLike = rdd.zipWithIndex.map { case (v, idx) => someStuffWithIndex(idx, v) }


But in Apache Flink. Is it possible?

Answer

This is now a part of the 0.10-SNAPSHOT release of Apache Flink. Examples for zipWithIndex(in) and zipWithUniqueId(in) are available in the official Flink documentation.