Rohan Nayak - 2 years ago
Scala Question

What is the meaning of reduceByKey(_ ++ _)?

Recently I had a scenario where I needed to store data as key-value pairs, and I came across the function reduceByKey(_ ++ _). This appears to be shorthand syntax, and I am not able to understand what it actually means.

For example, there is reduceByKey(_ + _), but what does reduceByKey(_ ++ _) mean?

I am able to create key-value pairs from my data using reduceByKey(_ ++ _):

val y = sc.textFile("file:///root/My_Spark_learning/reduced.txt")
  .map(value => value.split(","))
  ...
  .reduceByKey(_ ++ _)

(1,List(2, 3, 3, 4))
(4,List(5, 6))
(7,List(8, 9))

Answer Source

reduceByKey(_ ++ _) translates to reduceByKey((a,b) => a ++ b).

++ is a method defined on List that concatenates another list to it.

So, for key 1 in the sample data, a will be List(2,3) and b will be List(3,4). Concatenating them (List(2,3) ++ List(3,4)) yields List(2,3,3,4).
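
The equivalence described above can be checked without a Spark cluster. The following is a minimal sketch in plain Scala, assuming the input rows were already turned into (key, List) pairs. The names ReduceByKeyDemo, reduceByKeyLocal, and pairs are hypothetical and only illustrate the idea. The helper is a local stand-in for Spark's reduceByKey, not Spark's implementation.

```scala
// Plain-Scala sketch; no Spark dependency. All names here are hypothetical.
object ReduceByKeyDemo {
  // Local stand-in for reduceByKey: group the values by key,
  // then combine each group's values pairwise with f.
  def reduceByKeyLocal[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }

  // Assumed shape of the data just before reduceByKey is applied.
  val pairs: Seq[(String, List[Int])] = Seq(
    "1" -> List(2, 3),
    "1" -> List(3, 4),
    "4" -> List(5, 6),
    "7" -> List(8, 9)
  )

  // The placeholder form and the explicit lambda are the same function.
  val shorthand: Map[String, List[Int]] = reduceByKeyLocal(pairs)(_ ++ _)
  val explicit: Map[String, List[Int]] = reduceByKeyLocal(pairs)((a, b) => a ++ b)

  // For comparison: with numeric values, _ + _ adds instead of concatenating.
  val sums: Map[String, Int] = reduceByKeyLocal(Seq("1" -> 2, "1" -> 3))(_ + _)

  def main(args: Array[String]): Unit =
    shorthand.toSeq.sortBy(_._1).foreach(println)
}
```

Running main prints (1,List(2, 3, 3, 4)), (4,List(5, 6)) and (7,List(8, 9)), matching the output in the question. Note that Spark documents reduceByKey as expecting an associative and commutative function. Since ++ is not commutative, the order of elements inside each resulting list may differ on a real cluster depending on how the data is partitioned.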
