Vasco Vasco - 1 year ago 84
Scala Question

scala parallel collections not consistent

I am getting inconsistent answers from the following code which I find odd.

import scala.math.pow

val p = 2
val a = Array(1,2,3)

.aggregate("0")((x, y) => s"$y pow $p; ", (x, y) => x + y))

for (i <- 1 to 100) {
.aggregate(0.0)((x, y) => pow(y, p), (x, y) => x + y) == 14)
} => pow(x,p)).sum

In the code the
a.par ...
. Can anyone provide an explanation for why it is computing inconsistently?

Answer Source

In your "seqop" function, that is the first function you pass to aggregate, you define the logic that is used to combine elements within the same partition. Your function looks like this:

(x, y) => pow(y, p)

The problem is that you don't accumulate the results of a partition. Instead, you throw away your accumulator x. Every time you get 10 as a result, the calculation 2^2 was dropped.

If you change your function to take the accumulated value into account, you will get 14 every time:

(x, y) => x + pow(y, p)