Knows Not Much - 19 days ago
Scala Question

Executing for comprehension in parallel

I have written this code:

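import scala.collection.parallel.{ForkJoinTaskSupport, ParSeq}

def getParallelList[T](list: List[T]): ParSeq[T] = {
  val parList = list.par
  // Run operations on this collection on a dedicated pool of 10 threads
  parList.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(10))
  parList
}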

for {
  a <- getList1
  b <- getList2
  c = b.calculateSomething
  d <- getParallelList(getList3)
} { ... }


Is this a good (or the best) way to make the for loop execute in parallel, or should I explicitly use futures inside the loop?

I tested this and it seemed to work, but I am not sure it is the best way. I am also worried about what happens to the values of a, b and c across the different threads that process d. If one thread finishes earlier, does it change the values of a, b and c for the others?

Answer

If getList3 is referentially transparent, i.e. it returns the same value every time it is called, it is better to build the parallel list once, outside the loop. As written, getParallelList(getList3) is re-evaluated for every (a, b) pair, which creates a new ForkJoinPool each time, and calling .par on a List has to copy it into a ParVector, which takes O(n) time (List is a linked list and cannot be converted to a Vector structure in constant time). Here is an example:

val list3 = getParallelList(getList3)
for {
  a <- getList1
  b <- getList2
  c = b.calculateSomething
  d <- list3
} { ... }

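In the for comprehension, the values of (a, b, c) remain the same while the d values are processed. The comprehension desugars into nested foreach calls, so a, b and c are immutable values bound once per iteration and captured by the innermost closure. Threads working on different d values only read them, and a thread that finishes early cannot change them for the others.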
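To make the point about a, b and c concrete, here is a minimal sketch of roughly what the compiler generates for a loop of this shape. The lists and the object name are hypothetical stand-ins for getList1, getList2 and getList3, and b.length stands in for b.calculateSomething. Plain sequential lists are used so the sketch runs on any Scala version; with a parallel collection only the innermost foreach would be spread across threads.

```scala
import scala.collection.mutable.ListBuffer

object DesugarDemo {
  // Hypothetical stand-ins for getList1, getList2 and getList3
  val list1 = List(1, 2)
  val list2 = List("x", "yy")
  val list3 = List(10, 20, 30)

  // The loop as written with for-comprehension syntax
  def viaFor: List[(Int, String, Int, Int)] = {
    val out = ListBuffer.empty[(Int, String, Int, Int)]
    for {
      a <- list1
      b <- list2
      c = b.length // stand-in for b.calculateSomething
      d <- list3
    } { out += ((a, b, c, d)) }
    out.toList
  }

  // Roughly the same loop after desugaring: a, b and c are
  // closure parameters, fixed for the whole innermost foreach
  def viaDesugared: List[(Int, String, Int, Int)] = {
    val out = ListBuffer.empty[(Int, String, Int, Int)]
    list1.foreach { a =>
      list2.map { b => val c = b.length; (b, c) }.foreach { case (b, c) =>
        list3.foreach { d => out += ((a, b, c, d)) }
      }
    }
    out.toList
  }
}
```

Both methods produce the same tuples in the same order. Note that the ListBuffer here is only safe because the lists are sequential; a loop body that runs in parallel should not append to a shared mutable buffer without synchronization.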

For best performance, you might instead consider making getList1 or getList2 parallel, depending on how evenly the work is split across the a/b/c values.
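As an illustration of parallelizing the outer level, the sketch below starts one task per element of the outer list and runs the inner loops sequentially inside each task. It uses Future from the standard library rather than .par, which is a substitution on my part: it touches on the futures option raised in the question and runs without the separate parallel-collections module that Scala 2.13 and later require for .par. The lists, the pool size of 4, the object name and the arithmetic in the loop body are all assumptions made for the example.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object OuterParallelDemo {
  // Hypothetical stand-ins for getList1, getList2 and getList3
  val list1 = List(1, 2, 3, 4)
  val list2 = List("x", "yy")
  val list3 = List(10, 20, 30)

  def run(): List[Int] = {
    // Dedicated pool; the size is an arbitrary choice for this sketch
    val pool = Executors.newFixedThreadPool(4)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      // One task per outer element a; results keep the order of list1
      val perA: Future[List[Int]] = Future.traverse(list1) { a =>
        Future {
          (for {
            b <- list2
            c = b.length // stand-in for b.calculateSomething
            d <- list3
          } yield a * c * d).sum
        }
      }
      Await.result(perA, 10.seconds)
    } finally pool.shutdown()
  }
}
```

Splitting at the outer level gives each thread a larger unit of work, which tends to pay off when the outer list has enough elements to keep every thread busy. If the outer list is short or its elements vary a lot in cost, splitting at the inner level may balance better.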
