Knows Not Much - 9 months ago
Scala Question

Scala Parallel Collections: How to know and configure the number of threads

I am using scala parallel collections.

val largeList = list.par.map(x => largeComputation(x)).toList

It is blazing fast, but I have a feeling that I may run into out-of-memory issues if too many "largeComputation" calls run in parallel.

Therefore, when testing, I would like to know how many threads the parallel collection is using and, if need be, how to configure the number of threads for parallel collections.


The Scaladoc explains how to change a parallel collection's task support and wrap a ForkJoinPool inside it. When you instantiate the ForkJoinPool, you pass the desired parallelism level as its parameter:

Here is a way to change the task support of a parallel collection:

import scala.collection.parallel._
val pc = mutable.ParArray(1, 2, 3)
pc.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(2))

So in your case it would be:

val largeList = list.par
// numThreads is a placeholder for the desired parallelism level
largeList.tasksupport = new ForkJoinTaskSupport(
  new scala.concurrent.forkjoin.ForkJoinPool(numThreads)
)
largeList.map(x => largeComputation(x)).toList
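
As for finding out how many threads are in use: the task support exposes a parallelismLevel value, which by default usually equals the number of available processors. Below is a minimal sketch of reading it and of capping it with a dedicated pool. It assumes Scala 2.12, where parallel collections are part of the standard library, and it uses java.util.concurrent.ForkJoinPool because scala.concurrent.forkjoin.ForkJoinPool is deprecated from 2.12 on. The names ParallelismSketch and runWithThreads, and the body of largeComputation, are made up for illustration.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport
// On Scala 2.13+ (assumption: the scala-parallel-collections module is on the
// classpath) you would also need:
// import scala.collection.parallel.CollectionConverters._

object ParallelismSketch {
  // Hypothetical stand-in for the question's largeComputation.
  def largeComputation(x: Int): Int = x * x

  // Maps `list` in parallel on a dedicated pool of `numThreads` workers and
  // returns the parallelism level in effect together with the results.
  def runWithThreads(list: List[Int], numThreads: Int): (Int, List[Int]) = {
    val pool = new ForkJoinPool(numThreads)
    try {
      val parList = list.par
      parList.tasksupport = new ForkJoinTaskSupport(pool)
      val level = parList.tasksupport.parallelismLevel
      (level, parList.map(x => largeComputation(x)).toList)
    } finally pool.shutdown() // release the worker threads when done
  }

  def main(args: Array[String]): Unit = {
    val list = (1 to 10).toList
    // Parallelism of the default task support, before any configuration.
    println(s"default: ${list.par.tasksupport.parallelismLevel}")
    val (level, results) = runWithThreads(list, 2)
    println(s"configured: $level, first results: ${results.take(3)}")
  }
}
```

Lowering numThreads limits how many largeComputation calls can run at the same time, which is the knob to try if memory becomes a problem.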