Carter Carter - 3 months ago 12
Scala Question

How to sum up every column of a Scala array?

If I have an array of arrays (similar to a matrix) in Scala, what is an efficient way to sum each column of the matrix? For example, suppose my array of arrays looks like this:

val arr = Array(Array(1, 100, ...), Array(2, 200, ...), Array(3, 300, ...))


and I want to sum each column (i.e., add the first elements of all sub-arrays, then the second elements of all sub-arrays, and so on) to get a new array like this:

newArr = Array(6, 600, ...)


How can I do this efficiently in Spark Scala?

Answer

You can wrap each row in a Breeze Vector and add the rows element-wise:

scala> val arr = Array(Array(1, 100), Array(2, 200), Array(3, 300))
arr: Array[Array[Int]] = Array(Array(1, 100), Array(2, 200), Array(3, 300))

scala> arr.map(breeze.linalg.Vector(_)).reduce(_ + _)
res0: breeze.linalg.Vector[Int] = DenseVector(6, 600)

If your input is sparse, consider using breeze.linalg.SparseVector instead.
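
For comparison, here is a minimal sketch that gets the same column sums with only the Scala standard library, assuming a local (non-distributed) array whose rows all have the same length. The object and method names (ColumnSums, viaTranspose, viaReduce) are made up for illustration and are not part of the original answer:

```scala
object ColumnSums {
  // Transpose so each column becomes a row, then sum each row.
  // Assumes every row has the same length.
  def viaTranspose(rows: Array[Array[Int]]): Array[Int] =
    rows.transpose.map(_.sum)

  // Add the rows pairwise, element by element.
  // Assumes at least one row; zip would silently truncate to the shorter
  // row if the lengths differed.
  def viaReduce(rows: Array[Array[Int]]): Array[Int] =
    rows.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })

  def main(args: Array[String]): Unit = {
    val arr = Array(Array(1, 100), Array(2, 200), Array(3, 300))
    println(viaTranspose(arr).mkString("Array(", ", ", ")")) // Array(6, 600)
    println(viaReduce(arr).mkString("Array(", ", ", ")"))    // Array(6, 600)
  }
}
```

The reduce form mirrors the Breeze one-liner above (combine two rows into one, repeatedly), so the same combining function could probably be passed to a reduce over distributed rows as well. The transpose form is shorter but builds an intermediate transposed copy of the whole matrix.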
