Martin - 1 month ago 12

Python Question

I have a sparse csc matrix with many zero elements for which I would like to compute the product of all column elements for each row.

i.e.:

`A = [[1,2,0,0],`

[2,0,3,0]]

should be converted to:

`V = [[2,`

6]]

Using a numpy dense matrix this can be accomplished by replacing all zero values with one values and using

`A.prod(1)`

Is there any way to accomplish this without converting the sparse matrix into a dense one?

Answer

**Approach #1:** We can use the row indices of the sparse elements as IDs and perform multiplication of the corresponding values of those elements with `np.multiply.reduceat`

to get the desired output.

Thus, an implementation would be -

```
from scipy import sparse
from scipy.sparse import csc_matrix
r,c,v = sparse.find(a) # a is input sparse matrix
out = np.zeros(a.shape[0],dtype=a.dtype)
unqr, shift_idx = np.unique(r,return_index=1)
out[unqr] = np.multiply.reduceat(v, shift_idx)
```

Sample run -

```
In [89]: # Let's create a sample csc_matrix
...: A = np.array([[-1,2,0,0],[0,0,0,0],[2,0,3,0],[4,5,6,0],[1,9,0,2]])
...: a = csc_matrix(A)
...:
In [90]: a
Out[90]:
<5x4 sparse matrix of type '<type 'numpy.int64'>'
with 10 stored elements in Compressed Sparse Column format>
In [91]: a.toarray()
Out[91]:
array([[-1, 2, 0, 0],
[ 0, 0, 0, 0],
[ 2, 0, 3, 0],
[ 4, 5, 6, 0],
[ 1, 9, 0, 2]])
In [92]: out
Out[92]: array([ -2, 0, 6, 120, 0, 18])
```

**Approach #2:** We are performing bin-based multiplication. We have bin-based summing solution with `np.bincount`

. So, a trick that could be use here would be converting the numbers to logarithmic numbers, perform bin-based summing and then convert back to original format with `exponential`

(reverse of log) and that's it! For negative numbers, we might to add a step or more, but let's see what the implementation be like for non-negative numbers -

```
r,c,v = sparse.find(a)
out = np.exp(np.bincount(r,np.log(v),minlength = a.shape[0]))
out[np.setdiff1d(np.arange(a.shape[0]),r)] = 0
```

A sample run with non-negative numbers -

```
In [118]: a.toarray()
Out[118]:
array([[1, 2, 0, 0],
[0, 0, 0, 0],
[2, 0, 3, 0],
[4, 5, 6, 0],
[1, 9, 0, 2]])
In [120]: out # Using listed code
Out[120]: array([ 2., 0., 6., 120., 18.])
```