Martin - 1 year ago 66
Python Question

# Multiplying column elements of sparse Matrix

I have a sparse csc matrix with many zero elements for which I would like to compute the product of all column elements for each row.

i.e.:

`````` A = [[1,2,0,0],
[2,0,3,0]]
``````

should be converted to:

``````V = [[2,
6]]
``````

Using a numpy dense matrix this can be accomplished by replacing all zero values with one values and using
`A.prod(1)`
. This is however not a option since the dense matrix would be too large.

Is there any way to accomplish this without converting the sparse matrix into a dense one?

Approach #1: We can use the row indices of the sparse elements as IDs and perform multiplication of the corresponding values of those elements with `np.multiply.reduceat` to get the desired output.

Thus, an implementation would be -

``````from scipy import sparse
from scipy.sparse import csc_matrix

r,c,v = sparse.find(a) # a is input sparse matrix
out = np.zeros(a.shape[0],dtype=a.dtype)
unqr, shift_idx = np.unique(r,return_index=1)
out[unqr] = np.multiply.reduceat(v, shift_idx)
``````

Sample run -

``````In [89]: # Let's create a sample csc_matrix
...: A = np.array([[-1,2,0,0],[0,0,0,0],[2,0,3,0],[4,5,6,0],[1,9,0,2]])
...: a = csc_matrix(A)
...:

In [90]: a
Out[90]:
<5x4 sparse matrix of type '<type 'numpy.int64'>'
with 10 stored elements in Compressed Sparse Column format>

In [91]: a.toarray()
Out[91]:
array([[-1,  2,  0,  0],
[ 0,  0,  0,  0],
[ 2,  0,  3,  0],
[ 4,  5,  6,  0],
[ 1,  9,  0,  2]])

In [92]: out
Out[92]: array([ -2,   0,   6, 120,   0,  18])
``````

Approach #2: We are performing bin-based multiplication. We have bin-based summing solution with `np.bincount`. So, a trick that could be use here would be converting the numbers to logarithmic numbers, perform bin-based summing and then convert back to original format with `exponential` (reverse of log) and that's it! For negative numbers, we might to add a step or more, but let's see what the implementation be like for non-negative numbers -

``````r,c,v = sparse.find(a)
out = np.exp(np.bincount(r,np.log(v),minlength = a.shape[0]))
out[np.setdiff1d(np.arange(a.shape[0]),r)] = 0
``````

A sample run with non-negative numbers -

``````In [118]: a.toarray()
Out[118]:
array([[1, 2, 0, 0],
[0, 0, 0, 0],
[2, 0, 3, 0],
[4, 5, 6, 0],
[1, 9, 0, 2]])

In [120]: out  # Using listed code
Out[120]: array([   2.,    0.,    6.,  120.,   18.])
``````