piRSquared - 1 year ago 66

Python Question

Consider the pandas series `s`:

`n = 1000`

s = pd.Series([0] * n + [1] * n, dtype=int)

s.memory_usage()

8080

I can "sparsify" this by using `to_sparse`:

`s.to_sparse(fill_value=0).memory_usage()`

4080

But I only have two distinct integer values, so I'd think I could sparsify twice. Is there a way to do this?


Answer Source

Since you tagged this with `scipy`, I'll show you what a `scipy.sparse` matrix is like:

```
# assumes `import numpy as np` and `from scipy import sparse` were run earlier in the session
In [31]: n=100
In [32]: arr=np.array([[0]*n+[1]*n],int)
In [33]: M=sparse.csr_matrix(arr)
In [34]: M.data
Out[34]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
In [35]: M.indices
Out[35]:
array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125,
126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138,
139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151,
152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164,
165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177,
178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190,
191, 192, 193, 194, 195, 196, 197, 198, 199], dtype=int32)
In [36]: M.indptr
Out[36]: array([ 0, 100], dtype=int32)
```

It has replaced the `2*n` elements of `arr` with 2 arrays, each with `n` elements (plus the short `indptr` array). Even if I replace the `int` with `uint8`, the `M.indices` array will still be `int32`.
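To put rough numbers on that, here is a minimal sketch that compares the bytes held by the dense array with the bytes held by the three CSR arrays when the data is `uint8`. The names `dense_bytes` and `sparse_bytes` are only illustrative, and the exact index dtype is whatever `scipy` picks on your install.

```python
import numpy as np
from scipy import sparse

n = 100
# Dense row vector: n zeros followed by n ones, one byte per element.
arr = np.array([[0] * n + [1] * n], dtype=np.uint8)
M = sparse.csr_matrix(arr)

# Bytes used by the dense array versus the three arrays backing the CSR matrix.
dense_bytes = arr.nbytes
sparse_bytes = M.data.nbytes + M.indices.nbytes + M.indptr.nbytes

print("data dtype:", M.data.dtype, "| indices dtype:", M.indices.dtype)
print("dense bytes:", dense_bytes, "| sparse bytes:", sparse_bytes)
```

With half of the entries nonzero and one-byte values, the index array alone should outweigh the dense data, so the CSR form is not a saving here.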

The fact that your `pandas` version has half the memory usage suggests that it is just storing the indices, and somehow noting that the `data` part is all 1s. But that's just a guess.
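One way to check a guess like this is to look at what the sparse container actually holds. The sketch below is not what the question ran: `Series.to_sparse` was removed in pandas 1.0, so it assumes a newer pandas and uses `pd.SparseDtype` and `pd.arrays.SparseArray` instead, with `int64` chosen explicitly. The `kind="block"` comparison at the end is my own addition to show how a run-based index behaves on this data.

```python
import numpy as np
import pandas as pd

n = 1000
values = np.array([0] * n + [1] * n, dtype=np.int64)
s = pd.Series(values)

# Newer-pandas stand-in for s.to_sparse(fill_value=0) (an assumption, see above).
sp = s.astype(pd.SparseDtype(np.int64, fill_value=0))

# What is stored: only the non-fill values, plus an index locating them.
print("stored points:", sp.sparse.npoints)
print("density:", sp.sparse.density)
print("distinct stored values:", np.unique(sp.sparse.sp_values))
print("dense vs sparse memory_usage:", s.memory_usage(), sp.memory_usage())

# The index can record every position ("integer") or runs of positions ("block").
int_arr = pd.arrays.SparseArray(values, fill_value=0, kind="integer")
blk_arr = pd.arrays.SparseArray(values, fill_value=0, kind="block")
print("integer-index nbytes:", int_arr.nbytes, "| block-index nbytes:", blk_arr.nbytes)
```

If the stored values still take up `n` slots even though they are all 1, then the container is keeping the data and compressing only the positions, which is the opposite of the guess above.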

How much greater sparsification do you expect?
