Erogol - 1 year ago 164

Python Question

Given a DataFrame, I would like to compute the number of zeros in each row. How can I do this with pandas?

This is what I have done so far; it only returns the positions of the zeros:

    def is_blank(x):
        return x == 0

    indexer = train_df.applymap(is_blank)

Answer Source

Use a boolean comparison, which produces a boolean DataFrame. We can then cast this to `int` (`True` becomes 1, `False` becomes 0) and call `sum` with `axis=1` to add up the values row-wise:

```
In [56]:
import pandas as pd

df = pd.DataFrame({'a':[1,0,0,1,3], 'b':[0,0,1,0,1], 'c':[0,0,0,0,0]})
df
Out[56]:
   a  b  c
0  1  0  0
1  0  0  0
2  0  1  0
3  1  0  0
4  3  1  0

In [64]:
(df == 0).astype(int).sum(axis=1)
Out[64]:
0    2
1    3
2    2
3    2
4    1
dtype: int64
```

Breaking the above down:

```
In [65]:
(df == 0)
Out[65]:
       a      b     c
0  False   True  True
1   True   True  True
2   True  False  True
3  False   True  True
4  False  False  True

In [66]:
(df == 0).astype(int)
Out[66]:
   a  b  c
0  0  1  1
1  1  1  1
2  1  0  1
3  0  1  1
4  0  0  1
```
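The last step of the breakdown is the row-wise reduction over that 0/1 frame. The sketch below rebuilds the same example frame and, as a contrast, also calls `count`, which reports the number of non-missing cells per row rather than the number of zeros. The names `as_int`, `zeros_per_row` and `cells_per_row` are only illustrative.

```python
import pandas as pd

# Same example frame as in the session above.
df = pd.DataFrame({'a': [1, 0, 0, 1, 3],
                   'b': [0, 0, 1, 0, 1],
                   'c': [0, 0, 0, 0, 0]})

# 1 where the cell equals zero, 0 elsewhere.
as_int = (df == 0).astype(int)

# sum adds the 1s across each row, which gives the zero count per row.
zeros_per_row = as_int.sum(axis=1)

# count only tallies non-missing cells, so every row gives 3 here.
cells_per_row = as_int.count(axis=1)

print(zeros_per_row.tolist())   # [2, 3, 2, 2, 1]
print(cells_per_row.tolist())   # [3, 3, 3, 3, 3]
```

So `sum` is the reduction that yields the zero count here; `count` would return the row width whenever no values are missing.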

**EDIT**

As pointed out by david, the `astype` to `int` is unnecessary, because the boolean values are upcast to `int` when calling `sum`, so this simplifies to:

```
(df == 0).sum(axis=1)
```
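For reuse, the simplified expression can be wrapped in a small helper. This is a minimal sketch: the function name `count_zeros_per_row` is hypothetical (it is not part of pandas), and the example frame is the one used earlier in this answer.

```python
import pandas as pd


def count_zeros_per_row(frame):
    """Return a Series holding the number of cells equal to 0 in each row."""
    # (frame == 0) is a boolean frame; sum treats True as 1 and False as 0.
    return (frame == 0).sum(axis=1)


df = pd.DataFrame({'a': [1, 0, 0, 1, 3],
                   'b': [0, 0, 1, 0, 1],
                   'c': [0, 0, 0, 0, 0]})

short_form = count_zeros_per_row(df)
long_form = (df == 0).astype(int).sum(axis=1)

# Both forms agree on every row.
print((short_form == long_form).all())
print(short_form.tolist())
```

Applied to the frame from the question, this would be `count_zeros_per_row(train_df)`. Missing values (`NaN`) compare unequal to 0, so they are not counted as zeros.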