ShanZhengYang - 10 months ago
Python Question

How to deal with "divide by zero" with pandas dataframes when manipulating columns?

I'm working with hundreds of pandas dataframes. A typical dataframe is as follows:

import pandas as pd
import numpy as np
data = 'filename.csv'
df = pd.read_csv(data)

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569 -0.494929  1.071804    bar  False

There are certain operations where I divide one column's values by another's, e.g.

df['one'] / df['two']

However, there are times when I am dividing by zero, or when both values are zero:

df['one'] = 0
df['two'] = 0

Naturally, this outputs the error:

ZeroDivisionError: division by zero

I would prefer for 0/0 to actually mean "there's nothing here", as this is often what such a zero means in a dataframe.

(a) How would I code this so that "divide by zero" gives 0?

(b) How would I code this to "pass" if divide by zero is encountered?


Two approaches to consider:

Prepare your data so that it never has a divide by zero situation, by explicitly coding a "no data" value and testing for that.

Wrap each division that might result in an error in a try/except pair, as in this divide-by-zero example:

(x, y) = (5, 0)
try:
    # Attempt the division; Python raises ZeroDivisionError when y is 0.
    z = x / y
except ZeroDivisionError:
    print("divide by zero")
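If you need this in many places, the try/except can be folded into a small helper. This is only a sketch: the name safe_divide and its default parameter are made up for illustration, and it assumes plain Python numbers.

```python
import math


def safe_divide(numerator, denominator, default=math.nan):
    """Return numerator / denominator, or `default` on a zero denominator.

    Hypothetical helper, not part of pandas or the standard library.
    """
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return default


# For (a): pass default=0 so that a divide by zero comes back as 0.
pairs = [(5, 0), (0, 0), (6, 3)]
results = [safe_divide(x, y, default=0) for x, y in pairs]
print(results)
```

One caveat: NumPy floats do not raise ZeroDivisionError on division by zero (they return inf or nan, usually with a warning), so the except branch only fires for built-in Python numbers.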

I worry about the situation where your data includes a zero that's really a zero (and not a missing value).
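One way to sketch the first approach while keeping real zeros intact, assuming a zero in the denominator column is what stands for "no data": turn those zeros into NaN before dividing, and keep a separate flag for the rows that ended up missing. The sample frame and the column names ratio, ratio_missing and ratio_or_zero are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; 'one' and 'two' mirror the question's column names.
df = pd.DataFrame({"one": [0.0, 0.0, 4.0], "two": [5.0, 0.0, 2.0]})

# Explicit "no data" coding: Series.where keeps values where the condition
# holds and puts NaN everywhere else, so zero denominators become NaN.
denominator = df["two"].where(df["two"] != 0)

# NaN propagates through the division, so nothing is divided by zero.
df["ratio"] = df["one"] / denominator

# Flag the missing rows so a genuine 0.0 result stays distinguishable.
df["ratio_missing"] = df["ratio"].isna()

# For (a): report "nothing here" as 0, at the cost of losing that distinction.
df["ratio_or_zero"] = df["ratio"].fillna(0)

print(df)
```

Here the first row (0 divided by 5) is a real zero and stays 0.0 with ratio_missing False, while the second row (0 divided by 0) is NaN in ratio and only becomes 0 in ratio_or_zero.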