Mannaggia Mannaggia - 1 month ago 22
Python Question

What's the inverse of the quantile function on a pandas Series?

The quantile function gives us the quantile of a given pandas Series s.

E.g.


s.quantile(0.9) is 4.2


Is there an inverse function (i.e. the cumulative distribution) that finds the value x such that


s.quantile(x)=4


Thanks

Answer

There's no one-liner that I know of, but you can achieve this with scipy:

import pandas as pd
import numpy as np
from scipy.interpolate import interp1d

# set up a sample dataframe
df = pd.DataFrame(np.random.uniform(0,1,(11)), columns=['a'])
# sort by the desired series and calculate the percentile
# (df.sort was removed from pandas; sort_values is the replacement)
sdf = df.sort_values('a').reset_index()
sdf['b'] = sdf.index / float(len(sdf) - 1)
# set up the interpolator, using the value as x and the percentile as y
interp = interp1d(sdf['a'], sdf['b'])

# a is the value, b is the quantile level (a fraction between 0 and 1)
>>> sdf
    index         a    b
0      10  0.030469  0.0
1       3  0.144445  0.1
2       4  0.304763  0.2
3       1  0.359589  0.3
4       7  0.385524  0.4
5       5  0.538959  0.5
6       8  0.642845  0.6
7       6  0.667710  0.7
8       9  0.733504  0.8
9       2  0.905646  0.9
10      0  0.961936  1.0

Now we can see that the two functions are inverses of each other.

>>> df['a'].quantile(0.57)
0.61167933268395969
>>> interp(0.61167933268395969)
array(0.57)
>>> interp(df['a'].quantile(0.43))
array(0.43)
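
One caveat with the interpolator above: as far as I know, interp1d with its default settings raises a ValueError for a value outside the range of the data. A minimal sketch of one way around that, assuming you want values below the minimum to map to 0 and values above the maximum to map to 1 (the sample numbers and the name clamped are made up for illustration):

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical sorted sample values and their quantile levels
values = np.array([0.1, 0.3, 0.4, 0.8])
probs = np.linspace(0.0, 1.0, len(values))

# bounds_error=False turns off the out-of-range error;
# fill_value=(below, above) supplies the results to use outside the data range
clamped = interp1d(values, probs, bounds_error=False, fill_value=(0.0, 1.0))

below = float(clamped(0.05))   # smaller than every sample
above = float(clamped(0.95))   # larger than every sample
inside = float(clamped(0.2))   # halfway between the first two samples
```

Whether clamping to 0 and 1 is the right behaviour depends on what you do with the result; raising an error may be preferable if out-of-range inputs indicate a bug.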

interp can also take a list, a NumPy array, or a pandas Series; any array-like input works.
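
If you'd rather not depend on scipy, the same idea can be sketched with numpy alone. The helper name inverse_quantile is my own, not a pandas or numpy function, and it assumes the default linear interpolation used by Series.quantile and a series without repeated values (with ties the inverse is not unique):

```python
import numpy as np
import pandas as pd

def inverse_quantile(s, values):
    """Approximate q such that s.quantile(q) == value, for each value.

    Hypothetical helper: assumes linear interpolation between sorted
    sample points, matching the default of Series.quantile.
    """
    sorted_vals = np.sort(s.dropna().to_numpy())
    # Quantile level of each sorted sample: 0 for the min, 1 for the max
    probs = np.linspace(0.0, 1.0, len(sorted_vals))
    return np.interp(values, sorted_vals, probs)

s = pd.Series([3.0, 1.0, 4.0, 2.0, 5.0])

# Round trip through quantile and back
q = inverse_quantile(s, s.quantile(0.57))

# A list of values works as well as a scalar
many = inverse_quantile(s, [1.0, 3.0, 5.0])
```

Here q comes back as 0.57 up to floating-point error, and many holds the quantile levels of the minimum, median, and maximum.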