luca - 3 months ago 17

Python Question

I have a DataFrame with some columns. I'd like to add a new column where each row value is the quantile rank of one existing column.

I can use DataFrame.rank to rank a column, but I don't know how to turn the ranked values into quantile numbers and add them as a new column.

Example: if this is my DataFrame

`df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]), columns=['a', 'b'])`

a b

0 1 1

1 2 10

2 3 100

3 4 100

and I'd like to know the quantile number (using 2 quantiles) of column b. I'd expect this result:

a b quantile

0 1 1 1

1 2 10 1

2 3 100 2

3 4 100 2

Answer

I discovered it is quite easy:

```
# labels=False makes qcut return integer bin codes, numbered from 0
df['quantile'] = pd.qcut(df['b'], 2, labels=False)
print(df)
#    a    b  quantile
# 0  1    1         0
# 1  2   10         0
# 2  3  100         1
# 3  4  100         1
```
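Note that `labels=False` numbers the bins from 0, while the expected result in the question numbers them from 1. A minimal sketch of two ways to get 1-based quantile numbers, using the question's DataFrame: adding 1 to the `qcut` codes, or (as an alternative not from the original answer) scaling a percentile rank from `Series.rank(pct=True)`. The column names `quantile` and `quantile_from_rank` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]), columns=['a', 'b'])
n_quantiles = 2

# Option 1: shift the 0-based qcut bin codes to 1-based quantile numbers.
df['quantile'] = pd.qcut(df['b'], n_quantiles, labels=False) + 1

# Option 2 (assumption: an alternative built on rank, as hinted in the question):
# rank(pct=True) gives a percentile rank in (0, 1]; scale it and round up.
pct_rank = df['b'].rank(pct=True)
df['quantile_from_rank'] = np.ceil(pct_rank * n_quantiles).astype(int)

print(df)
```

The two options agree on this data, but they can differ when there are many ties, because `qcut` derives bin edges from the values while the rank approach works on tie-averaged ranks.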

It is also worth knowing the difference between pandas.qcut and pandas.cut.
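As a rough illustration of that difference: `pd.cut` splits the value range into equal-width bins, while `pd.qcut` picks bin edges from the quantiles so that each bin holds roughly the same number of rows. The sketch below uses made-up values (not the question's data, where both functions happen to give the same result) so that the two disagree.

```python
import pandas as pd

# Hypothetical sample with one large outlier.
values = pd.Series([1, 2, 3, 100])

# Equal-width bins over the range 1..100: the split falls near 50.5,
# so only the outlier lands in the second bin.
equal_width = pd.cut(values, 2, labels=False)

# Quantile-based bins: the split falls at the median (2.5),
# so each bin receives two values.
equal_count = pd.qcut(values, 2, labels=False)

print(pd.DataFrame({'value': values, 'cut': equal_width, 'qcut': equal_count}))
```

For a quantile rank, `qcut` is usually the one you want; `cut` is more appropriate when the bin boundaries themselves should be evenly spaced.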

Source (Stackoverflow)
