ahajib - 1 year ago 220

Python Question

I have a pandas data frame in which each column has a different value range. For example:

df:

`A B C`

1000 10 0.5

765 5 0.35

800 7 0.09

Any idea how I can normalize the columns of this data frame so that each value is between 0 and 1?

My desired output is:

`A B C`

1 1 1

0.765 0.5 0.7

0.8 0.7 0.18 (which is 0.09/0.5)

Answer Source

You can use the sklearn package and its preprocessing utilities to normalize the data. `MinMaxScaler` rescales each column so that its minimum maps to 0 and its maximum maps to 1.

```
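import pandas as pd
from sklearn import preprocessing

x = df.values  # returns a NumPy array
min_max_scaler = preprocessing.MinMaxScaler()
# Scale each column to [0, 1]: column min -> 0, column max -> 1
x_scaled = min_max_scaler.fit_transform(x)
# Rebuild the frame, keeping the original column names and index
df = pd.DataFrame(x_scaled, columns=df.columns, index=df.index)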
```
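To see what min-max scaling does to the data in the question, here is a minimal pandas-only sketch. It assumes the per-column formula (x - min) / (max - min), which is what `MinMaxScaler` computes with its default `feature_range=(0, 1)`. The name `min_max_scaled` is only illustrative.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({"A": [1000, 765, 800], "B": [10, 5, 7], "C": [0.5, 0.35, 0.09]})

# Column-wise min-max scaling: subtract each column's minimum,
# then divide by that column's range (max - min)
min_max_scaled = (df - df.min()) / (df.max() - df.min())
print(min_max_scaled)
```

Note that the smallest value in every column becomes 0 here, so the result is not the same as the desired output shown in the question.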

For more information, see the documentation: http://scikit-learn.org/stable/modules/preprocessing.html#scaling-features-to-a-range
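The desired output in the question appears to divide each value by its column's maximum (for example 0.09/0.5 = 0.18), which is not what `MinMaxScaler` computes. If that is what you want, a sketch along these lines may be closer. It assumes all values are non-negative, and `max_scaled` is just an illustrative name.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({"A": [1000, 765, 800], "B": [10, 5, 7], "C": [0.5, 0.35, 0.09]})

# Divide every column by its own maximum, so the largest value becomes 1
max_scaled = df / df.max()
print(max_scaled)
```

Up to floating-point rounding, this reproduces the rows 1 1 1, 0.765 0.5 0.7 and 0.8 0.7 0.18 from the question. For non-negative data, scikit-learn's `MaxAbsScaler` should give the same result, since it divides each column by its maximum absolute value.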