alphanumeric - 1 month ago
Python Question

How to classify the numbers by value in DataFrame

With:

import pandas as pd
df = pd.DataFrame({'a':[1,2,3,4,5,12,14,121,131,298,299,1001]})
print(df.a.mean())


returns an average of all the numbers:

157.583333333


More than half of the numbers are smaller than 100. I wonder if there is a way to break the numbers into categories (essentially classifying them). I would specify the number of groups to classify the numbers into, and the function would return a list where each number is replaced with the index of its category. So the numbers smaller than 100 would be given the integer category 1, the numbers from 100 to 200 would be given category 2, and so on. Essentially, some kind of rounding function that would round the numbers so that they all fall into ranges of values: from 0 to 100, from 100.1 to 200.0, and so on.

Answer
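import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4,5,12,14,121,131,298,299,1001]})
# Floor-divide by the bin width (100) and add 1 so that categories start at 1.
# Note that exactly 100 lands in category 2, because 100 // 100 == 1.
df['category'] = df['a'] // 100 + 1
print(df[['a', 'category']])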

       a  category
0      1         1
1      2         1
2      3         1
3      4         1
4      5         1
5     12         1
6     14         1
7    121         2
8    131         2
9    298         3
10   299         3
11  1001        11
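
The question asks for ranges such as 0 to 100 and 100.1 to 200.0, and also for a way to specify the number of groups. As a rough sketch of one alternative to floor division, pandas' `pd.cut` can bin on explicit right-closed edges, and `pd.qcut` can split the values into a chosen number of groups with roughly equal counts. The names `width`, `edges`, `n_groups` and `quantile_group` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 12, 14, 121, 131, 298, 299, 1001]})

# Fixed-width bins, closed on the right: (0, 100], (100, 200], ...
width = 100  # assumed bin width, taken from the ranges in the question
upper = int(np.ceil(df['a'].max() / width)) * width
edges = np.arange(0, upper + width, width)

# labels=False returns the zero-based bin index; add 1 so categories start at 1.
# Values of 0 or below would fall outside the first bin and become NaN.
df['category'] = pd.cut(df['a'], bins=edges, labels=False) + 1

# Alternative: ask for a number of groups and let pandas pick the edges
# from the quantiles of the data.
n_groups = 3  # assumed group count, chosen only for illustration
df['quantile_group'] = pd.qcut(df['a'], q=n_groups, labels=False) + 1

print(df)
```

With these edges a value of exactly 100 would land in category 1, whereas `df['a'] // 100 + 1` puts it in category 2, so the choice depends on which side of the boundary should be inclusive.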