Dheeraj Dheeraj - 3 days ago 6
Python Question

Easy Pythonic way to classify columns into groups and store them in a dictionary?

Machine_number Machine_Running_Hours
0 1.0 424.0
1 2.0 458.0
2 3.0 465.0
3 4.0 446.0
4 5.0 466.0
5 6.0 466.0
6 7.0 445.0
7 8.0 466.0
8 9.0 447.0
9 10.0 469.0
10 11.0 467.0
11 12.0 449.0
12 13.0 436.0
13 14.0 465.0
14 15.0 463.0
15 16.0 372.0
16 17.0 460.0
17 18.0 450.0
18 19.0 467.0
19 20.0 463.0
20 21.0 205.0


I am trying to group the rows by machine number: Machine_number 1 to 5 should be one group, 6 to 10 the next group, and so on.

I am new to Python, so I need some basic help. Thanks.

Answer

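I think you need to subtract 1 with sub and then apply integer division by 5 with floordiv, so that machines 1 to 5 map to group 0, 6 to 10 to group 1, and so on: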

df['g'] = df.Machine_number.sub(1).floordiv(5)
# floordiv(5) is the same as using the // operator:
# df['g'] = df.Machine_number.sub(1) // 5
print (df)
    Machine_number  Machine_Running_Hours    g
0              1.0                  424.0 -0.0
1              2.0                  458.0  0.0
2              3.0                  465.0  0.0
3              4.0                  446.0  0.0
4              5.0                  466.0  0.0
5              6.0                  466.0  1.0
6              7.0                  445.0  1.0
7              8.0                  466.0  1.0
8              9.0                  447.0  1.0
9             10.0                  469.0  1.0
10            11.0                  467.0  2.0
11            12.0                  449.0  2.0
12            13.0                  436.0  2.0
13            14.0                  465.0  2.0
14            15.0                  463.0  2.0
15            16.0                  372.0  3.0
16            17.0                  460.0  3.0
17            18.0                  450.0  3.0
18            19.0                  467.0  3.0
19            20.0                  463.0  3.0
20            21.0                  205.0  4.0
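
Because Machine_number is stored as float, the group labels above are floats too. A minimal self-contained sketch, assuming the data is rebuilt by hand from the question and that integer labels are preferred: casting to int before the arithmetic gives labels 0, 1, 2, ... The name group_sizes is only illustrative.

```python
import pandas as pd

# Rebuild the question's data: machines 1..21, stored as floats as in the post
df = pd.DataFrame({
    "Machine_number": [float(n) for n in range(1, 22)],
    "Machine_Running_Hours": [424.0, 458.0, 465.0, 446.0, 466.0, 466.0, 445.0,
                              466.0, 447.0, 469.0, 467.0, 449.0, 436.0, 465.0,
                              463.0, 372.0, 460.0, 450.0, 467.0, 463.0, 205.0],
})

# Cast to int first so the group labels are integers rather than floats
df["g"] = df["Machine_number"].astype(int).sub(1).floordiv(5)

# Count how many machines fall in each group (illustrative check)
group_sizes = df["g"].value_counts().sort_index().to_dict()
print(group_sizes)
```

With 21 machines and groups of 5, the last group holds only machine 21.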

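If you need to store the groups in a dictionary, use groupby with a dict comprehension. Each key is a group number and each value is the DataFrame holding that group's rows: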

dfs = {i:g for i, g in df.groupby(df.Machine_number.astype(int).sub(1).floordiv(5))}
print (dfs)
{0:    Machine_number  Machine_Running_Hours
0             1.0                  424.0
1             2.0                  458.0
2             3.0                  465.0
3             4.0                  446.0
4             5.0                  466.0, 1:    Machine_number  Machine_Running_Hours
5             6.0                  466.0
6             7.0                  445.0
7             8.0                  466.0
8             9.0                  447.0
9            10.0                  469.0, 2:     Machine_number  Machine_Running_Hours
10            11.0                  467.0
11            12.0                  449.0
12            13.0                  436.0
13            14.0                  465.0
14            15.0                  463.0, 3:     Machine_number  Machine_Running_Hours
15            16.0                  372.0
16            17.0                  460.0
17            18.0                  450.0
18            19.0                  467.0
19            20.0                  463.0, 4:     Machine_number  Machine_Running_Hours
20            21.0                  205.0}
print (dfs[0])
   Machine_number  Machine_Running_Hours
0             1.0                  424.0
1             2.0                  458.0
2             3.0                  465.0
3             4.0                  446.0
4             5.0                  466.0

print (dfs[1])
   Machine_number  Machine_Running_Hours
5             6.0                  466.0
6             7.0                  445.0
7             8.0                  466.0
8             9.0                  447.0
9            10.0                  469.0
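
If range-style keys such as "1-5" are easier to read than 0, 1, 2, ..., the same dictionary can be re-keyed. This is only a sketch under assumptions: the helper group_label and the variables size, keys and labelled are hypothetical names introduced here, and the data is rebuilt from the question.

```python
import pandas as pd

# Rebuild the question's data: machines 1..21, stored as floats as in the post
df = pd.DataFrame({
    "Machine_number": [float(n) for n in range(1, 22)],
    "Machine_Running_Hours": [424.0, 458.0, 465.0, 446.0, 466.0, 466.0, 445.0,
                              466.0, 447.0, 469.0, 467.0, 449.0, 436.0, 465.0,
                              463.0, 372.0, 460.0, 450.0, 467.0, 463.0, 205.0],
})

size = 5  # machines per group, as described in the question

# Group number for every row: machines 1-5 -> 0, 6-10 -> 1, ...
keys = df["Machine_number"].astype(int).sub(1).floordiv(size)

# Dictionary of group number -> DataFrame with that group's rows
dfs = {int(i): g for i, g in df.groupby(keys)}


def group_label(i, size=size):
    """Hypothetical helper: turn group number i into a label like '1-5'."""
    return f"{i * size + 1}-{(i + 1) * size}"


# Same groups, keyed by a readable machine-number range
labelled = {group_label(i): g for i, g in dfs.items()}
print(list(labelled))
print(labelled["6-10"])
```

The label of the last group describes the nominal range (21 to 25), even though only machine 21 is present in the data.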