Bear Brown - 27 days ago
Python Question

pandas: exclude total rows when they have detail rows

I have a DataFrame like this:

import pandas as pd
df = pd.DataFrame({'Art': [210, 211, 212, 310, 420, 421], 'Sum': [300, 120, 180, 250, 650, 650]})


In table view:

   Art  Sum
0  210  300  # this is a total
1  211  120  # child of index 0
2  212  180  # child of index 0
3  310  250  # !!! this is NOT a total
4  420  650  # this is a total
5  421  650  # child of index 4


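A total line is a line where Art ends in 0 and other lines (its children) start with the same two digits. A line where Art ends in 0 but no other line starts with the same two digits is not a total.

Art 210 has children: 211, 212.

Art 310 has no children: no other line starts with 31.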

Issue: I need to remove the total lines.

Desired result:

   Art  Sum
1  211  120
2  212  180
3  310  250  # !! this is NOT a total
5  421  650


How can I do this?

Answer

You can bucket the Art column by its first two digits (Art // 10) and keep a row if its bucket contains only that row or if its Art does not end in 0:

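# Number of rows sharing each two-digit prefix (210 -> 21, 310 -> 31, ...)
buckets = (df['Art'] // 10).value_counts()
# Keep rows that are alone in their bucket, or whose Art does not end in 0
df = df.loc[(df['Art'] // 10).isin(buckets.loc[buckets == 1].index) |
            (df['Art'] % 10 != 0)]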

Which outputs:

   Art  Sum
1  211  120
2  212  180
3  310  250
5  421  650
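
The same rule can also be written as a small reusable function, which may be easier to read if the filter is needed in more than one place. This is only a sketch: `drop_totals` is a hypothetical helper name, not part of pandas or of the answer above, and it assumes that Art holds three-digit integer codes where the last digit marks the level, as in the question.

```python
import pandas as pd


def drop_totals(df: pd.DataFrame, col: str = "Art") -> pd.DataFrame:
    """Drop rows whose code ends in 0 when other rows share the same prefix.

    Hypothetical helper; assumes `col` holds integer codes such as 210, 211.
    """
    prefix = df[col] // 10                         # 210 -> 21, 310 -> 31
    group_size = prefix.map(prefix.value_counts())  # rows per prefix, aligned to df
    is_total = (df[col] % 10 == 0) & (group_size > 1)
    return df.loc[~is_total]


df = pd.DataFrame({'Art': [210, 211, 212, 310, 420, 421],
                   'Sum': [300, 120, 180, 250, 650, 650]})
result = drop_totals(df)
print(result)
```

Unlike the snippet above, this version builds an explicit `is_total` mask and negates it, so the condition reads the same way the question defines a total line. It also returns a new frame instead of reassigning `df`.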