Rahul shrivastava - 1 year ago 74

Python Question

My table looks like this:

`In [82]:df.head()`

Out[82]:

MatDoc MatYr MvT Material Plnt SLoc Batch Customer AmountLC Amount ... PO MatYr.1 MatDoc.1 Order ProfitCtr SLED/BBD PstngDate EntryDate Time Username

0 4912693062 2015 551 100062 HDC2 0001 5G30MC1A11 NaN 9.03 9.06 ... NaN NaN NaN NaN IN1165B085 26.01.2016 01.08.2015 01.08.2015 01:13:16 O33462

1 4912693063 2015 501 166 HDC2 0004 NaN NaN 0.00 0.00 ... NaN NaN NaN NaN IN1165B085 NaN 01.08.2015 01.08.2015 01:13:17 O33462

2 4912693320 2015 551 101343 HDC2 0001 5G28MC1A11 NaN 53.73 53.72 ... NaN NaN NaN NaN IN1165B085 25.01.2016 01.08.2015 01.08.2015 01:16:30 O33462

Here, I need to group by data on

`Order`

`AmountLC`

`Order`

`MvT101group`

`MvT102group`

`Order`

`MvT102group`

`MvT101group`

`Order|Plnt|Material|Batch|Sum101=SumofMvt101ofAmountLC|Sum102=SumofMvt102ofAmountLC|(Sum101-Sum102)/100`

What I have done is first I made new df containing only 101 and 102:

`Mvt101`

`MvT102`

`MvT101 = df.loc[df['MvT'] == 101]`

`MvT102 = df.loc[df['MvT'] == 102]`

Then I grouped it by

`Order`

`MvT101group = MvT101.groupby('Order', sort=True)`

`In [76]:`

MvT101group[['AmountLC']].sum()

Out[76]:

Order AmountLC

1127828 16348566.88

1127829 22237710.38

1127830 29803745.65

1127831 30621381.06

1127832 33926352.51

`MvT102group = MvT102.groupby('Order', sort=True)`

`In [77]:`

MvT102group[['AmountLC']].sum()

Out[77]:

Order AmountLC

1127830 53221.70

1127831 651475.13

1127834 67442.16

1127835 2477494.17

1128622 218743.14

After this I am not able to understand how should I write my query.

Please ask me any further details if you want.Here is the CSV file from where I am working Link

Answer Source

Hope I understood the question correctly. After grouping both groups as you did:

```
MvT101group = MvT101.groupby('Order',sort=True).sum()
MvT102group = MvT102.groupby('Order',sort=True).sum()
```

You can update the columns' names for both groups:

```
MvT101group.columns = MvT101group.columns.map(lambda x: str(x) + '_101')
MvT102group.columns = MvT102group.columns.map(lambda x: str(x) + '_102')
```

Then merge all 3 tables so that you will have all 3 columns in the main table:

```
df = df.merge(MvT101group, left_on=['Order'], right_index=True, how='left')
df = df.merge(MvT102group, left_on=['Order'], right_index=True, how='left')
```

And then you can add the calculated column:

```
df['calc'] = (df['Order_101']-df['Order_102']) / 100
```