staten12 staten12 - 1 month ago 11
Python Question

I need to apply multiple equations on a pandas column based on certain criteria

I have a dataframe which requires multiple equations based on certain criteria. I need to take the first 3 letters of an identifier, then, if it is True, I need to divide the value associated with that row by a certain amount.

The dataframe is as follows:

ID Value
US123 10000
US121 10000
MX122 10000
MX125 10000
BR123 10000
BR127 10000


I need to divide the value by 100 if the ID starts with 'MX', and divide the value by 1000 if the ID starts with 'BR'. All other values need to remain the same.

I also do not want to create a new filtered dataframe. I have had success filtering by ID then doing the logic checks, but I need to apply it over a much larger frame.

This is the code I am using for the filtered frame.

filtered['Value'] = np.where(filtered.ID.apply(lambda x: x[:3]).isin(['MX']) == True, filtered.Value/100, filtered.Value/1000)


I've also tried df.loc but I cannot figure out how to apply the changes to the dataframe, it seems to only show me a series of data but not apply it to the DF.

That code is here:

df.loc[(df['ID'].str.contains("MX") == True), 'Value']/100
df.loc[(df['ID'].str.contains("BR") == True), 'Value']/1000


Is there any better way to do this? How can I apply the changes using df.loc to the main dataframe rather than have it appear in a series?

The desired output should be:

ID Value
US123 10000
US121 10000
MX122 100
MX125 100
BR123 10
BR127 10


Thanks!

Answer

After computing the divided values using .loc, it must be re-assigned back to the DF used to make the selection as the operation is not inplace by itself.

Use str.startswith to check for string starting with a given pattern.

df.loc[df['ID'].str.startswith('MX'), 'Value'] /= 100
df.loc[df['ID'].str.startswith('BR'), 'Value'] /= 1000
df['Value'] = df['Value'].astype(int)
df

enter image description here