user6146748 - 1 month ago
Python Question

Conditional statement and split in a Dataframe

I am looking for a conditional statement in Python that checks a specified column for a certain value and writes the result to a new column.

Here is an example of my dataset:

OBJECTID CODE_LITH
1 M4,BO
2 M4,BO
3 M4,BO
4 M1,HP-M7,HP-M1


and what I want as results:

OBJECTID CODE_LITH M4 M1
1 M4,BO 1 0
2 M4,BO 1 0
3 M4,BO 1 0
4 M1,HP-M7,HP-M1 0 1


What I have done so far:

import pandas as pd
import numpy as np
lookup = ['M4']
df.loc[df['CODE_LITH'].str.isin(lookup),'M4'] = 1
df.loc[~df['CODE_LITH'].str.isin(lookup),'M4'] = 0


Since there are multiple values per row in "CODE_LITH", the script does not seem able to find "M4" on its own. It only matches the full string "M4,BO" when putting 1 or 0 in the new column.

I have also tried:

if ('M4') in df['CODE_LITH']:
    df['M4'] = 0
else:
    df['M4'] = 1


With the same results.
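
A likely reason both attempts behave this way, shown as a small sketch on a two-row sample rather than the real data: `Series.isin` compares each whole cell against the lookup list, so "M4,BO" never equals "M4", and the `in` operator on a Series tests the index labels, not the values. Also, `isin` is a method of the Series itself and not of the `.str` accessor. The variable names below are made up for illustration.

```python
import pandas as pd

# Two sample cells in the same shape as the CODE_LITH column.
s = pd.Series(["M4,BO", "M1,HP-M7,HP-M1"], name="CODE_LITH")

# `in` on a Series checks the index labels (here 0 and 1), not the values.
in_labels = "M4" in s
label_hit = 0 in s

# isin() tests whole-cell equality, so a cell holding "M4,BO" is not a match for "M4".
exact = s.isin(["M4"])

# A substring test per cell is what the question is actually after.
substring = s.str.contains("M4", regex=False)

print(in_labels, label_hit)
print(exact.tolist(), substring.tolist())
```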

Thanks for your help.

PS. The dataframe contains about 2.6 million rows and I need to do this operation for 30-50 variables.

Answer

I was able to do:

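# Assumes df has a default 0..n-1 index, so enumerate's position matches the row label.
for index, data in enumerate(df['CODE_LITH']):
    # Plain substring test: "I1" would also match inside a longer code such as "I10".
    if "I1" in data:
        df.loc[index, 'Plut_Felsic'] = 1  # .loc avoids chained assignment
    else:
        df.loc[index, 'Plut_Felsic'] = 0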

It does work, but takes quite some time to calculate.
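
A vectorized alternative is probably much faster on a few million rows than the row-by-row loop. The sketch below uses `Series.str.contains` with a word-boundary regex, once per code. It assumes codes are separated by commas or hyphens, as in the sample data, and that a code should only match as a whole token (so "M1" does not match "M10"). The `codes` list is a placeholder for the real 30-50 codes.

```python
import re
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "OBJECTID": [1, 2, 3, 4],
    "CODE_LITH": ["M4,BO", "M4,BO", "M4,BO", "M1,HP-M7,HP-M1"],
})

# Placeholder list; extend it with the codes actually needed.
codes = ["M4", "M1"]

for code in codes:
    # \b marks a word boundary, so the code must stand alone between
    # separators such as "," or "-" (an assumption about the data format).
    pattern = r"\b" + re.escape(code) + r"\b"
    # na=False treats missing cells as "no match"; astype(int) gives 1/0.
    df[code] = df["CODE_LITH"].str.contains(pattern, regex=True, na=False).astype(int)

print(df)
```

Each pass scans the whole column in one call, so the Python-level loop runs once per code and not once per row.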