Marcus Renno - 1 month ago
Python Question

Add a column to a DataFrame if a modified value of another column exists in the DataFrame

I'm trying to add one column in my dataframe (DF) according to another column value and whether that value is in my DF or not.


>>> import pandas as pd
>>> d = {'one': pd.Series(['aa', 'bb', 'cc', 'aa-01', 'bb-02', 'dd'])}
>>> df = pd.DataFrame(d)
>>> df
     one
0     aa
1     bb
2     cc
3  aa-01
4  bb-02
5     dd

I would like the new column to be True for an element if the DataFrame contains another element equal to the current element with -01 or -02 appended.

Example: in this dataframe only the elements 'aa' and 'bb' have counterparts with an appended suffix ('aa-01' and 'bb-02'), so only 'aa' and 'bb' will have the value True in the new column.

Expected result:

>>> expected_df
     one    two
0     aa   True
1     bb   True
2     cc  False
3  aa-01  False
4  bb-02  False
5     dd  False

I believe I have to apply a function row by row, but I can't figure out a way to modify the row and do the lookup at the same time within that function.

You can create a boolean mask that selects the elements containing "-01" or "-02". Then split the selected elements on the "-" character, take the first part of each, convert the result to a list, and pass that list to isin.

mask = df['one'].str.contains('-01|-02')   # Can use df['one'].str.endswith(('-01','-02'))
df['two'] = df['one'].isin(df.loc[mask, 'one'].str.split('-').str[0].tolist())
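
To see what each step produces, the intermediate values can be printed. This is a minimal sketch that assumes the sample data from the question; the names `selected` and `bases` are illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({'one': ['aa', 'bb', 'cc', 'aa-01', 'bb-02', 'dd']})

# Boolean mask: True for values that contain one of the suffixes
mask = df['one'].str.contains('-01|-02')

# The values picked out by the mask (illustrative name)
selected = df.loc[mask, 'one']

# Part before the first '-' of each selected value (illustrative name)
bases = selected.str.split('-').str[0].tolist()
print(bases)  # ['aa', 'bb']

# True where the value itself is one of the base names
df['two'] = df['one'].isin(bases)
print(df)
```

Only 'aa' and 'bb' end up in `bases`, so only those two rows are flagged.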


More robust approach:

mask = df['one'].str.endswith(('-01','-02'))
# Strip the 3-character suffix from the matching values and test membership
df['two'] = df['one'].isin(df.loc[mask, 'one'].str[:-3])
print(df['two'])
0     True
1     True
2    False
3    False
4    False
5    False
Name: two, dtype: bool
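
If the suffixes vary in length, slicing off a fixed three characters no longer works. One alternative, which is not the method from the answer above, is to check in the forward direction: for each value, test whether the value plus a suffix is present. The sketch below assumes every value is a string (no missing values); the function name `has_suffixed_variant` and its `suffixes` parameter are hypothetical.

```python
import pandas as pd

def has_suffixed_variant(values, suffixes=('-01', '-02')):
    """Return a boolean Series that is True where value + some suffix also occurs.

    Hypothetical helper; assumes every element of `values` is a string.
    """
    values = pd.Series(values)
    present = set(values)  # set gives fast membership tests
    return values.map(lambda v: any(v + s in present for s in suffixes))

df = pd.DataFrame({'one': ['aa', 'bb', 'cc', 'aa-01', 'bb-02', 'dd']})
df['two'] = has_suffixed_variant(df['one'])
print(df['two'].tolist())  # [True, True, False, False, False, False]
```

Passing a different tuple, for example `suffixes=('-a', '-beta')`, works without changing the function, at the cost of a Python-level loop instead of vectorized string operations.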