I'm stuck on particluar python question here. I have 2 dataframes DF1 and DF2. In both, I have 2 columns pID and yID (which are not indexed, just default). I'm look to add a column Found in DF1 where the respective values of columns (pID and yID) were found in DF2. Also, I would like to zone in on just values in DF2 where aID == 'Text'.
I believe the below gets me the 1st part of this question; however, I'm unsure how as to incorporate the where.
DF1['Found'] = (DF1[['pID', 'yID']] == DF2[['pID','yID']]).all(axis=1).astype(bool)
You could subset the second dataframe containing
aID == 'Text' to get a reduced
DF from which select those portions of columns to be compared against the first dataframe.
DF.isin() to check if the values that are present under these column names match or not. And,
True if both the columns happen to be
True, else they become
False. Convert the boolean series to integers via
astype(int) and assign the result to the new column,
df1_sub = df1[['pID', 'yID']] df2_sub = df2.query('aID=="Text"')[['pID', 'yID']] df1['Found'] = df1_sub.isin(df2_sub).all(axis=1).astype(int) df1
df1 = pd.DataFrame(dict(pID=[1,2,3,4,5], yID=[10,20,30,40,50])) df2 = pd.DataFrame(dict(pID=[1,2,8,4,5], yID=[10,12,30,40,50], aID=['Text','Best','Text','Best','Text']))