user3471881 - 1 month ago
Python Question

Add 2nd column when column matches in str.contains

I want to loop over searchList and check whether the column text contains (via str.contains) one or more of the search words (each searchWord). If I get a match I want to append the data to masterdf, which is easily accomplished as seen below. But I also want to add a new column with searchWord so that I know which text matched with what. The code below fills the entire searchWord column with the latest search that matched.

masterdf = pd.DataFrame(columns=['doc_id','text',])

for searchWord in searchList:
    search = jsons_data[jsons_data['text'].str.contains(searchWord)]
    if len(search) > 0:
        masterdf = masterdf.append(search)
        masterdf['searchWord'] = searchWord

Answer

I think this is what you are after.

Let's set up example data:

import pandas as pd

tt = '''I want to search through the. searchList and check if column text str.contains one or more of each searchWord. If I get a match I want to append the data to masterdf which is easily accomplished as seen below. But I also want to add a new column with searchWord so that I know which text matched with what. This code below fills the column searchWord with the. latest search that matched'''
text_col = tt.split('.')
id_col = range(len(text_col))
jsons_data = pd.DataFrame({'doc_id':id_col,'text':text_col})

searchList = ['code','fills', 'But','also','want']

The example jsons_data is

    doc_id  text
0   0       I want to search through the
1   1       searchList and check if column text str
2   2       contains one or more of each searchWord
3   3       If I get a match I want to append the data to...
4   4       But I also want to add a new column with sear...
5   5       This code below fills the column searchWord w...
6   6       latest search that matched
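
As a quick check of the building block used here: str.contains on a single word returns a boolean mask, and indexing with that mask keeps only the matching rows. The small frame below is a made-up stand-in for jsons_data, and regex=False is an assumption on my part that the search words are meant as literal strings (by default the pattern is treated as a regular expression).

```python
import pandas as pd

# Hypothetical stand-in for jsons_data, three short rows
demo = pd.DataFrame({
    'doc_id': [0, 1, 2],
    'text': ['I want to search', 'But I also want this', 'latest search'],
})

# Boolean mask: True where the text contains the literal substring 'want'
mask = demo['text'].str.contains('want', regex=False)

# Indexing with the mask keeps only the matching rows
matches = demo[mask]
print(matches)
```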

Modifying your code so that search['searchWord'] = searchWord is set on the matching rows before they are appended, we get:

masterdf = pd.DataFrame(columns=['doc_id','text','searchWord'])

for searchWord in searchList:
    search = jsons_data[jsons_data['text'].str.contains(searchWord)]
    if len(search) > 0:
        search['searchWord'] = searchWord
        masterdf = masterdf.append(search)

And masterdf is

    doc_id  text                                                searchWord
5   5.0     This code below fills the column searchWord w...    code
5   5.0     This code below fills the column searchWord w...    fills
4   4.0     But I also want to add a new column with sear...    But
4   4.0     But I also want to add a new column with sear...    also
0   0.0     I want to search through the                        want
3   3.0     If I get a match I want to append the data to...    want
4   4.0     But I also want to add a new column with sear...    want
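
Two side notes on the loop above, offered as a sketch rather than a drop-in replacement. DataFrame.append was deprecated and then removed in pandas 2.0, and assigning a column on a filtered slice can raise a SettingWithCopyWarning. One way around both is to tag each subset with assign (which returns a new frame) and combine the pieces once with pd.concat. The small jsons_data and searchList below are made-up stand-ins, and regex=False assumes the search words are literal strings.

```python
import pandas as pd

# Hypothetical stand-in data (not the example frame above)
jsons_data = pd.DataFrame({
    'doc_id': [0, 1, 2],
    'text': ['I want to search', 'But I also want this', 'This code fills it'],
})
searchList = ['code', 'also', 'want', 'missing']

pieces = []
for searchWord in searchList:
    # Rows whose text contains the literal search word
    search = jsons_data[jsons_data['text'].str.contains(searchWord, regex=False)]
    if len(search) > 0:
        # assign returns a new frame, so the filtered slice is never modified
        pieces.append(search.assign(searchWord=searchWord))

# Combine once at the end; fall back to an empty frame if nothing matched
if pieces:
    masterdf = pd.concat(pieces)
else:
    masterdf = pd.DataFrame(columns=['doc_id', 'text', 'searchWord'])
print(masterdf)
```

Because the original index is kept, a row that matches several words still shows up once per word, just as in the output above.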