vaylain - 4 months ago
Python Question

How to find an exact sequence of words in lists using Python 3?

I am coding in Python 3 on a Windows platform.

I am writing a function that takes a sentence entered by the user, calls .split() on it, and turns it into a list of the words in the original sentence.

The function also takes a predefined list of word patterns, and it should report each pattern whose exact sequence of words appears in the user's sentence.

To be clear, I can already use .intersection() to find which individual words match, but I am looking for an exact sequence of words.

For instance, if the user inputs "I love hairy cats" and the predefined list of keywords is ["I love", "hairy cats", "I love cats", "love hair"], my function should return only "I love" and "hairy cats", since only these two appear in the sentence as the same sequence of words.

Here is my code thus far:

def parse_text(message, keywords):
    newList = []
    Message = message.split()
    Keywords = keywords # Keywords need to be a list type
    setMessage = set(word for word in Message)
    setKeywords = set(word for word in Keywords)
    newList = setMessage.intersection(setKeywords)

    return newList


This works as long as my keywords list contains only single words. The problem appears when a keyword contains multiple words to denote a sequence.

For example, given this message and these two keyword lists:

message = "Hello world, yes and no"

keywords = ["help", "Hello", "yes", "so"] # this works: the intersection is "Hello" and "yes"

keywords = ["help me", "Hello mom", "yes and no", "so"] # this does not work: it just returns an empty "set()"


Any ideas on how I can adjust my function to check the user's original sentence for a specific sequence of words as they appear in my keyword list?

Answer

Why use sets at all? Since each keyword is a phrase, you can test it directly against the original message string with the in operator, which checks for a substring:

def parse_text(message, keywords):
    newList = []
    for keyword in keywords:
        if keyword in message:
            newList.append(keyword)
    return newList

or, using list comprehensions for more succinctness:

def parse_text(message, keywords):
    return [keyword for keyword in keywords if keyword in message]
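
One caveat: in on strings is a substring test, so "love hair" would also be reported for "I love hairy cats" because "love hair" occurs inside "love hairy". If only whole words should match, one possible approach is to compare lists of words instead of raw strings. The sketch below is an assumption about what is wanted, not part of the original answer: the names parse_text_words and _words are hypothetical, punctuation is stripped from the ends of each word, and matching is case-sensitive.

```python
import string


def _words(text):
    """Split on whitespace and strip leading/trailing punctuation from each token."""
    tokens = (token.strip(string.punctuation) for token in text.split())
    return [token for token in tokens if token]


def parse_text_words(message, keywords):
    """Return the keywords whose words appear consecutively, as whole words, in message."""
    message_words = _words(message)
    matches = []
    for keyword in keywords:
        keyword_words = _words(keyword)
        n = len(keyword_words)
        if n == 0:
            continue  # skip empty keywords
        # Slide a window of n words across the message and compare word lists.
        if any(message_words[i:i + n] == keyword_words
               for i in range(len(message_words) - n + 1)):
            matches.append(keyword)
    return matches


print(parse_text_words("I love hairy cats",
                       ["I love", "hairy cats", "I love cats", "love hair"]))
# ['I love', 'hairy cats']
```

With the message from the question, parse_text_words("Hello world, yes and no", ["help me", "Hello mom", "yes and no", "so"]) returns ['yes and no'], since the comma after "world" is stripped before comparison.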