mattyd2 mattyd2 -4 years ago 129
Python Question

How to not match whole word "king" to "king?"?

How do I verify that an exact word occurs in a string?

I need to account for cases where a word such as "king" is immediately followed by a question mark, as in the examples below.

Unigram (this should be False):

In [1]: answer = "king"
In [2]: context = "we run with the king? on sunday"


n-gram (this should be False):

In [1]: answer = "king tut"
In [2]: context = "we run with the king tut? on sunday"


Unigram (this should be True):

In [1]: answer = "king"
In [2]: context = "we run with the king on sunday"


n-gram (this should be True):

In [1]: answer = "king tut"
In [2]: context = "we run with the king tut on sunday"


As others have mentioned, the unigram case can be handled by splitting the string into a list of words, but that does not work for n-grams.
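To make the limitation concrete, here is a minimal sketch of the splitting approach. The helper name contains_token is hypothetical and used only for illustration. It handles single words correctly, but a multi-word answer can never equal a single token, so the valid n-gram case wrongly comes out False.

```python
def contains_token(answer, context):
    """Return True if `answer` equals one whitespace-separated token of `context`.

    Hypothetical helper for illustration only.
    """
    return answer in context.split()


# Unigrams behave as wanted: "king?" is a different token from "king".
print(contains_token("king", "we run with the king on sunday"))       # True
print(contains_token("king", "we run with the king? on sunday"))      # False

# n-grams fail: no single token can equal "king tut", so this is False
# even though the phrase is present and should match.
print(contains_token("king tut", "we run with the king tut on sunday"))  # False
```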

After reading some posts, I think I should try to handle this with a lookbehind, but I'm not sure.

Answer

Use a regular expression like this:

import re

reg_answer = re.compile(r"(?<!\S)" + re.escape(answer) + r"(?!\S)")


Details:

  • (?<!\S) - a negative lookbehind that ensures the match is preceded by whitespace or the start of the string
  • re.escape(answer) - a preprocessing step that makes any special characters in the search phrase match as literal characters
  • (?!\S) - a negative lookahead that ensures the match is followed by whitespace or the end of the string
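As a quick check, the pattern can be wrapped in a small function and run against the four cases from the question. This is only a sketch: the function name contains_exact and the variable word_boundary_hit are assumptions introduced for illustration. The last lines show why a plain \b word boundary would not be enough here, since "?" is a non-word character and \b still matches between "g" and "?".

```python
import re


def contains_exact(answer, context):
    """Return True if `answer` occurs in `context` delimited by whitespace
    (or the string boundaries) on both sides.

    Hypothetical wrapper around the pattern from the answer.
    """
    pattern = re.compile(r"(?<!\S)" + re.escape(answer) + r"(?!\S)")
    return pattern.search(context) is not None


cases = [
    ("king", "we run with the king? on sunday"),          # expected False
    ("king tut", "we run with the king tut? on sunday"),  # expected False
    ("king", "we run with the king on sunday"),           # expected True
    ("king tut", "we run with the king tut on sunday"),   # expected True
]

for answer, context in cases:
    print(repr(answer), "->", contains_exact(answer, context))

# For comparison: a \b-based pattern accepts "king?" because "?" is not a
# word character, so a word boundary exists right after "king".
word_boundary_hit = (
    re.search(r"\b" + re.escape("king") + r"\b", "we run with the king? on sunday")
    is not None
)
print("word boundary matches 'king?':", word_boundary_hit)  # True
```

The whitespace lookarounds are stricter than \b: any adjacent non-whitespace character, including punctuation, blocks the match.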