I'm trying to create a regex that matches a third person form of a verb created using the following rule:
If the verb ends in e not preceded by i, o, s, x, z, ch, sh, add s.
You may use

\b(?=\w*(?<![iosxz])(?<![cs]h)es\b)\w*
See the regex demo
Python's re module does not support alternatives of different lengths inside a lookbehind, so you need to split the conditions into two lookbehinds here.
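To see the lookbehind limitation in practice, here is a small sketch. The combined pattern in the `try` block is only an illustration of what one might be tempted to write, and the test words are arbitrary picks.

```python
import re

# A single lookbehind whose alternatives differ in length (1 char vs 2 chars)
# is rejected by the standard-library re module at compile time.
try:
    re.compile(r'(?<![iosxz]|[cs]h)es\b')
    single_lookbehind_ok = True
except re.error as exc:
    single_lookbehind_ok = False
    print("rejected:", exc)

# Two fixed-width lookbehinds placed side by side compile fine;
# both must succeed at the same position.
split = re.compile(r'(?<![iosxz])(?<![cs]h)es\b')
print(split.search("likes") is not None)    # True
print(split.search("matches") is not None)  # False
```

Because both negative lookbehinds are checked at the same position, chaining them behaves like "neither of these may precede", which is what the single variable-length lookbehind was meant to express.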
\b - a leading word boundary
(?=\w*(?<![iosxz])(?<![cs]h)es\b) - a positive lookahead requiring a sequence of:
    \w* - 0+ word chars
    (?<![iosxz]) - there must not be i, o, s, x or z chars right before the current location and...
    (?<![cs]h) - no ch or sh right before the current location...
    es - followed with es...
    \b - at the end of the word
\w* - zero or more (maybe + is better here to match 1 or more) word chars.
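The breakdown above can also be written straight into the pattern with re.VERBOSE, which ignores whitespace and allows comments inside the regex. This is just a sketch: the name THIRD_PERSON_ES and the word list are made up for illustration.

```python
import re

# Same pattern, spread out with re.VERBOSE so each part carries a comment.
THIRD_PERSON_ES = re.compile(r"""
    \b                  # leading word boundary
    (?=                 # lookahead: the word must end the way we want
        \w*             #   any word chars up to the ending
        (?<![iosxz])    #   the ending is not preceded by i, o, s, x or z
        (?<![cs]h)      #   ... nor by ch or sh
        es\b            #   the ending itself: "es" at the end of the word
    )
    \w*                 # consume the whole word
""", re.VERBOSE)

words = ["likes", "hates", "bathes", "matches", "does", "wishes"]
matched = [w for w in words if THIRD_PERSON_ES.fullmatch(w)]
print(matched)  # ['likes', 'hates', 'bathes']
```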
See Python demo:
    import re

    r = re.compile(r'\b(?=\w*(?<![iosxz])(?<![cs]h)es\b)\w*')
    s = 'it matches "likes", "hates" etc. However, it does not match "bathes", why doesn\'t it?'
    print(re.findall(r, s))
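As a quick check of the `\w*` versus `\w+` remark, the sketch below runs both variants over the same sentence. The names star and plus are arbitrary. Since the lookahead already requires "es" ahead of the current position, the final quantifier always consumes at least two characters, so both variants should return the same words.

```python
import re

s = 'it matches "likes", "hates" etc. However, it does not match "bathes", why doesn\'t it?'

# Identical patterns except for the final quantifier.
star = re.compile(r'\b(?=\w*(?<![iosxz])(?<![cs]h)es\b)\w*')
plus = re.compile(r'\b(?=\w*(?<![iosxz])(?<![cs]h)es\b)\w+')

star_hits = star.findall(s)
plus_hits = plus.findall(s)
print(star_hits)               # ['likes', 'hates', 'bathes']
print(star_hits == plus_hits)  # True
```

Note that "bathes" is matched: the characters before "es" are "th", which is neither ch nor sh, and h is not in [iosxz].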