Chris - 5 months ago
Python Question

Can re.fullmatch() eliminate the need for string anchors in regex?

Consider the following regex, which checks for password strength. It has the start and end string anchors, to ensure it's matching the entire string.

import re

# Anchored version: ^ and $ force the match to span the whole string.
pattern = re.compile(r'^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[$@$!%*#?&.])[A-Za-z\d$@$!%*#?&.]{8,}$')
while True:
    user_pass = input('Enter a secure password: ')
    if re.match(pattern, user_pass):
        print('Successfully changed password')
        break
    else:
        print('Not secure enough. Ensure pass is at least 8 characters long with at least one upper and lowercase letter, number,'
              ' and special character.')


I noticed Python 3.5 has a re.fullmatch() which appears to do the same thing, but without the string anchors:

import re

# Unanchored version: fullmatch() itself requires the whole string to match.
pattern = re.compile(r'(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[$@$!%*#?&.])[A-Za-z\d$@$!%*#?&.]{8,}')
while True:
    user_pass = input('Enter a secure password: ')
    if re.fullmatch(pattern, user_pass):
        print('Successfully changed password')
        break
    else:
        print('Not secure enough. Ensure pass is at least 8 characters long with at least one upper and lowercase letter, number,'
              ' and special character.')


Is this the intended purpose of fullmatch? Are there any situations where this could cause unintended issues?

Answer

The fullmatch() function and regex.fullmatch() method are new in Python 3.4.

The changelog is very explicit about it:

This provides a way to be explicit about the goal of the match, which avoids a class of subtle bugs where $ characters get lost during code changes or the addition of alternatives to an existing regular expression.
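The class of bug the changelog describes can be sketched roughly as follows. The patterns here are hypothetical and chosen only for illustration: an anchored "digits only" check is later extended with a second alternative, and each anchor silently ends up bound to a single branch.

```python
import re

# Hypothetical patterns, not taken from the question.
anchored = re.compile(r'^\d+$')          # original: digits only
extended = re.compile(r'^\d+|[a-z]+$')   # edited: anchors now apply to one branch each

# The original pattern rejects trailing junk, as intended.
original_rejects = anchored.search('123abc!!') is None

# The edited pattern accepts strings it was never meant to accept.
leaky_prefix = extended.search('123abc!!') is not None  # '^\d+' alone matches '123'
leaky_suffix = extended.search('!!abc') is not None     # '[a-z]+$' alone matches 'abc'

# With fullmatch() no anchors are needed, and adding an alternative stays safe.
plain = re.compile(r'\d+|[a-z]+')
strict_results = [
    plain.fullmatch(s) is not None
    for s in ('123', 'abc', '123abc!!', '!!abc')
]
print(original_rejects, leaky_prefix, leaky_suffix, strict_results)
```

Because fullmatch() applies to the pattern as a whole, the top-level alternation does not need an extra group to keep the "entire string" requirement intact.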

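So, the way you use it is indeed the intended purpose of this feature. It should not lead to unexpected issues: fullmatch() succeeds only when the pattern matches the entire string, much as if the pattern were wrapped in \A(?:...)\Z. The one difference from ^...$ worth knowing is that $ also matches just before a trailing newline, whereas fullmatch() does not accept one.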
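A small sketch for comparing the two approaches on an edge case, using the patterns from the question. The candidate string with a trailing newline is an assumption made for illustration; input() strips the trailing newline, so the loop in the question would not normally see such a value.

```python
import re

# Patterns copied from the question: one anchored, one not.
with_anchors = re.compile(
    r'^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[$@$!%*#?&.])[A-Za-z\d$@$!%*#?&.]{8,}$')
without_anchors = re.compile(
    r'(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[$@$!%*#?&.])[A-Za-z\d$@$!%*#?&.]{8,}')

# Hypothetical input, e.g. a line read from a file without stripping.
candidate = 'Passw0rd!\n'

# match() with ^...$ accepts it, because $ also matches before a final newline.
anchored_ok = with_anchors.match(candidate) is not None

# fullmatch() rejects it, because the newline would be left unmatched.
fullmatch_ok = without_anchors.fullmatch(candidate) is not None

# Both approaches agree once the newline is removed.
clean = candidate.rstrip('\n')
agree_on_clean = (with_anchors.match(clean) is not None
                  and without_anchors.fullmatch(clean) is not None)
print(anchored_ok, fullmatch_ok, agree_on_clean)
```

If anything, fullmatch() is the stricter of the two here, which is usually what a validation check wants.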
