Paradoxis - 8 months ago
Python Question

Python re.compile curly quotes issue

I'm writing an application that uses a framework to match certain phrases. It is currently supposed to match the following regex pattern:

Say \"(.*)\"

However, I've noticed that my users are complaining that their OS sometimes pastes in 'curly quotes' when they copy and paste text. As a result, users end up providing sentences like the following:

Say "Hello world!" <-- Matches
Say “Hello world!” <-- Doesn't match!

Is there any way I can tell Python's regular expressions to treat these curly quotes the same as regular quotes?


You could just include the curly quotes in the regex:

Say [\"“”](.*)[\"“”]

Here is a session you can reproduce in the Python REPL:

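>>> import re
>>> test_str = r'"Hello"'
>>> reg = r'["“”](.*)["“”]'
>>> m = re.search(reg, test_str)
>>> m.group(1)
'Hello'
>>> test_str = r'“Hello world!”'
>>> m = re.search(reg, test_str)
>>> m.group(1)
'Hello world!'

The output above is from Python 3. On Python 2, plain string literals are byte strings, so the character class matches individual UTF-8 bytes and m.group(1) comes back as '\x80\x9cHello world!\xe2\x80'. Use unicode strings (u'...') for both the pattern and the input there.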
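If you would rather leave the original pattern untouched, another option is to normalize the input before matching. The following is only a minimal sketch, assuming Python 3; the names QUOTE_MAP, SAY_PATTERN and match_say are made up for illustration and are not part of any framework:

```python
import re

# Hypothetical helper table: map the curly double quotes
# (U+201C and U+201D) to the plain ASCII double quote.
QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"'})

# Equivalent to the pattern from the question (Say \"(.*)\").
SAY_PATTERN = re.compile(r'Say "(.*)"')


def match_say(text):
    """Return the quoted phrase after 'Say', or None if there is no match."""
    # Replace curly quotes first so the original pattern can stay as it is.
    normalized = text.translate(QUOTE_MAP)
    m = SAY_PATTERN.match(normalized)
    return m.group(1) if m else None


print(match_say('Say "Hello world!"'))            # straight quotes
print(match_say('Say \u201cHello world!\u201d'))  # curly quotes
```

Both calls should return the same phrase. The trade-off is that normalizing also rewrites any curly quotes inside the captured text, so if you need those preserved, the character-class approach above is the better fit.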