Sai Krishna Deep - 4 months ago
Python Question

Regex with multiple capturing groups doesn't match as defined

I'm testing a regex in Python. The code below doesn't match anything. I want to match "Turkey", but it doesn't return a match at all. I've spent almost an hour on this without understanding why it doesn't work!

import re

regex = r'\s*\(aka\s(.*)\s((?:19|20)[0-9][0-9])'
line = " (aka Turkey (1955)) (USA) (short title)"
match = re.search(regex, line)
if match:
    print(match.groups())


Output : https://repl.it/CfWa

Answer

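The problem with r'\s*\(aka\s(.*)\s((?:19|20)[0-9][0-9])' is that the parentheses around the year are not escaped, so they form a capturing group instead of matching the literal "(" and ")" in the input. The pattern therefore expects whitespace immediately before the year, but the input has "(" there, so nothing matches. If you use r'\s*\(aka\s(.*)\s*\((?:19|20)[0-9][0-9]\)' instead, you will match "Turkey " (with a trailing space, because .* is greedy), so I suggest using something like r'\s*\(aka\s([^\s]*)\s*\((?:19|20)[0-9][0-9]\)'.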
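
As a rough sketch of the fix, the snippet below escapes the literal parentheses and adds an inner unescaped pair so the year is captured as well. The helper name parse_aka and the lazy (.*?) for the title are assumptions made for illustration, not part of the answer above; the lazy group is one way to allow multi-word titles, which [^\s]* would not match.

```python
import re

# \( and \) match the literal parentheses around the year;
# the inner unescaped pair captures the year itself.
# (.*?) is lazy, so the title stops at the first "(year)" that follows
# and the \s* after it absorbs the trailing space.
AKA_PATTERN = re.compile(r'\s*\(aka\s(.*?)\s*\(((?:19|20)[0-9]{2})\)')


def parse_aka(line):
    """Return (title, year) from an '(aka Title (YYYY))' fragment, or None.

    Hypothetical helper for illustration only.
    """
    match = AKA_PATTERN.search(line)
    if match is None:
        return None
    return match.groups()


if __name__ == "__main__":
    line = " (aka Turkey (1955)) (USA) (short title)"
    print(parse_aka(line))
```

Returning None when nothing matches keeps the caller's check explicit, much like the `if match:` test in the question.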
