J.Joe - 6 months ago
Python Question

Python Regular Expression Groups

Why does this regex print ('c',)?

I thought
"([abc])+" === "([abc])([abc])([abc])..."

>>> import re
>>> m = re.match("([abc])+", "abc")
>>> print(m.groups())
('c',)
>>> m.groups(0)
('c',)
>>> m = re.match("[abc]+", "abc")
>>> m.groups()
()
>>> m.groups(0)
()


From the documentation for groups():

Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern. The default argument is used for groups that did not participate in the match; it defaults to None.
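
To make the default argument concrete, here is a minimal sketch. The pattern (a)|(b) is a hypothetical example, not from the question; it is chosen only because one of its two groups can never participate in a given match.

```python
import re

# Hypothetical pattern: two alternatives, so only one group participates.
m = re.match(r"(a)|(b)", "a")

# With no argument, groups that did not participate are reported as None.
unmatched_as_none = m.groups()

# Passing 0 substitutes 0 for groups that did not participate.
unmatched_as_zero = m.groups(0)

print(unmatched_as_none)  # ('a', None)
print(unmatched_as_zero)  # ('a', 0)
```

This is why m.groups(0) in the question prints the same thing as m.groups(): the argument only affects groups that did not take part in the match, and it has nothing to do with group number 0.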

In the first regex, ([abc])+, each repetition matches one character (a, b, or c), but the group stores only the last character it matched.

The character class [abc] matches a or b or c.
Look carefully: the capturing parentheses surround only the character class, and the + sits outside them.
So the group can hold only one character at a time, and each repetition overwrites the previous capture. For "abc" that leaves c.
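
A short sketch of that behaviour, using the pattern and input from the question. The re.findall call at the end is one possible way to collect every character and is an addition for illustration, not something the question used.

```python
import re

text = "abc"

# The single group is overwritten on each repetition of +,
# so only the last matched character survives.
m = re.match(r"([abc])+", text)
last_only = m.groups()
whole_match = m.group(0)

print(last_only)    # ('c',)
print(whole_match)  # abc

# One option for getting every character: match them one at a time.
each_char = re.findall(r"[abc]", text)
print(each_char)    # ['a', 'b', 'c']
```

Note that the overall match is still "abc"; only the contents of group 1 are limited to the final character.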

If you want to capture the whole string abc in a single capturing group, put the quantifier inside the parentheses:

([abc]+)

This matches a run of characters drawn from a, b, and c and stores the entire run in the capturing group.
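
As a quick check, assuming the pattern meant here is ([abc]+) with the + inside the parentheses, the group now holds the full run. The input "abcd" is a made-up example that shows where the run stops.

```python
import re

# The quantifier is inside the group, so the group spans the whole run.
m = re.match(r"([abc]+)", "abcd")

captured = m.groups()
first_group = m.group(1)

print(captured)     # ('abc',)
print(first_group)  # abc
```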

In the second regex, [abc]+, there are no capturing groups, so groups() returns an empty tuple.
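
A small sketch of the no-group case: groups() is empty, but the overall match is still available through group(0). The groups attribute on the compiled pattern is used here only to count the capturing groups.

```python
import re

m = re.match(r"[abc]+", "abc")

no_groups = m.groups()     # no parentheses in the pattern, so nothing to report
whole_match = m.group(0)   # group 0 is always the entire match
group_count = m.re.groups  # number of capturing groups in the pattern

print(no_groups)    # ()
print(whole_match)  # abc
print(group_count)  # 0
```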