braymp braymp - 2 months ago 18
Python Question

Single regular expression in Python with named groups for interleaved text

I would like to create a single regular expression in Python that extracts two interleaved portions of text from a filename as named groups. An example filename is given below:


The part of the filename I'd like to extract is contained between the underscores, and consists of the following:

  • An uppercase letter:

  • A zero-padded two-digit number:

  • A period

  • A lowercase letter:

  • A single digit:

For the example above, I would like one group ('Row') to contain
, and the other group ('Column') to contain
. However, I don't know how to do this when the text is interleaved as it is here.

EDIT: A constraint which I omitted: it needs to be a single regex to handle the string. I've updated the text/title to reflect this point.


Regex capturing groups (whether numbered or named) do not actually capture text; each one records the start and end indices of a single span within the original string. A single group therefore cannot capture non-contiguous text. Probably the best thing to do here is to use four separate groups and combine them into your two desired values manually.
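
As a rough sketch of that approach, something like the following might work. The pattern is built from the five parts listed in the question. The group names (`row_upper`, `col_major`, `row_lower`, `col_minor`), the helper `extract_row_column`, and the sample filename are all made up for illustration. The pairing used here (letters into Row, digits into Column) is only an assumption, so swap the concatenations around if your grouping differs.

```python
import re

# Hypothetical pattern: uppercase letter, two-digit number, period,
# lowercase letter, single digit, all between two underscores.
PATTERN = re.compile(
    r"_(?P<row_upper>[A-Z])"   # uppercase letter
    r"(?P<col_major>\d{2})"    # zero-padded two-digit number
    r"\."                      # literal period
    r"(?P<row_lower>[a-z])"    # lowercase letter
    r"(?P<col_minor>\d)_"      # single digit
)


def extract_row_column(filename):
    """Return (row, column) built from four groups, or None if no match."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    # Assumed pairing: the two letters form Row, the digits form Column.
    row = match.group("row_upper") + match.group("row_lower")
    column = match.group("col_major") + match.group("col_minor")
    return row, column


# Made-up filename, not the one from the question.
print(extract_row_column("plate_A01.b2_image.tif"))  # ('Ab', '012')
```

This is still a single regex applied to the string; only the final concatenation happens outside the pattern.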