Misterfrederx - 24 days ago
Python Question

Match specific pattern with regular expression

I have to write a regex that matches exactly this kind of pattern.
Here is an example:

JK+6.00,PP*2,ZZ,GROUPO

with one match for every block, like this:

Match 1

  • JK
  • +
  • 6.00

Match 2

  • PP
  • *
  • 2

Match 3

  • ZZ

Match 4

  • GROUPO

So: comma-separated blocks of
(2 to 12 capital letters) [optionally (+ or *) and a (positive number 0[.0[0]])]

This block successfully parses the pattern:

(?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>\*|\+)(?P<value>\d+(?:.?\d{1,2})?))?)


We have the subject group:

(?P<subject>[A-Z]{2,12})


The value:

(?P<value>\d+(?:.?\d{1,2})?)


And the whole optional operation section (with the value inside):

(?:(?P<operation>\*|\+)(?P<value>\d+(?:.?\d{1,2})?))?


But the regex must fail if the string doesn't match the pattern EXACTLY,
and that's the problem.

I tried this, but it doesn't work:

^(?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>\*|\+)(?P<value>\d+(?:.?\d{1,2})?))?)(?:,(?P=block))*$


Any suggestions?

PS: I use Python's re module.

Answer

I'd personally go for a two-step solution: first check that the whole string fits your pattern, then extract the groups you want.
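
One reason the single-pattern attempt in the question fails: in Python's re, (?P=block) is a backreference, so it matches the exact text that the block group captured, not the block sub-pattern again. A minimal sketch of that behaviour, assuming a simplified block group without the inner named groups (the names BLOCK and backref are only for illustration):

```python
import re

# One block: subject, then an optional operation and value.
BLOCK = r'(?P<block>[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?)'

# (?P=block) re-matches the exact text captured by "block",
# not the sub-pattern that produced it.
backref = re.compile(r'^' + BLOCK + r'(?:,(?P=block))*$')

same_blocks = backref.match('ZZ,ZZ,ZZ') is not None       # identical text repeated
mixed_blocks = backref.match('JK+6.00,PP*2') is not None  # different text per block

print(same_blocks, mixed_blocks)  # True False
```

So the backreference version only accepts a list in which every block is a literal copy of the first one.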

For the overall check you might want to use ^(?:[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?(?:,|$))*$ as a pattern. It basically contains your pattern, plus (?:,|$) to match the delimiters, and the ^ and $ anchors.
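
It may help to see how this overall check behaves on a few edge cases. As far as I can tell, the pattern as written also accepts an empty string and a trailing comma, because every block may be followed by either a comma or the end of the string and the outer group may repeat zero times. The sketch below compares it with a stricter variant (one block, then zero or more comma-plus-block repetitions); the names BLOCK, loose and strict are mine, and the stricter variant is an assumption about what "EXACTLY" should mean here:

```python
import re

BLOCK = r'[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?'

# Pattern from the answer: each block is followed by a comma or the end.
loose = re.compile(r'^(?:' + BLOCK + r'(?:,|$))*$')

# Hypothetical stricter variant: one block, then zero or more ",block".
strict = re.compile(r'^' + BLOCK + r'(?:,' + BLOCK + r')*$')

samples = ['JK+6.00,PP*2,ZZ,GROUPO', '', 'JK,', 'JK+6.000', 'jk']
for s in samples:
    # Columns: input, accepted by loose, accepted by strict.
    print(repr(s), bool(loose.match(s)), bool(strict.match(s)))
```

Both patterns accept the example string and reject 'JK+6.000' and 'jk'; they differ only on the empty string and on 'JK,'.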

I have also adjusted your pattern a bit, to (?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>[*+])(?P<value>\d+(?:\.\d{1,2})?))?). I have replaced \*|\+ with [*+] in your operation pattern and .? with \. in your value pattern, since an unescaped . matches any character.

A (very basic) Python implementation could look like this:
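
To illustrate why the dot in the value pattern should be escaped, here is a small sketch comparing the value pattern from the question with the adjusted one (the variable names are only for illustration):

```python
import re

unescaped = re.compile(r'\d+(?:.?\d{1,2})?')  # value pattern from the question
escaped = re.compile(r'\d+(?:\.\d{1,2})?')    # adjusted value pattern

for s in ['6.00', '6x00', '6,00']:
    # Columns: input, accepted by unescaped, accepted by escaped.
    print(s, bool(unescaped.fullmatch(s)), bool(escaped.fullmatch(s)))
```

The unescaped version accepts all three strings, including '6,00', which matters here because the comma is also the block delimiter. The escaped version accepts only '6.00'.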

import re

text = 'JK+6.00,PP*2,ZZ,GROUPO'
full_pattern = r'^(?:[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?(?:,|$))*$'
extract_pattern = r'(?P<block>(?P<subject>[A-Z]{2,12})(?:(?P<operation>[*+])(?P<value>\d+(?:\.\d{1,2})?))?)'

# Step 1: check that the whole string fits the pattern.
if re.fullmatch(full_pattern, text):
    # Step 2: extract the groups of each block.
    for match in re.finditer(extract_pattern, text):
        print(match.groups())

http://ideone.com/kMl9qu
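
If you want the result as data rather than printed tuples, the two steps can be wrapped in a small helper. This is only a sketch: parse_blocks, PLAIN, NAMED and FULL are hypothetical names, and the full-string check here is a stricter variant (block, then zero or more comma-plus-block) than the pattern above, which is an assumption on my part.

```python
import re

# The block pattern from the answer, once without and once with named groups.
# Named groups cannot be repeated inside one pattern, hence the plain copy.
PLAIN = r'[A-Z]{2,12}(?:[*+]\d+(?:\.\d{1,2})?)?'
NAMED = re.compile(
    r'(?P<subject>[A-Z]{2,12})'
    r'(?:(?P<operation>[*+])(?P<value>\d+(?:\.\d{1,2})?))?'
)
FULL = re.compile(PLAIN + r'(?:,' + PLAIN + r')*')


def parse_blocks(text):
    """Return one dict per block, or None if text is not a valid list."""
    # Step 1: validate the whole string.
    if not FULL.fullmatch(text):
        return None
    # Step 2: extract subject, operation and value of every block.
    return [m.groupdict() for m in NAMED.finditer(text)]


print(parse_blocks('JK+6.00,PP*2,ZZ,GROUPO'))
print(parse_blocks('JK+6.00,,ZZ'))  # None
```

Blocks without an operation get None for both operation and value in their dict.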
