FinanceGuyThatCantCode - 1 year ago
Python Question

The most regex-y way to understand commutative operations?

I want to parse both 1.05*f and f*1.05 as equivalent things, where f is a fixed letter, the number is any positive float, and the * always sits between the 'f' and the float (i.e. multiplication). The multiplication is optional, so 'f' as the entire string is fine too. Note that 1.05*f*1.05 should not work, gf*1.05 should not work, and f*1.05f should not work either.

I am using Python. I am actually having a hard time getting f*1.05 to work by itself, because f*1.05f also matches. When I put a dollar sign after the optional multiplication and float, nothing works.

^f(\\*(\\d*[.])?\\d+)? # f*1.05 matches, but unfortunately so does f*1.05f
^f((\\*(\\d*[.])?\\d+)?)$ # the $ makes f*1.05f not match, but f*1.05 doesn't match either!

Really my question is about whether there is a clever way to make 1.05*f, f, and f*1.05 work all in one go without using a '|' operator to choose between the float being on the left or right.
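
A side note on the two attempts above: one possible explanation, assuming they were typed as Python raw strings (the question does not show the surrounding code), is that the doubled backslashes turn \\* into "zero or more literal backslashes" and \\d into "a backslash followed by the letter d". In that case the optional group never matches, the first pattern only matches the leading f (re.match does not require the match to reach the end of the string), and adding $ leaves nothing but a bare f. The sketch below illustrates this; the variable names are made up for the example.

```python
import re

# Assumption: the first pattern from the question, typed as a raw string.
over_escaped = r"^f(\\*(\\d*[.])?\\d+)?"
# The same pattern with single backslashes and a trailing $.
single_escaped = r"^f(\*(\d*[.])?\d+)?$"

# The over-escaped pattern "matches" both strings, but only the leading f.
prefix_a = re.match(over_escaped, "f*1.05").group(0)
prefix_b = re.match(over_escaped, "f*1.05f").group(0)
print(prefix_a, prefix_b)

# With single backslashes, the anchored pattern accepts f*1.05 and rejects f*1.05f.
full = re.match(single_escaped, "f*1.05")
trailing = re.match(single_escaped, "f*1.05f")
print(full.group(0), trailing)
```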

Answer Source

Negative look(ahead|behind)s to the rescue:



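# p is the regex and s the newline-separated test input; both definitions are missing from this copy
for line in s.split("\n"):
    if re.match(p, line):
        print(line)  # assumed body: report each line that matches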

Explanation: The pattern consists of two optional number groups with an f in the middle. The left group starts with a negative lookahead that looks for any sequence of characters followed by an f and an asterisk, so the group is rejected whenever an f further down the string is followed by an asterisk (i.e. when there is also a number on the right). The whole group is optional (?). The f is then followed by the same thing again, but this time a negative lookbehind checks for a *f directly before the current position. If it finds that, the group won't match, which doesn't break the whole regex because the group is again optional.
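
The pattern itself is missing from this copy of the answer. Going only by the description above, something along these lines would fit. It is a reconstruction, not necessarily the pattern originally posted, and it borrows the float sub-pattern from the question. The names NUM, p, s and matched are chosen for the example.

```python
import re

# Float sub-pattern borrowed from the question: optional "digits." then digits.
NUM = r"(?:\d*[.])?\d+"

# Reconstructed from the explanation (an assumption, not the original pattern):
#   left group  = negative lookahead for "...f*", then the number and "*"
#   right group = negative lookbehind for "*f", then "*" and the number
p = r"^(?:(?!.*f\*)" + NUM + r"\*)?f(?:(?<!\*f)\*" + NUM + r")?$"

# Test lines taken from the cases listed in the question.
s = "f\n1.05*f\nf*1.05\n1.05*f*1.05\ngf*1.05\nf*1.05f"

matched = [line for line in s.split("\n") if re.match(p, line)]
print(matched)
```

Only f, 1.05*f and f*1.05 should come through. When a number sits on both sides, the lookahead rejects the left group, and the f then has nothing to match at the start of the string.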

I still don't understand why you would want that; a | is vastly superior.
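
For comparison, here is a minimal sketch of what a version using | could look like. The answer does not spell one out, so the pattern, the reuse of the question's float sub-pattern, and the helper name is_valid are all assumptions for illustration.

```python
import re

# Float sub-pattern borrowed from the question: optional "digits." then digits.
NUM = r"(?:\d*[.])?\d+"

# Either "number*f", or "f" optionally followed by "*number".
ALT = re.compile(NUM + r"\*f|f(?:\*" + NUM + r")?")


def is_valid(text):
    """Return True if text is f, <float>*f or f*<float> (hypothetical helper)."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return ALT.fullmatch(text) is not None


for candidate in ["f", "1.05*f", "f*1.05", "1.05*f*1.05", "gf*1.05", "f*1.05f"]:
    print(candidate, is_valid(candidate))
```

Each branch states one allowed shape directly, which is arguably easier to read and to extend than the lookaround version.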