chapter3 chapter3 - 1 year ago 62
Python Question

Regular Expression for Financial Numbers

I think my chances are slim based on the responses to other questions related to regular expressions.

I am try to parse numbers in different representations:


from which I cannot control the source format.

I suppose I can come up with different regular expression for different formats. How to detect which format is which? Has it to be brute-force way to look for the letter 'K'?

z0r z0r

This kind of thing is often done by iterating over a bunch of regular expressions and stopping when you find one that matches - because your conversion from a string to a number needs special parsing beyond the capabilities of regular expressions. That means you need to order them in a way that you know will give the right answer. In this case, you might do something like this:

    (re.compile(r'([0-9]+)\(([-+0-9.]+)[mM]\)'), 1000000),
    (re.compile(r'([0-9]+)\(([-+0-9.]+)[kK]\)'), 1000),
    (re.compile(r'([0-9]+)\(([-+0-9.]+)\)'), 1),

def parse(num):
    for exp, multiplier in PARSERS:
        match = exp.match(num)
        if match is not None:
            return float(, float( * multiplier
    raise ValueError("Failed to parse")

As an aside, this pattern is common in other places too, such as deciding which function will handle a web request based on the URL.