mengg - 1 year ago 133
Python Question

# Parse arithmetic string with regular expression

I need to parse an arithmetic string that contains only multiplication (`*`) and addition (`+`), e.g. `300+10*51+20+2*21`, using regular expressions.

I have the working code below:

```python
import re

input_str = '300+10*51+20+2*21'

#input_str = '1*2+3*4'

prod_re = re.compile(r"(\d+)\*(\d+)")
sum_re = re.compile(r"(\d+)\+?")

result = 0
index = 0
while index < len(input_str):
    #-----
    # try to match a product "a*b" at the current position
    prod_match = prod_re.match(input_str, index)
    if prod_match:
        # print('find prod', prod_match.groups())
        result += int(prod_match.group(1)) * int(prod_match.group(2))
        # skip the product and the operator that follows it
        index += len(prod_match.group(0)) + 1
        continue
    #-----
    # otherwise try to match a single number with an optional trailing "+"
    sum_match = sum_re.match(input_str, index)
    if sum_match:
        # print('find sum', sum_match.groups())
        result += int(sum_match.group(1))
        index += len(sum_match.group(0))
        continue
    #-----
    if (not prod_match) and (not sum_match):
        print('None match, check input string')
        break

print(result)
```

I am wondering if there is a way to avoid creating the variable `index` above?

The algorithm is not correct for every input. A chain of multiplications such as `2*3*4` yields 10 instead of 24: after resolving one multiplication, the loop skips the next operator and adds the number that follows, whereas a term may contain more multiplications that have to be resolved before any addition.
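To make the failure concrete, here is a minimal sketch that wraps the loop from the question in a function. The name `evaluate_with_index` is made up for illustration, and raising `ValueError` instead of printing a message is my own choice; the scanning logic is otherwise the same.

```python
import re

prod_re = re.compile(r"(\d+)\*(\d+)")
sum_re = re.compile(r"(\d+)\+?")

def evaluate_with_index(expr):
    # Hypothetical wrapper around the question's index-based loop.
    result = 0
    index = 0
    while index < len(expr):
        prod_match = prod_re.match(expr, index)
        if prod_match:
            result += int(prod_match.group(1)) * int(prod_match.group(2))
            # Skips the operator after the product, whatever it is.
            index += len(prod_match.group(0)) + 1
            continue
        sum_match = sum_re.match(expr, index)
        if sum_match:
            result += int(sum_match.group(1))
            index += len(sum_match.group(0))
            continue
        raise ValueError("no match at position %d" % index)
    return result

print(evaluate_with_index("2*3*4"))  # prints 10, although 2*3*4 is 24
```

The `+ 1` after a product assumes the next operator is a `+`, so a second `*` is silently dropped and the following factor is added instead of multiplied.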

With some changes to the regular expressions and the loops, you can get what you want without an `index` variable, as follows:

```python
import re

input_str = '3+1*2+3*4'

# match terms, which may include multiplications;
# each term is followed by '+' or by the end of the string
sum_re = re.compile(r"(\d+(?:\*\d+)*)(?:\+|$)")
# match factors, which can only be numbers
prod_re = re.compile(r"\d+")

result = 0
# find terms
for sum_match in sum_re.findall(input_str):
    # for each term, determine its value by applying the multiplications
    product = 1
    for prod_match in prod_re.findall(sum_match):
        product *= int(prod_match)
    # add the term's value to the result
    result += product

print(result)  # 17
```
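If you want something reusable, the same idea can be wrapped in a function that checks the input before evaluating it. This is only a sketch under a few assumptions: the name `evaluate` is hypothetical, operands are non-negative integers with no whitespace, and `math.prod` requires Python 3.8 or later.

```python
import re
from math import prod

def evaluate(expr):
    """Evaluate non-negative integers joined by '+' and '*' (hypothetical helper)."""
    # Reject anything that is not numbers separated by single operators.
    if not re.fullmatch(r"\d+(?:[+*]\d+)*", expr):
        raise ValueError("unsupported expression: %r" % expr)
    # A term is a run of factors joined by '*'; terms are joined by '+'.
    terms = re.findall(r"\d+(?:\*\d+)*", expr)
    # Multiply the factors of each term, then add the terms.
    return sum(prod(int(factor) for factor in re.findall(r"\d+", term))
               for term in terms)

print(evaluate("300+10*51+20+2*21"))  # 872
print(evaluate("2*3*4"))              # 24
```

Validating with `re.fullmatch` first means malformed input such as `1++2` raises an error instead of being partially evaluated.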