user1179317 - 1 year ago 44
Python Question

# Efficient Algorithm to compose valid expressions with specific target

The problem is stated as: Given a string that contains only digits 0-9 and a target value, return all expressions that are created by adding some binary operators (+, -, or *) between the digits so they evaluate to the target value. In some cases there may not be any binary operators that will create valid expressions, in which case the function should return an empty array. The numbers in the new expressions should not contain leading zeros.

The function should return all valid expressions that evaluate to target, sorted lexicographically.

For example:

digits =

"123"
and target =
6
, should return:
["1*2*3", "1+2+3"]

My current algorithm is below. Its a bit slow so I am looking for a more efficient way to approach the problem. My current algo produces all combinations of the operands and operators. For the above example, it produces

Operands:

[['1', '2', '3'], ['1', '23'], ['12', '3'], ['123']]

Operators:

{0: [()], 1: [('+',), ('-',), ('*',)], 2: [('+', '+'), ('+', '-'), ('+', '*'), ('-', '+'), ('-', '-'), ('-', '*'), ('*', '+'), ('*', '-'), ('*', '*')]}

It then combines all possible combinations of operands and operators and evaluate each.

The digits has a constraint of
2 ≤ digits.length ≤ 10.
So its not that bad, but with this algo it takes around 4.3 seconds for a digit with length of 10, where it should just take 4 secs (maximum).

I also tried speeding up the eval() function using with the following alternatives:

if eval(temp) == target:

or

exp_as_func = eval('lambda: ' + temp)
if exp_as_func() == target:

or

compiled = compile(temp, '<string>', 'eval')
if compiled == target:

All of them still takes about the same amount of time using Python 3.

Code:

import itertools
import time

def getValidExp(digits, target):
def getSign_combination(length):
signCombo = {}
for i in range(0, length):
signCombo[i] = [c for c in itertools.product(('+', '-', '*'), repeat=i)]
return signCombo

def generate_combination(source, comb):
res = []
for x, action in zip(source, comb + (0,)):
res.append(x)
if action == 0:
#####IF ITS A 0, YIELD STRING. IF NOT COMBINE NEXT ONE
yield "".join(res)
res = []

#####PRODUCT GENERATES (0,0,1). ALL COMBINATIONS. 0 MEANS BY ITSELF, 1 APPEND NEXT ITEM.

elementCombo = [list(generate_combination(digits, c)) for c in itertools.product((0, 1), repeat=len(digits) - 1)]

signCombo = getSign_combination(len(digits))

result = []
for e in elementCombo:
signs = signCombo[len(e)-1]

temp = [ item for tple in zip(e, sign) for item in tple ]
temp.append(e[-1])
temp = "".join(temp)

try:
if eval(temp) == target:
result.append(temp)
except:
pass

return sorted(result)

digits = "3456237490"
target = 9180

Code using a calculator function (no eval()), almost has the same speed:

from itertools import combinations, permutations
import itertools
import time

def getValidExp(digits, target):

def calculate(s):
operands, operators = [], []
operand = ""
for i in reversed(range(len(s))):
if s[i].isdigit():
operand += s[i]
if i == 0 or not s[i - 1].isdigit():
operands.append(int(operand[::-1]))
operand = ""
elif s[i] == '*':
operators.append(s[i])
elif s[i] == '+' or s[i] == '-':
while operators and operators[-1] == '*':
compute(operands, operators)
operators.append(s[i])

while operators:
compute(operands, operators)

return operands[-1]

def compute(operands, operators):
left, right = operands.pop(), operands.pop()
op = operators.pop()
if op == '+':
operands.append(left + right)
elif op == '-':
operands.append(left - right)
elif op == '*':
operands.append(left * right)

def getSign_combination(length):
signCombo = {}
for i in range(0, length):
signCombo[i] = [c for c in itertools.product(('+', '-', '*'), repeat=i)]
return signCombo

def generate_combination(source, comb):
res = []
for x, action in zip(source, comb + (0,)):
res.append(x)
if action == 0:
yield "".join(res)
res = []

start = time.clock()

#####PRODUCT GENERATES (0,0,1). ALL COMBINATIONS. 0 MEANS BY ITSELF, 1 APPEND NEXT ITEM.
elementCombo = [list(generate_combination(digits, c)) for c in itertools.product((0, 1), repeat=len(digits) - 1)]

signCombo = getSign_combination(len(digits))

result = []
for e in elementCombo:
signs = signCombo[len(e)-1]
temp = ""
valid = True

for num in e:
if num[0] == '0' and len(num) > 1:
valid = False
break

if valid:
for num,operator in zip(e,sign):
temp += num
temp += operator

temp += e[-1]

####USING CALCULATOR CODE
if calculate(temp) == target:
result.append(temp)

print(time.clock() - start)
return sorted(result)

digits = "3456237490"
target = 9180

With this kind of programming challenge, I start by trying to answer these three questions:

• How should the expressions be represented?
• Can we reduce number of possible expressions?
• Can we do less work for each expression?

## Representing expressions

Problems that look like small programming languages tend to make me think Lisp. The problem is asking us to generate the series:

123
(* 12 3)
(+ 12 3)
...
(- (- 1 2) 3)

So, a binary expression in basically a 3-tuple of (operator, left, right) where left and right can also be expressions. The order of the components doesn't actually matter. Python has tuples, and in the operator module it has functions for the various binary ops. So, I'd plan to build expressions in the following form:

(operator.sub, (operator.sub, 1, 2), 3)

Which can then be evaluated with a (mostly) simple recursive function:

def compute(expr):
if isinstance(expr, tuple):
return expr[0](*map(compute, expr[1:]))
return expr

## Reducing possibilities

From the problem description, it seems there will be some kind of exponential number of possible expressions per digit given. Can we eliminate some of these part way through creating all the permutations?

So, for example, we have a six digit input and the target result 5. During the process of creating the permutations, imagine the following expression has been created from the first four digits, and there are two left to be handled:

(* 42 81) "??"

3696 is a big number, are any of the expressions from this point even capable of getting a result of just 5? Can we skip creating them altogether?

Unfortunately, digits near the end can still make major changes:

(+ (* (* 42 81) 0) 5)

There is likely to be some branches we could avoid, but we're going to have to consider most expressions.

## Doing less work

Okay, given we'll have to actually get the result of a very large number of expressions, is there some other way to save effort?

Lets imagine we're part way through generating a sequence, with these three final expressions generated one after the other:

...
(* (- 8 (* 3 6)) 1)
(+ (- 8 (* 3 6)) 1)
(- (- 8 (* 3 6)) 1)
...

They all give different results, [12, 13, 11], but that inner part (- 8 (* 3 6)) is the same, and will always be 12. Our solution should look to take advantage of this.

For anyone in need of spoilers, I've put up branches for an initial implementation that calculates every expression from the top, a minor change that memoises the calculation, and a final one that precomputes results as the expressions are being generated, plus some final tweaks.

• 17.40s elapsed 6180k max mem original from question
• 20.60s elapsed 6284k max mem without eval from question
• 4.65s elapsed 5356k max mem my initial
• 2.71s elapsed 5316k max mem my memoised
• 1.50s elapsed 5356k max mem my precomputed

Some notes on my implementation. The generate() function creates the possible expressions by considering each point in the string and creating the possible next states. For example, at the start, both move the marker along, and split off the first number:

'3|456237490' ->
'34|56237490' -> ...
3 '4|56237490' ->

Each pending state is pushed to a list, and the current one to consider is popped off each time through the loop. Continuing from the state at the end, the next possibilities are moving the marker along again, and splitting a number to make one of the three expressions.

3 '45|6237490' -> ...
(* 3 4) '5|6237490' -> ...
(+ 3 4) '5|6237490' -> ...
(- 3 4) '5|6237490' -> ...

I have glossed over one wrinkle with operator precedence so far. When handling multiplication, we may need to rewrite an existing expression. Consider:

(+ 1 2) '3|' ->
(* (+ 1 2) 3) '' # ???
(+ (+ 1 2) 3) ''
(- (+ 1 2) 3) ''

For addition and subtraction this is fine, order doesn't matter. However, 2 * 3 has to happen before 1 + .... In short, we need to push the multiplication inside:

(+ 1 2) 3 -> (+ 1 (* 2 3))

There are neat ways to handle this by storing a bit more information about your operations apart from just the function to execute them. For this problem that's not really required, nor are other possible transformations like combining multiple expressions or factoring out irrelevant parts.

Final implementation note, just to be difficult I made both the direction of iteration and (initially) the layout of the expressions backwards.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download