Noi Sek - 4 months ago
Brainfuck Question

Tokenizing a string into a list of nested arrays with Python

Following this document I'm writing an interpreter for Brainfuck, which in my implementation entails turning a string such as:

',>,<[>[->+>+<<]>>[-<<+>>]<<<-]>>.'


into a list of instructions like this:

[',', '>', ',', '<', ['>', ['-', '>', '+', '>', '+', '<', '<'], '>', '>', ['-', '<', '<', '+', '>', '>'], '<', '<', '<', '-'], '>', '>', '.']


or, minus the symbols:

[ ... [...] ... [...] ... ]


Right now I am solving this recursively using a deque and popleft() to iterate through the string one symbol at a time, but I feel like I should be breaking it into sub-arrays all at once.

How would you solve this problem in a Pythonic way?

(Ruling out Regex for speed reasons)

Answer

This isn't exactly a "Pythonic way", but I found a solution to the problem using recursion and generators:

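from collections import deque

s = ',>,<[>[->+>+<<]>>[-<<+>>]<<<-]>>.'

def brainfuck2list(brainfuck):
    # brainfuck is a deque of symbols; popleft() is O(1), unlike list.pop(0)
    while brainfuck:                  # stop once every symbol has been consumed
        e = brainfuck.popleft()
        if e == "[":
            # start of a loop: collect its body into a nested list
            yield list(brainfuck2list(brainfuck))
        elif e == "]":
            break                     # end of the current loop body
        else:
            yield e

list(brainfuck2list(deque(s)))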

This gives the following output:

[
  ',', '>', ',', '<',
  [
    '>',
    ['-', '>', '+', '>', '+', '<', '<'],
    '>', '>',
    ['-', '<', '<', '+', '>', '>'],
    '<', '<', '<', '-'
  ],
  '>', '>', '.'
]
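
If recursion depth or unbalanced brackets are a concern, the same nesting can also be built iteratively. The following is only a minimal sketch and not part of the original answer: the function name tokenize and the choice to raise ValueError on mismatched brackets are assumptions made for illustration. It keeps an explicit stack of the lists that are currently open, so each symbol is visited exactly once.

```python
def tokenize(source):
    """Turn a Brainfuck string into a nested list, one sub-list per [...] loop."""
    root = []
    stack = [root]                   # stack[-1] is the list currently being filled
    for pos, symbol in enumerate(source):
        if symbol == "[":
            loop = []
            stack[-1].append(loop)   # nest the new loop inside the current list
            stack.append(loop)       # and make it the current list
        elif symbol == "]":
            if len(stack) == 1:      # nothing is open, so this "]" has no partner
                raise ValueError(f"unmatched ']' at position {pos}")
            stack.pop()              # close the innermost open loop
        else:
            stack[-1].append(symbol)
    if len(stack) > 1:               # at least one "[" was never closed
        raise ValueError("unmatched '[': a loop was never closed")
    return root


print(tokenize(',>,<[>[->+>+<<]>>[-<<+>>]<<<-]>>.'))
```

Unlike the generator version, this one rejects input such as ']' or '[' on its own instead of silently accepting it, which may or may not be what an interpreter wants.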