Solid Coder Solid Coder - 5 months ago 6
Python Question

How to parse function calls in loop?

i am writing a simple language to describe sequences of function calls.

I am using a python, but only algorithm answers also be accepted.

For example i have a code:

for 2:
{
a
for 3:
{
b
c
}
}


How can i proceed it to a such sequence? ( for n: {block} where n is times block appeared )

a
b
c
b
c
b
c
a
b
c
b
c
b
c

I know that there is exists lexers and tokens, but how can i do it much simpler? Because language hasn't any more constructions and needed only for describing such sequences. Tokens will be very diffucult for me now ( but if you post a code i will be very happy :) )

Thanks

Answer

Note that I do not have any experience with parsing and very limited experience with regex. This was more a challenge for myself more than a solution but regardless it may be useful to you.

Your syntax isn't that far off from a python generator, a valid python generator that produces the values you want would look like this:

def temp():
    for _ in range(2):
      yield 'a'
      for _ in range(3):
        yield 'b'
        yield 'c'

So there are only two substitutions you would have to make, the for n into for _ in range(n):

def sub_for(match):
    return "_ in range({})".format(match.group(0))

def my_code_to_generator(code):
    # match a number that is preceded by "for " and right before a ":"
    code = re.sub("(?<=for )\d+(?=:)",sub_for,code)    
    ...

And changing arbitrary letters a into yield statements yield 'a':

def sub_letter(match):
    return "yield {!r}".format(match.group(0))

def my_code_to_generator(code):
    code = re.sub("(?<=for )\d+(?=:)",sub_for,code)
    #match a single character that has whitespace around it.
    code = re.sub("(?<=\s)[A-Za-z](?=\s)", sub_letter, code)
    ....

Then putting it in a def statement and executing it as python code would produce an iterator that generates the characters you desire:

import re

def sub_for(match):
    return "_ in range({})".format(match.group(0))

def sub_letter(match):
    return "yield {!r}".format(match.group(0))

def my_code_to_generator(code):
    code = re.sub("(?<=for )\d+(?=:)",sub_for,code)
    code = re.sub("(?<=\s)[A-Za-z](?=\s)", sub_letter, code)
    code = "def temp():\n    " + code.replace("\n","\n    ")
    namespace  = {}
    exec(code,namespace)
    return namespace["temp"]()

text = """
for 2:
{
  a
  for 3:
  {
    b
    c
  }
}""".replace("{","").replace("}","") #no curly braces in python!

>>> list(my_code_to_generator(text))
['a', 'b', 'c', 'b', 'c', 'b', 'c', 'a', 'b', 'c', 'b', 'c', 'b', 'c']
>>> "".join(my_code_to_generator(text))
'abcbcbcabcbcbc'

Yes I realize that this is a very impractical and clunky solution, I don't expect this to be a final answer but until someone posts a better one it may let you get some results. :)