Ericson Willians - 4 months ago 42

Python Question

Considering the following grammar:

`expr : expr '+' term | expr '-' term | term`

term : term '*' factor | term '/' factor | factor

factor : '(' expr ')' | identifier | number

This is my code using ply:

`from ply import lex, yacc`

tokens = [

"identifier",

"number",

"plus",

"minus",

"mult",

"div"

]

t_ignore = r" \t"

t_identifier = r"^[a-zA-Z]+$"

t_number = r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?"

t_plus = r"\+"

t_minus = r"-"

t_mult = r"\*"

t_div = r"/"

def p_stmt(p):

"""stmt : expr"""

p[0] = ("stmt", p[1])

def p_expr(p):

"""expr : expr plus term

| expr minus term

| term"""

p[0] = ("expr", p[1], p[2]) # Problem here <<<

def p_term(p):

"""term : term mult factor

| term div factor

| factor"""

def p_factor(p):

"""factor : '(' expr ')'

| identifier

| number"""

if __name__ == "__main__":

lex.lex()

yacc.yacc()

data = "32 + 10"

result = yacc.parse(data)

print(result)

How am I supposed to build an AST with the expression if I can't access the operators? I could separate the functions like p_expr_plus, but in this case, I would eliminate operator precedence. The docs are not so helpful, since I'm a beginner and can't solve this problem. The best material I've found on the subject is this, but it does not consider the complexity of operator precedence.

EDIT: I can't access p2 or p[3], since I get an IndexError (It's matching the term only). In the PDF I've linked, they explicitly put the operator inside the tuple, like: ('+', p1, p2), and thus, evincing my problem considering precedence (I can't separate the functions, the expression is the expression, there should be a way to consider the pipes and access any operator).

Answer

As far as I can see, in `p[0] = ("expr", p[1], p[2])`

, p1 would be the left hand expression, p[2] would be the operator, and p[3] (that you aren't using) would be the right hand term.

Just use p[2] to determine the operator, add p[3], since you will need it, and you should be good to go.

Also, you must verify how many items `p`

has, since if the last rule, `| term"""`

is matched, p will only have two items instead of four.

Take a look at a snippet from the GardenSnake example:

```
def p_comparison(p):
"""comparison : comparison PLUS comparison
| comparison MINUS comparison
| comparison MULT comparison
| comparison DIV comparison
| comparison LT comparison
| comparison EQ comparison
| comparison GT comparison
| PLUS comparison
| MINUS comparison
| power"""
if len(p) == 4:
p[0] = binary_ops[p[2]]((p[1], p[3]))
elif len(p) == 3:
p[0] = unary_ops[p[1]](p[2])
else:
p[0] = p[1]
```