XYZ XYZ - 4 months ago 12
Python Question

pyparsing removing some text and how to capture text with whitespace

I am new to pyparsing (Python 2.7) and have a couple of questions about this code:

import pyparsing as pp

openBrace = pp.Suppress(pp.Literal("{"))
closeBrace = pp.Suppress(pp.Literal("}"))
ident = pp.Word(pp.alphanums + "_" + ".")
otherStuff = pp.Suppress(pp.Word(pp.alphanums + "_" + "." + "-" + "+"))
comment = pp.Literal("//") + pp.restOfLine
messageName = ident
messageKw = pp.Suppress("msg")
messageExpr = pp.Forward()
messageExpr << (messageKw + messageName + openBrace +
pp.Optional(otherStuff) + pp.ZeroOrMore(messageExpr) +
pp.Optional(otherStuff) + closeBrace).ignore(comment)

print messageExpr.parseString("msg msgName1 { msg msgName2 { some text } }")


I don't really understand why it removes the text "msg" from the inner msgName2. The output is:
['msgName1', 'Name2']
but I expected:
['msgName1', 'msgName2']

In addition, I was wondering how to capture all other text ("some text") including whitespace between the braces.

Thanks in advance

Answer

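To answer your first query: in your grammar, pp.Optional(otherStuff) is tried before pp.ZeroOrMore(messageExpr), so otherStuff consumes the inner "msg" as an ordinary word and suppresses it. Because messageKw is a plain literal rather than a whole-word keyword, it then matches the first three characters of "msgName2", which leaves "Name2" as the message name. Trying the nested messageExpr before otherStuff avoids this: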

>>> import pyparsing as pp
>>> 
>>> openBrace = pp.Suppress(pp.Literal("{"))
>>> closeBrace = pp.Suppress(pp.Literal("}"))
>>> ident = pp.Word(pp.alphanums + "_" + ".")
>>> otherStuff = pp.Suppress(pp.Word(pp.alphanums + "_" + "." + "-" + "+"))
>>> comment = pp.Literal("//") + pp.restOfLine
>>> messageName = ident
>>> messageKw = pp.Suppress("msg")
>>> messageExpr = pp.Forward()
>>> messageExpr << (messageKw + messageName + openBrace +
...                 pp.ZeroOrMore(messageExpr) + pp.ZeroOrMore(otherStuff) +
...             closeBrace).ignore(comment)
Forward: ...
>>> 
>>> print messageExpr.parseString("msg msgName1 { msg msgName2 { some text } }")
['msgName1', 'msgName2']
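
For the second query (capturing the text between the braces, whitespace included), here is a minimal sketch rather than a definitive grammar. It assumes the free text never contains a brace and that any nested message comes before the free text in its block. bodyText is a name made up for this example, and the comment handling from the question is left out for brevity. The sketch also swaps the plain "msg" literal for pp.Keyword, so the keyword can only match as a whole word.

```python
import pyparsing as pp

openBrace = pp.Suppress("{")
closeBrace = pp.Suppress("}")
ident = pp.Word(pp.alphanums + "_.")

# Keyword matches "msg" only as a whole word, so it cannot consume the
# first three letters of an identifier such as "msgName2".
messageKw = pp.Suppress(pp.Keyword("msg"))

# Hypothetical helper (not part of the original grammar): free text between
# braces, from its first non-whitespace character to its last one, with the
# interior whitespace kept as written.
bodyText = pp.Regex(r"[^{}\s](?:[^{}]*[^{}\s])?")

messageExpr = pp.Forward()
# Try a nested message first, and only then fall back to free text.
messageExpr <<= (messageKw + ident + openBrace +
                 pp.ZeroOrMore(messageExpr | bodyText) +
                 closeBrace)

result = messageExpr.parseString(
    "msg msgName1 { msg msgName2 { some text } }").asList()
print(result)
```

Under those assumptions this should print ['msgName1', 'msgName2', 'some text']. Note that bodyText runs up to the next brace, so free text placed in front of a nested message would swallow that message's keyword and name. A stricter grammar would be needed if your input mixes the two that way.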