Brian Andersen Brian Andersen - 1 month ago 25
SQL Question

Stuck parsing SQL Select statement at the join clause (BNF grammar included)

I am struggling to understand how this SQL Grammar ever can be used to resolve the SQL statement presented below from a parser. I am stuck at the 'Table reference' and 'join' construction┬┤s found after when WHERE token.

BNF: http://www.contrib.andrew.cmu.edu/~shadow/sql/sql2bnf.aug92.txt

<table reference> ::=
<table name> [ <correlation specification> ]
| <derived table> <correlation specification>
| <joined table>

<joined table> ::=
<cross join>
| <qualified join>
| <left paren> <joined table> <right paren>

<cross join> ::=
<table reference> CROSS JOIN <table reference>

<qualified join> ::=
<table reference> [ NATURAL ] [ <join type> ] JOIN <table reference> [ <join specification> ]

<join type> ::=
INNER
| <outer join type> [ OUTER ]
| UNION

<outer join type> ::= LEFT | RIGHT | FULL

<join specification> ::= <join condition> | <named columns join>

<join condition> ::= ON <search condition>

<named columns join> ::= USING <left paren> <join column list> <right paren>


SQL:

SELECT p.Name, v.Name
FROM Production.Product p
JOIN Purchasing.ProductVendor pv
ON p.ProductID = pv.ProductID
JOIN Purchasing.Vendor v
ON pv.BusinessEntityID = v.BusinessEntityID
WHERE ProductSubcategoryID = 15
ORDER BY v.Name;


Jump to the FROM caluse. Here you have one TableName, which is followed by two JOINs.

If you have a look at 'Table reference' then you see that this contains 'Table name'. This I can comprehend.

<table reference> ::=
**<table name> [ <correlation specification> ]**
| <derived table> <correlation specification>
| <joined table>


But to get the join the parser must reach the 'Joined table' which it can't because it all ready read the 'table name'.

To reach the join the parser must reach 'Qualified join', but it can't because there's no repetition in the BNF. If it somehow reached the 'Join table' element then if would be quite disappointed because the next read would be 'Table reference' again and then it would reach 'Qualifed join' again, and.... then you get a stack overflow.

<joined table> ::=
<cross join>
| <**qualified join>**
| <left paren> <joined table> <right paren>

**<qualified join>** ::=
<table reference> [ NATURAL ] [ <join type> ] JOIN <table reference> [ <join specification> ]


What am I not getting here? I'm sure there's a trick to it but I just don't see it.

I hope that some of your brilliant guys can explain to me what going in on with this to me impossibel grammar.

How can one ever construct a let's say a recursive decent parser to solve this Grammar?

What steps and/or rules does the parser need to follow?

Best regards,
Brian Andersen

Answer

That grammar is not LL(1), which is what you need to build a recursive-descent parser. I doubt whether there is an LL(1) grammar for SQL, and particularly if there is one which can generate a correct parse tree. Fortunately, that doesn't matter, since there are more powerful parsing technologies.

It is likely that you could use that grammar to build an LALR(1) parser, using a tool like bison/yacc. Or see the sqlite source code, which includes both an LALR(1) grammar and an LALR(1) parser generator called "lemon".