Sudip Das - 10 months ago
Python Question

NLTK FCFG grammar using Python

I am new to NLP. I found this file in nltk_data.
I am trying to write my own grammar. Before I do, I need to know what SEM=(?v + ?pp) means.
Please help me understand it.

% start S

S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]

VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]
NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]
PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]
AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]

NP[SEM='Country="greece"'] -> 'Greece'
NP[SEM='Country="china"'] -> 'China'

Det[SEM='SELECT'] -> 'Which' | 'What'

N[SEM='City FROM city_table'] -> 'cities'

IV[SEM=''] -> 'are'
A[SEM=''] -> 'located'
P[SEM=''] -> 'in'

Answer

In this line:

VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]

Short answer: SEM=(?v + ?pp) concatenates two SEM values (here, strings): the first comes from the SEM feature of the IV node and the second from the SEM feature of the PP node.
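To see the concatenation in action, here is a minimal sketch. It assumes NLTK is installed and passes the grammar from the question to a feature chart parser as a string; the test sentence and the variable names are my own choices. As far as I know, NLTK returns the concatenated SEM as a sequence of pieces rather than one string, so the sketch drops the empty pieces and joins the rest by hand.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# The grammar from the question, unchanged.
GRAMMAR = r"""
% start S

S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]

VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]
NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]
PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]
AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]

NP[SEM='Country="greece"'] -> 'Greece'
NP[SEM='Country="china"'] -> 'China'

Det[SEM='SELECT'] -> 'Which' | 'What'

N[SEM='City FROM city_table'] -> 'cities'

IV[SEM=''] -> 'are'
A[SEM=''] -> 'located'
P[SEM=''] -> 'in'
"""

parser = FeatureChartParser(FeatureGrammar.fromstring(GRAMMAR))
trees = list(parser.parse("What cities are located in China".split()))

# The SEM feature of the root node holds the concatenated pieces.
sem = trees[0].label()["SEM"]

# 'are', 'located' and 'in' contribute empty strings, so filter them out.
query = " ".join(piece for piece in sem if piece)
print(query)
```

The words with SEM='' add nothing to the query; they are only there so that the sentence parses.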

Longer answer: The grammar gives you a parser that builds a tree structure for a given text. On each node of this tree, the parser can manipulate features as it builds the tree up, and this is how you form a semantic representation for the whole tree. Possible representations include first-order logic and lambda expressions; another useful one is an SQL query, which is what you are using here. Different representations need different feature manipulations: with lambda expressions the parser needs function composition and logical operations, whereas for SQL queries it mainly works with string concatenation.


In this syntax, each rule has two sides, a left-hand side and a right-hand side: lhs -> rhs. The parser is expected to find a tree in which every parent node is a non-terminal that appears as the lhs of some rule, and whose children are exactly the symbols on that rule's rhs.
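As a rough illustration of that lhs -> rhs constraint, the sketch below encodes the rules from the question in plain Python with the features stripped off, and checks whether a hand-written tree is licensed by them. The RULES set, the nested-tuple tree encoding and the licensed function are hypothetical helpers of mine, not part of NLTK.

```python
# Each entry is (lhs, rhs); rhs is a tuple of child categories or a single word.
RULES = {
    ("S", ("NP", "VP")),
    ("VP", ("IV", "PP")),
    ("VP", ("IV", "AP")),
    ("NP", ("Det", "N")),
    ("PP", ("P", "NP")),
    ("AP", ("A", "PP")),
    ("NP", ("Greece",)), ("NP", ("China",)),
    ("Det", ("Which",)), ("Det", ("What",)),
    ("N", ("cities",)),
    ("IV", ("are",)), ("A", ("located",)), ("P", ("in",)),
}

def licensed(tree):
    """Return True if every parent node expands according to one of RULES."""
    if isinstance(tree, str):  # a word (leaf) needs no rule of its own
        return True
    label, *children = tree
    # The rhs actually used at this node: child labels, or the word itself.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return (label, rhs) in RULES and all(licensed(c) for c in children)

good = ("S",
        ("NP", ("Det", "What"), ("N", "cities")),
        ("VP", ("IV", "are"),
               ("AP", ("A", "located"),
                      ("PP", ("P", "in"), ("NP", "China")))))

# No rule has S -> NP PP, so this tree is rejected.
bad = ("S", ("NP", "China"), ("PP", ("P", "in"), ("NP", "Greece")))

print(licensed(good))  # True
print(licensed(bad))   # False
```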

Then, for the semantic part, the semantic representation of a parent node is the result of composing the representations of its direct children.
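A small sketch of that bottom-up composition, again in plain Python rather than NLTK: each node computes its SEM from the SEM of its direct children only, following the rules in the question. The LEXICON dictionary, the tuple encoding of the tree and the sem function are assumptions made for this illustration.

```python
# Word -> SEM piece, copied from the lexical rules of the grammar.
LEXICON = {
    "What": "SELECT", "Which": "SELECT",
    "cities": "City FROM city_table",
    "China": 'Country="china"', "Greece": 'Country="greece"',
    "are": "", "located": "", "in": "",
}

def sem(node):
    """Compute a node's SEM (a list of pieces) from its direct children."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [LEXICON[children[0]]]       # lexical rule, e.g. Det -> 'What'
    parts = [sem(child) for child in children]
    if label == "S":                        # S[SEM=(?np + WHERE + ?vp)]
        return parts[0] + ["WHERE"] + parts[1]
    if label == "AP":                       # AP[SEM=?pp]: pass the PP value up
        return parts[1]
    return parts[0] + parts[1]              # remaining rules: (?x + ?y)

tree = ("S",
        ("NP", ("Det", "What"), ("N", "cities")),
        ("VP", ("IV", "are"),
               ("AP", ("A", "located"),
                      ("PP", ("P", "in"), ("NP", "China")))))

pieces = sem(tree)
query = " ".join(p for p in pieces if p)    # drop the empty pieces
print(query)  # SELECT City FROM city_table WHERE Country="china"
```

Note that the AP rule ignores ?a entirely and only forwards ?pp, which is why 'located' never shows up in the result.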

In this syntax, you can use a question-mark variable such as ?var on the right-hand side to enforce agreement between siblings. You can then pass the value to the left-hand side by assigning it to another feature (or manipulate it first, if your parser supports operations such as function application or logical manipulation). The semantic representation of a parent node is therefore essentially the result of manipulating the semantic representations of its children. In FCFG, semantic representations are passed like any other feature from the right-hand side to the left-hand side.
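The binding step can be sketched in a few lines of plain Python. This is only a simplified picture of what a feature-based parser does, and apply_rule is a hypothetical helper, not an NLTK function. The NUM agreement rule in the second example is also made up for illustration, since the grammar in the question has no agreement features.

```python
def apply_rule(lhs_template, rhs_vars, child_values):
    """Bind each ?var on the rhs to a child's value, then fill in the lhs.

    Returns None when one variable would need two different values,
    that is, when the siblings do not agree.
    """
    bindings = {}
    for var, value in zip(rhs_vars, child_values):
        # setdefault keeps the first binding; a later mismatch means failure.
        if bindings.setdefault(var, value) != value:
            return None
    # Variables are replaced by their bindings; constants stay as they are.
    return [bindings.get(item, item) for item in lhs_template]

# VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]
vp = apply_rule(["?v", "?pp"], ["?v", "?pp"], ["", 'Country="china"'])
print(vp)  # ['', 'Country="china"']

# Hypothetical S[NUM=?n] -> NP[NUM=?n] VP[NUM=?n]: the shared ?n forces agreement.
print(apply_rule(["?n"], ["?n", "?n"], ["pl", "pl"]))  # ['pl']
print(apply_rule(["?n"], ["?n", "?n"], ["sg", "pl"]))  # None
```

In the grammar from the question every variable appears only once on the right-hand side, so no agreement is being checked there; the variables just carry SEM values up to the parent.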