user6750923 - 3 months ago
Python Question

Can I find subject from Spacy Dependency tree using NLTK in python?

I want to find the subject of a sentence using spaCy. The code below works fine and produces a dependency tree.

import spacy
from nltk import Tree

en_nlp = spacy.load('en')

doc = en_nlp("The quick brown fox jumps over the lazy dog.")

def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_


for sent in doc.sents:
    to_nltk_tree(sent.root).pretty_print()



Given this dependency tree code, can I find the subject of the sentence?

Answer

I'm not sure whether you want to write code that uses the NLTK parse tree (see How to identify the subject of a sentence?). However, spaCy already gives you this information: the subject is the word whose word.dep_ property has the label 'nsubj'.

>>> import spacy
>>> en_nlp = spacy.load('en')
>>> doc = en_nlp("The quick brown fox jumps over the lazy dog.")
>>> sentence = next(doc.sents)
>>> for word in sentence:
...     print("%s:%s" % (word, word.dep_))
...
The:det
quick:amod
brown:amod
fox:nsubj
jumps:ROOT
over:prep
the:det
lazy:amod
dog:pobj
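
To turn the printed labels into an actual lookup, one option is a small filter on dep_. The following is a minimal sketch, not a definitive implementation: FakeToken is a hypothetical stand-in that models only the text and dep_ attributes of a spaCy token so the example runs without a model installed, and the helper name find_subjects and the inclusion of 'nsubjpass' (passive subjects) are my own assumptions.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy token; only `text` and `dep_` are modeled.
FakeToken = namedtuple("FakeToken", ["text", "dep_"])

# Assumption: treat both active and passive nominal subjects as "subjects".
SUBJECT_LABELS = {"nsubj", "nsubjpass"}


def find_subjects(tokens, labels=SUBJECT_LABELS):
    """Return the tokens whose dependency label marks them as a subject."""
    return [tok for tok in tokens if tok.dep_ in labels]


# Labels copied from the output printed above.
sentence = [FakeToken(text, dep) for text, dep in [
    ("The", "det"), ("quick", "amod"), ("brown", "amod"), ("fox", "nsubj"),
    ("jumps", "ROOT"), ("over", "prep"), ("the", "det"), ("lazy", "amod"),
    ("dog", "pobj"),
]]

subjects = [tok.text for tok in find_subjects(sentence)]
print(subjects)
```

Because the function only reads dep_, it should work unchanged on a real spaCy sentence, for example find_subjects(next(doc.sents)).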

Keep in mind that there can be more complicated sentences that contain more than one subject.

>>> doc2 = en_nlp(u'When we study hard, we usually do well.')
>>> sentence2 = next(doc2.sents)
>>> for word in sentence2:
...     print("%s:%s" % (word, word.dep_))
... 
When:advmod
we:nsubj
study:advcl
hard:advmod
,:punct
we:nsubj
usually:advmod
do:ROOT
well:advmod
.:punct
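
When a sentence has several nsubj tokens, as in the output above, each subject can be paired with the word it attaches to through the token's head attribute, and the subject of the main clause is the one whose head is the ROOT. The sketch below illustrates this under stated assumptions: FakeToken and build_sentence are hypothetical stand-ins for spaCy tokens, and the head indices are my guess at the parse, since the output above shows only the labels.

```python
class FakeToken:
    """Hypothetical stand-in for a spaCy token (text, dep_, head only)."""

    def __init__(self, text, dep_):
        self.text = text
        self.dep_ = dep_
        self.head = self  # in spaCy, the root token is its own head


def build_sentence(rows):
    """Build linked tokens from (text, dep, head_index) rows."""
    tokens = [FakeToken(text, dep) for text, dep, _ in rows]
    for tok, (_, _, head_index) in zip(tokens, rows):
        tok.head = tokens[head_index]
    return tokens


def subjects_with_heads(tokens):
    """Pair every nsubj token with its head word and the head's label."""
    return [(tok.text, tok.head.text, tok.head.dep_)
            for tok in tokens if tok.dep_ == "nsubj"]


def main_clause_subjects(tokens):
    """Keep only the subjects attached directly to the ROOT."""
    return [tok.text for tok in tokens
            if tok.dep_ == "nsubj" and tok.head.dep_ == "ROOT"]


# Labels from the output above; head indices are assumed, not parser output.
rows = [
    ("When", "advmod", 2), ("we", "nsubj", 2), ("study", "advcl", 7),
    ("hard", "advmod", 2), (",", "punct", 7), ("we", "nsubj", 7),
    ("usually", "advmod", 7), ("do", "ROOT", 7), ("well", "advmod", 7),
    (".", "punct", 7),
]
tokens = build_sentence(rows)
pairs = subjects_with_heads(tokens)
main_subjects = main_clause_subjects(tokens)
print(pairs)
print(main_subjects)
```

With real spaCy tokens the same two functions should apply directly, because dep_ and head are attributes of every token in the parsed sentence.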