Tim Hopper - 1 year ago
Python Question

Extracting Prepositional Phrases from Sentence

I'm trying to extract prepositional phrases from sentences using NLTK. Is there a way for me to do this automatically (e.g. feed a function a sentence and get back its prepositional phrases)?

The examples here seem to require that you start with a grammar before you can get a parse tree. Can I automatically get the grammar and use that to get the parse tree?

Obviously I could tag a sentence, pick out prepositions and the subsequent noun, but this is complicated when the prepositional complement is compound.

Answer Source

What you really want is to fully parse your sentence with a robust statistical parser (e.g. Stanford) and then look for constituents marked with PP:

(ROOT
  (S
    (NP (NNP John))
    (VP (VBZ lives)
      (PP (IN in)
        (NP (DT a) (NN house)))
      (PP (IN by)
        (NP (DT the) (NN sea))))))

I am not sure whether NLTK can do this kind of parsing itself, or how accurate it would be if it can, but it's not much of a problem to call an external parser from Python and then process the output. Using a parser will save you a lot of time and effort (since the parser takes care of everything), and is the only reliable way to do this job.
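As a rough sketch of the "process the output" step: assuming the external parser hands back a bracketed tree like the one above as a string, a small reader using only the standard library can collect every PP constituent. The function names here (`parse_brackets`, `leaves`, `prepositional_phrases`) are made up for illustration and are not part of any parser's API; actually invoking the parser is left out.

```python
import re


def parse_brackets(text):
    """Read a bracketed parse string into nested (label, children) tuples.

    Assumes Penn Treebank-style output like the tree shown in the answer.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # skip the closing ")"
        return (label, children)

    return read_node()


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1]:
        words.extend(leaves(child))
    return words


def prepositional_phrases(node):
    """Return the text of every constituent labelled PP, in pre-order."""
    if isinstance(node, str):
        return []
    label, children = node
    found = [" ".join(leaves(node))] if label == "PP" else []
    for child in children:
        found.extend(prepositional_phrases(child))
    return found


# Stand-in for the string an external parser would return.
parser_output = """(ROOT
  (S
    (NP (NNP John))
    (VP (VBZ lives)
      (PP (IN in)
        (NP (DT a) (NN house)))
      (PP (IN by)
        (NP (DT the) (NN sea))))))"""

tree = parse_brackets(parser_output)
print(prepositional_phrases(tree))  # ['in a house', 'by the sea']
```

Because the phrase text is built from all the leaves under the PP node, a compound complement comes along for free. A PP nested inside another PP is reported twice, once as part of the outer phrase and once on its own. If NLTK is installed, its `Tree.fromstring` reads the same bracketed notation and could replace the hand-written reader.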