Dylan Siegler - 1 month ago
Python Question

How to Determine the Least Important Word to the Meaning of a Sentence

Is there an algorithm, or any approach you can think of, to determine the least important word to the meaning of a sentence? More generally, is there a way to assign a number to each word based on its importance in a sentence? By "importance" I mean the effect of removing the word: if removing it has little effect on the meaning, the word has low importance; if removing it has a large effect, the word has high importance.


This is a very vague question. From what I understand, you want to do something like keyword extraction.

POS tagging is a good start. It lets you tag each word in a sentence with its part of speech (noun, verb, adjective, etc.); see the POS tagger in NLTK. You can then write your own rules to extract just the parts of speech that interest you.
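As a rough sketch of the "write your own rules" idea, the snippet below assigns each word a number based on its POS tag and picks the lowest-scoring word. The weights in `POS_WEIGHTS`, the `DEFAULT_WEIGHT`, and the function names are assumptions made up for illustration, not a standard scheme. The input is hand-tagged with Penn Treebank tags so the sketch runs without NLTK; `tag_with_nltk` shows how tags could be produced with `nltk.pos_tag` if NLTK and its tokenizer and tagger data are installed.

```python
# Hypothetical importance weights keyed by Penn Treebank tag prefix.
# These numbers are an illustrative assumption, not a standard.
POS_WEIGHTS = {"NN": 3, "VB": 3, "JJ": 2, "RB": 2}
DEFAULT_WEIGHT = 1  # determiners, prepositions, conjunctions, etc.


def score_tagged(tagged):
    """Return (word, weight) pairs for a list of (word, tag) pairs."""
    scored = []
    for word, tag in tagged:
        weight = DEFAULT_WEIGHT
        for prefix, value in POS_WEIGHTS.items():
            if tag.startswith(prefix):
                weight = value
                break
        scored.append((word, weight))
    return scored


def least_important(tagged):
    """Return the first word that has the lowest weight."""
    return min(score_tagged(tagged), key=lambda pair: pair[1])[0]


def tag_with_nltk(sentence):
    """Tag a raw sentence with NLTK (needs nltk plus its tokenizer/tagger data)."""
    import nltk
    return nltk.pos_tag(nltk.word_tokenize(sentence))


# Hand-tagged input keeps the sketch runnable without NLTK installed.
tagged = [("The", "DT"), ("cat", "NN"), ("sat", "VBD"), ("quietly", "RB")]
print(score_tagged(tagged))
print(least_important(tagged))
```

Ties are broken by position, so this is only a starting point; tuning the weights for your own sentences is where the real work would be.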

Stopword removal is another option: very common function words such as "the" and "of" usually contribute little to the meaning.
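A minimal sketch of stopword removal, assuming a tiny hand-picked stopword set (the `STOPWORDS` set and the `split_by_stopword` helper are made up for this example). In practice you would probably use a fuller list, such as the one NLTK ships in `nltk.corpus.stopwords`. The words that land in the stopword bucket are candidates for "least important".

```python
# A tiny hand-picked stopword set, assumed for illustration only.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "it"}


def split_by_stopword(sentence):
    """Split a sentence into (content_words, stopwords), preserving order."""
    content, stops = [], []
    for word in sentence.split():
        # Compare case-insensitively and ignore surrounding punctuation.
        bare = word.strip(".,!?;:").lower()
        if bare in STOPWORDS:
            stops.append(word)
        else:
            content.append(word)
    return content, stops


content, stops = split_by_stopword("The cat is on the mat.")
print(content)
print(stops)
```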

Keyword extraction covers several techniques that you can read about, with examples:

  1. chunking

  2. chinking

  3. named entity recognition

  4. building CFGs and parse trees

  5. relation extraction
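To give a feel for the first item, here is a rough pure-Python sketch of noun-phrase chunking over hand-tagged tokens. It groups an optional determiner, any adjectives, and one or more nouns into a chunk. The function name and the pattern are assumptions for illustration; NLTK does the same kind of thing with `nltk.RegexpParser` and a grammar such as `NP: {<DT>?<JJ>*<NN.*>+}`. Words that fall outside every chunk could be treated as less central to the sentence, though that is a heuristic, not a rule.

```python
def chunk_noun_phrases(tagged):
    """Group tokens matching DT? JJ* NN+ into noun-phrase chunks."""
    chunks = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        # Optional determiner.
        if tagged[j][1] == "DT":
            j += 1
        # Zero or more adjectives (JJ, JJR, JJS).
        while j < n and tagged[j][1].startswith("JJ"):
            j += 1
        # One or more nouns (NN, NNS, NNP, NNPS).
        first_noun = j
        while j < n and tagged[j][1].startswith("NN"):
            j += 1
        if j > first_noun:
            # At least one noun matched: emit the chunk and skip past it.
            chunks.append(" ".join(word for word, _ in tagged[i:j]))
            i = j
        else:
            # No noun phrase starts at this token.
            i += 1
    return chunks


tagged = [
    ("The", "DT"), ("little", "JJ"), ("dog", "NN"), ("barked", "VBD"),
    ("at", "IN"), ("the", "DT"), ("cat", "NN"),
]
print(chunk_noun_phrases(tagged))
```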

I think reading this chapter will give you the perspective and the code snippets to get you started.