I am using TextBlob to perform a sentiment analysis task. I have noticed that TextBlob detects negation in some cases but not in others.
Here are two simple examples:
>>> from textblob.sentiments import PatternAnalyzer
>>> sentiment_analyzer = PatternAnalyzer()
# example 1
>>> sentiment_analyzer.analyze('This is good')
>>> sentiment_analyzer.analyze('This is not good')
# example 2
>>> sentiment_analyzer.analyze('I am the best')
>>> sentiment_analyzer.analyze('I am not the best')
(edit: my old answer was more about general classifiers and not about PatternAnalyzer)
In your code, TextBlob uses the "PatternAnalyzer". Its behaviour is briefly described in this document: http://www.clips.ua.ac.be/pages/pattern-en#parser
We can see that:
The pattern.en module bundles a lexicon of adjectives (e.g., good, bad, amazing, irritating, ...) that occur frequently in product reviews, annotated with scores for sentiment polarity (positive ↔ negative) and subjectivity (objective ↔ subjective).
The sentiment() function returns a (polarity, subjectivity)-tuple for the given sentence, based on the adjectives it contains,
Here's an example that shows the behaviour of the algorithm. The polarity directly depends on the adjective used.
>>> sentiment_analyzer.analyze('player')
Sentiment(polarity=0.0, subjectivity=0.0)
>>> sentiment_analyzer.analyze('bad player')
Sentiment(polarity=-0.6999998, subjectivity=0.66666)
>>> sentiment_analyzer.analyze('worst player')
Sentiment(polarity=-1.0, subjectivity=1.0)
>>> sentiment_analyzer.analyze('best player')
Sentiment(polarity=1.0, subjectivity=0.3)
Professional software generally uses complex tools based on neural networks and classifiers combined with lexical analysis. As far as I can tell, TextBlob just derives its result directly from the grammatical analysis (here, the polarity of the adjectives). That is the source of the problem.
It does not check whether the sentence as a whole is negated by the word "not". It checks whether the adjective itself is negated, because it works only with adjectives and not with the overall structure. Here, "best" is used as a noun and is not treated as a negated adjective, so the polarity is positive.
>>> sentiment_analyzer.analyze('not the best')
Sentiment(polarity=1.0, subjectivity=0.3)
Just change the order of the words so that the negation applies to the adjective rather than to the whole sentence.
>>> sentiment_analyzer.analyze('the not best')
Sentiment(polarity=-0.5, subjectivity=0.3)
Here, the adjective is negated, so the polarity is negative. That is my explanation of this "strange behaviour".
The real implementation is in this file: https://github.com/sloria/TextBlob/blob/dev/textblob/_text.py
The interesting portion is:
if w in self and pos in self[w]:
    p, s, i = self[w][pos]
    # Known word not preceded by a modifier ("good").
    if m is None:
        a.append(dict(w=[w], p=p, s=s, i=i, n=1, x=self.labeler.get(w)))
    # Known word preceded by a modifier ("really good").
    ...
else:
    # Unknown word may be a negation ("not good").
    if negation and w in self.negations:
        n = w
    # Unknown word. Retain negation across small words ("not a good").
    elif n and len(w.strip("'")) > 1:
        n = None
    # Unknown word may be a negation preceded by a modifier ("really not good").
    if n is not None and m is not None and (pos in self.modifiers or self.modifier(m)):
        a[-1]["w"].append(n)
        a[-1]["n"] = -1
        n = None
    # Unknown word. Retain modifier across small words ("really is a good").
    elif m and len(w) > 2:
        m = None
    # Exclamation marks boost previous word.
    if w == "!" and len(a) > 0:
        ...
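If we enter "not a good" or "not the good", the words between the negation and the adjective are not in the lexicon, so they go through the else branch.
The relevant test there is
elif n and len(w.strip("'")) > 1:
which resets the pending negation (n = None) as soon as an unknown word longer than one character follows it. In "not a good", "a" is a single character, so the test fails, the negation is retained, and the polarity of "good" is reversed.
In "not the good", "the" is longer than one character, so the negation is dropped before the adjective is reached and the polarity is the same as without "not". The same thing happens with "not the best", which is why its polarity is the same as that of "best".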
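To make the rule about small words easier to see, here is a minimal toy sketch of that part of the logic. It is not TextBlob's code: `LEXICON`, `NEGATIONS` and `toy_polarity` are made-up names and values for illustration, and the sketch ignores modifiers, subjectivity and part-of-speech tags. It only reproduces the sign of the result, not TextBlob's exact scores.

```python
# Toy model of the negation handling in the excerpt above (NOT TextBlob's real code).
# LEXICON and NEGATIONS hold made-up values, chosen only for illustration.
LEXICON = {"good": 0.7, "best": 1.0, "bad": -0.7}
NEGATIONS = {"not", "no", "never"}


def toy_polarity(sentence):
    """Average the polarity of known words, flipping those preceded by a negation."""
    scores = []
    negation = None  # pending negation word, if any
    for word in sentence.lower().split():
        if word in LEXICON:
            # Known word: flip its polarity if a negation is still pending.
            sign = -1 if negation is not None else 1
            scores.append(sign * LEXICON[word])
            negation = None
        elif word in NEGATIONS:
            # Unknown word that is a negation ("not good").
            negation = word
        elif negation is not None and len(word.strip("'")) > 1:
            # Unknown word longer than one character cancels the pending negation,
            # so "not a good" keeps it but "not the best" loses it.
            negation = None
    return sum(scores) / len(scores) if scores else 0.0


for text in ["not good", "not a good", "not the best", "the not best"]:
    print(text, "->", toy_polarity(text))
```

With this toy lexicon, "not good", "not a good" and "the not best" come out negative, while "not the best" stays positive because "the" cancels the pending negation before "best" is reached.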
The entire code is a succession of fine-tuned tweaks and grammatical cues (for example, adding "!" increases the polarity, adding a smiley indicates irony, ...). That is why some particular patterns give strange results. To handle a specific case, you have to check which of the if statements in that part of the code your sentence will match.
I hope this helps.