LetsPlayYahtzee - 5 months ago
Python Question

Why is TextBlob not using / detecting the negation?

I am using TextBlob to perform a sentiment analysis task. I have noticed that TextBlob is able to detect the negation in some cases while in other cases not.

Here are two simple examples:

>>> from textblob.sentiments import PatternAnalyzer

>>> sentiment_analyzer = PatternAnalyzer()
# example 1
>>> sentiment_analyzer.analyze('This is good')
Sentiment(polarity=0.7, subjectivity=0.6000000000000001)

>>> sentiment_analyzer.analyze('This is not good')
Sentiment(polarity=-0.35, subjectivity=0.6000000000000001)

# example 2
>>> sentiment_analyzer.analyze('I am the best')
Sentiment(polarity=1.0, subjectivity=0.3)

>>> sentiment_analyzer.analyze('I am not the best')
Sentiment(polarity=1.0, subjectivity=0.3)


As you can see in the second example, when using the adjective best the polarity does not change. I suspect this has to do with the fact that the adjective best is a very strong indicator, but that doesn't seem right, because the negation should have reversed the polarity (in my understanding).

Can anyone explain what's going on? Is TextBlob using some negation mechanism at all, or is it just that the word not adds negative sentiment to the sentence? In either case, why does the second example have exactly the same sentiment in both cases? Is there any suggestion about how to overcome such obstacles?

Answer

(edit: my old answer was more about general classifiers and not about PatternAnalyzer)

Your code uses TextBlob's "PatternAnalyzer". Its behaviour is briefly described in this document: http://www.clips.ua.ac.be/pages/pattern-en#parser

We can see that:

The pattern.en module bundles a lexicon of adjectives (e.g., good, bad, amazing, irritating, ...) that occur frequently in product reviews, annotated with scores for sentiment polarity (positive ↔ negative) and subjectivity (objective ↔ subjective).

The sentiment() function returns a (polarity, subjectivity)-tuple for the given sentence, based on the adjectives it contains,

Here's an example that shows the behaviour of the algorithm. The polarity directly depends on the adjective used.

sentiment_analyzer.analyze('player')
Sentiment(polarity=0.0, subjectivity=0.0)

sentiment_analyzer.analyze('bad player')
Sentiment(polarity=-0.6999998, subjectivity=0.66666)

sentiment_analyzer.analyze('worst player')
Sentiment(polarity=-1.0, subjectivity=1.0)

sentiment_analyzer.analyze('best player')
Sentiment(polarity=1.0, subjectivity=0.3)

Professional software generally uses complex tools based on neural networks and classifiers combined with lexical analysis. As far as I can tell, TextBlob just derives its result directly from the grammar analysis (here, the polarity of the adjectives). That is the source of the problem.

It does not check whether the sentence as a whole is negated (by the word "not"). It checks whether the adjective itself is negated, because it works only on adjectives and not on the general structure. Here, best is used as a noun and is not a negated adjective, so the polarity is positive.

sentiment_analyzer.analyze('not the best')
Sentiment(polarity=1.0, subjectivity=0.3)

Just change the order of the words so that the negation applies to the adjective and not to the whole sentence.

sentiment_analyzer.analyze('the not best')
Sentiment(polarity=-0.5, subjectivity=0.3)

Here, the adjective is negated, so the polarity is negative. That is my explanation of this "strange behaviour".
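The reordering trick can be automated before the text reaches the analyzer. This is only a rough sketch: move_negation and the word lists in its regular expression are my own illustrative choices, not part of TextBlob, and the rule is deliberately naive (it only swaps a negation with the determiner that directly follows it).

```python
import re

# Hypothetical helper, not part of TextBlob: swaps "not the X" into "the not X"
# so that the negation sits directly in front of the adjective.
# The word lists below are assumptions chosen for illustration.
_NEGATION_BEFORE_DETERMINER = re.compile(
    r"\b(not|never)\s+(the|an|this|that)\b", re.IGNORECASE
)


def move_negation(text):
    """Return text with each 'negation determiner' pair swapped."""
    return _NEGATION_BEFORE_DETERMINER.sub(r"\2 \1", text)


print(move_negation("I am not the best"))  # I am the not best
print(move_negation("This is not good"))   # unchanged: no determiner after "not"
```

The rewritten string can then be passed to sentiment_analyzer.analyze() instead of the original one. The rewritten sentence is not grammatical English, so this only makes sense as a preprocessing step for the analyzer.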


The real implementation is defined in file: https://github.com/sloria/TextBlob/blob/dev/textblob/_text.py

The interesting portion is:

if w in self and pos in self[w]:
    p, s, i = self[w][pos]
    # Known word not preceded by a modifier ("good").
    if m is None:
        a.append(dict(w=[w], p=p, s=s, i=i, n=1, x=self.labeler.get(w)))
    # Known word preceded by a modifier ("really good").

    ...


else:
    # Unknown word may be a negation ("not good").
    if negation and w in self.negations:
        n = w
    # Unknown word. Retain negation across small words ("not a good").
    elif n and len(w.strip("'")) > 1:
        n = None
    # Unknown word may be a negation preceded by a modifier ("really not good").
    if n is not None and m is not None and (pos in self.modifiers or self.modifier(m[0])):
        a[-1]["w"].append(n)
        a[-1]["n"] = -1
        n = None
    # Unknown word. Retain modifier across small words ("really is a good").
    elif m and len(w) > 2:
        m = None
    # Exclamation marks boost previous word.
    if w == "!" and len(a) > 0:

    ...

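With "not a good" or "not the good", the words between the negation and the adjective ("a", "the") are not in the lexicon, so they go through the else part.

For "not a good", the word "a" is only one character long, so the test elif n and len(w.strip("'")) > 1: fails and the negation is kept: the polarity of "good" is reversed. For "not the good", the word "the" passes that test, n is reset to None, and the negation is lost before the adjective is reached. That is why "not the best" has the same polarity as "best".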
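To make the rule easier to see, here is a stripped-down sketch of the negation tracking. It is not the library's code: the lexicon values, the negation list and the -0.5 factor are assumptions chosen so that the toy reproduces the numbers shown in this post (0.7 becomes -0.35, 1.0 becomes -0.5), and modifiers, part-of-speech tags and subjectivity are left out.

```python
# Toy data: illustrative values only, not the library's lexicon.
LEXICON = {"good": 0.7, "best": 1.0}
NEGATIONS = {"not", "never", "no"}


def toy_polarity(text):
    """Average polarity of known words, with a pending-negation flag."""
    scores = []
    negation = None  # last negation word that is still "active"
    for word in text.lower().split():
        if word in LEXICON:
            score = LEXICON[word]
            if negation is not None:
                score *= -0.5  # assumed factor, matches the outputs above
            scores.append(score)
            negation = None
        elif word in NEGATIONS:
            negation = word
        elif negation is not None and len(word.strip("'")) > 1:
            # An unknown word longer than one character cancels the negation,
            # so "the" cancels it while "a" does not.
            negation = None
    return sum(scores) / len(scores) if scores else 0.0


for sentence in ("This is not good", "not a good", "I am not the best", "the not best"):
    print(sentence, "->", toy_polarity(sentence))
```

In this toy version, "not a good" stays negated because "a" is a single character, while "not the best" loses the negation as soon as "the" is read.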

The entire code is a succession of fine tweaks and grammar heuristics (for example, adding ! increases polarity, adding a smiley indicates irony, ...). That is why some particular patterns give strange results. To handle a specific case, you must check which of the if statements in that part of the code your sentence matches.
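Instead of reading the branches by hand for every sentence, a quick check is to score the sentence with and without its negation words and see whether anything changes. The sketch below is a hypothetical helper (negation_ignored and its word list are my own names, not TextBlob API); it takes the scoring function as a parameter so it works with any analyzer.

```python
# Assumed list of negation words, for illustration only.
NEGATION_WORDS = ("not", "never", "no")


def negation_ignored(text, polarity_fn, negations=NEGATION_WORDS):
    """Return True if removing the negation words leaves the polarity unchanged.

    polarity_fn is any callable that maps a string to a polarity number.
    """
    words = text.split()
    kept = [w for w in words if w.lower() not in negations]
    if len(kept) == len(words):
        return False  # nothing to test: the text contains no negation word
    return polarity_fn(text) == polarity_fn(" ".join(kept))
```

With the analyzer from the question, polarity_fn could be lambda s: sentiment_analyzer.analyze(s).polarity. Sentences for which the helper returns True are the ones where the negation had no effect on the score and may need special handling.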

I hope this helps.
