Negation Detection Processes
tags: negation NLP python sentiment analysis
posted by Zeke Shore on Feb 16th, 2010
One obstacle to accurate sentiment analysis identified in a recent round of research is negation detection. Negation occurs when a negating word (such as ‘not’) inverts the evaluative value of an affective word (for example, “not good” is similar to saying “bad”). In natural language processing, this can be handled by identifying negating words and then inverting the value of any positive or negative word within n words of the negating word, where n is the size of the negation window.
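The windowing idea can be sketched in a few lines of Python. This is a minimal illustration only, not a tested sentiment model: the toy lexicon `LEXICON`, the negator set `NEGATORS`, and the function name `score_with_negation` are all made up for the example, and tokens are assumed to be already lowercased and split on whitespace.

```python
# Toy polarity lexicon and negator list (illustrative assumptions only).
LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}
NEGATORS = {"not", "no", "never"}


def score_with_negation(tokens, n=3):
    """Sum word polarities, flipping any that fall within n words after a negator."""
    total = 0
    remaining = 0  # how many upcoming tokens are still inside the negation window
    for token in tokens:
        if token in NEGATORS:
            remaining = n  # open (or restart) the window after this negator
            continue
        polarity = LEXICON.get(token, 0)
        if remaining > 0:
            polarity = -polarity  # inside the window: invert the value
            remaining -= 1
        total += polarity
    return total


print(score_with_negation("the food was not good".split()))
print(score_with_negation("the food was good".split()))
```

With a window of n = 3, “not good” scores as negative while plain “good” stays positive. A larger n catches negations that sit further from the affective word, at the risk of flipping words the negator was never meant to apply to.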
In Bruno Ohana’s 2009 dissertation “Opinion Mining with the SentiWordNet Lexical Resource” (Dublin Institute of Technology), a Python algorithm is presented to perform this task. While Ohana’s tests with this negation detection algorithm yielded accuracy improvements of only about 0.5%, it might be a good starting point for further exploration.
    #
    # populates array of negated terms based on document terms
    # negation[i] indicates if term in doc[i] is negated
    #
    def getNegationArray(doc, windowsize):
        PSEUDO = ('no increase', 'no wonder', 'no change', 'not cause',
                  'not only', 'not necessarily')
        PRENEGATION = ('not', 'no', 'n\'t', 'cannot', 'declined', 'denied',
                       'denies', 'free of', 'fails to', 'no evidence', 'no new',
                       'no sign', 'no suspicious', 'no suggestion', 'rather than',
                       'with no', 'unremarkable', 'without', 'rules out',
                       'ruled out', 'rule out')
        POSNEGATION = ('unlikely', 'free', 'ruled out')
        ENDOFWINDOW = ('.', ':', ',', 'but', 'however', 'nevertheless', 'yet',
                       'though', 'although', 'still', 'aside from', 'except',
                       'apart from')
        # Initialise array
        vNEG = [0 for t in range(len(doc))]
        # Initialise window counters
        winstart = 0
        winend = min(windowsize, len(doc) - 1)
        docsize = len(doc)
        found_neg_fwd = 0
        found_neg_bck = 0
        inwindow = 0
        for i in range(docsize):
            #
            # build 1-term and 2-term strings
            # (terms are expected as 'token/TAG'; the tag is discarded)
            #
            unigram = doc[i].split('/')[0]
            if i < (docsize - 1):
                bigram = unigram + ' ' + doc[i+1].split('/')[0]
            else:
                bigram = unigram
            #
            # Search for pseudo negations
            # (flag is reset for every term, so that one pseudo negation does
            # not switch off detection for the rest of the document)
            #
            found_pseudo = 0
            for negterm in PSEUDO:
                if bigram == negterm:
                    found_pseudo = 1
                    ##print 'found pseudo!', bigram, i
            if (found_pseudo == 0):
                #
                # Look for pre negations
                #
                for negterm in PRENEGATION:
                    if unigram == negterm or bigram == negterm:
                        found_neg_fwd = 1
                #
                # Look for post negations
                #
                for negterm in POSNEGATION:
                    if unigram == negterm or bigram == negterm:
                        found_neg_bck = 1
            #
            # If found fwd/backw negation, then negate window
            #
            if (found_neg_fwd == 1):
                ##print 'found forwards!', unigram, bigram, i
                #
                # negate terms forward up to window
                # (the negating term itself counts as the first term)
                #
                if inwindow < windowsize:
                    vNEG[i] = 1
                    inwindow += 1
                else:
                    # out of window space
                    found_neg_fwd = 0
                    inwindow = 0
            #
            # backward negation
            #
            if (found_neg_bck == 1):
                ##print 'found backwards!', unigram, bigram, i
                #
                # negate back until window start
                #
                for counter in range(max(winstart, i - windowsize), i):
                    vNEG[counter] = 1
                #
                # done with backwards negation
                #
                found_neg_bck = 0
            #
            # now move window
            #
            for negterm in ENDOFWINDOW:
                if unigram == negterm or bigram == negterm:
                    #
                    # found end of negation, must reset windows
                    #
                    ##print 'found negterm!', unigram, bigram, i
                    inwindow = 0
                    found_neg_fwd = 0
                    winstart = i
                    winend = min(windowsize + i, len(doc) - 1)
        return vNEG
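One way the returned flags might be used is to flip each term's polarity before summing. The sketch below is an assumption about how the array could be consumed, not part of the dissertation's code: the `flags` list is written by hand as a stand-in for the kind of 0/1 list getNegationArray returns, and the `TERM_SCORES` lexicon and `negated_score` helper are hypothetical. It does assume the same 'token/TAG' term format that the function's split('/') calls imply.

```python
# Hypothetical per-term polarity scores; a real system would look these up
# in a lexical resource rather than a hand-written dictionary.
TERM_SCORES = {"good": 0.6, "boring": -0.5}


def negated_score(doc, flags):
    """Sum term polarities, inverting those whose negation flag is 1."""
    total = 0.0
    for term, is_negated in zip(doc, flags):
        word = term.split('/')[0]  # drop the part-of-speech tag
        score = TERM_SCORES.get(word, 0.0)
        total += -score if is_negated else score
    return total


doc = ["not/RB", "good/JJ", "but/CC", "boring/JJ"]
flags = [1, 1, 1, 0]  # hand-written flags, one per term in doc
print(negated_score(doc, flags))
```

Here “good” is counted as negative because its flag is set, while “boring” keeps its original score because it falls after the window has closed.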