Semi-Supervised Emotion Lexicon Expansion with Label Propagation
The task of emotion classification has traditionally been addressed using two different but complementary approaches: lexicon-based methods typically offer wider coverage of emotion-bearing words, whereas corpus-based methods learn to use contextual cues. Unsurprisingly, the two have been used jointly to exploit the strengths of both.
However, a combination of the two techniques still suffers from the relatively limited size of the available linguistic resources. In this work, we introduce a novel variant of the Label Propagation algorithm (Zhu and Ghahramani 2002) to extend the coverage of an existing emotion lexicon. In order to do so, we construct a fully connected graph wherein words are vertices and the edges are weighted by the geometric proximity of distributional word representations. The vertices that correspond to words that occur in an emotion lexicon are initialised using the emotion distribution indicated in the lexicon. Then, the label propagation algorithm is used to derive emotion distributions for words that do not occur in the lexicon. Finally, we propose batched label propagation: an optimisation procedure which makes expansion tractable for large vocabularies.
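The propagation step described above can be illustrated with a minimal sketch of standard label propagation with clamped seed vertices, in the spirit of Zhu and Ghahramani (2002). It is not the novel variant or the batched procedure introduced in this work. The cosine-similarity edge weights, the uniform initialisation of unlabelled words, the fixed iteration count, the function names, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def propagate(vectors, seeds, n_labels, iterations=50):
    """Spread emotion distributions from seed words to all other words.

    vectors: dict mapping word -> embedding (hypothetical toy input)
    seeds:   dict mapping lexicon word -> emotion distribution
    """
    words = list(vectors)
    # Fully connected graph: one weighted edge per pair of distinct words.
    # Assumed weighting: cosine similarity, clipped at zero.
    weights = {
        w: {v: max(cosine(vectors[w], vectors[v]), 0.0) for v in words if v != w}
        for w in words
    }
    # Seed vertices start from the lexicon; the rest start uniform (assumption).
    labels = {
        w: list(seeds[w]) if w in seeds else [1.0 / n_labels] * n_labels
        for w in words
    }
    for _ in range(iterations):
        updated = {}
        for w in words:
            if w in seeds:
                # Clamp lexicon words to their known distribution.
                updated[w] = list(seeds[w])
                continue
            total = sum(weights[w].values())
            if total == 0.0:
                # Isolated vertex: keep its current distribution.
                updated[w] = labels[w]
                continue
            # Weighted average of the neighbours' current distributions.
            updated[w] = [
                sum(wt * labels[v][k] for v, wt in weights[w].items()) / total
                for k in range(n_labels)
            ]
        labels = updated
    return labels


# Toy example with two emotions, ordered as (joy, anger).
vectors = {
    "joyful": [1.0, 0.1],
    "furious": [0.1, 1.0],
    "cheerful": [0.9, 0.2],
    "irate": [0.2, 0.9],
}
seeds = {"joyful": [1.0, 0.0], "furious": [0.0, 1.0]}
result = propagate(vectors, seeds, n_labels=2)
```

In this toy setting, the unlabelled word "cheerful" ends up with more mass on joy than on anger and "irate" with the reverse, while the two seed words keep their lexicon distributions. Each sweep visits every pair of words in the fully connected graph, so the cost grows quadratically with vocabulary size, which is the kind of cost a batched procedure would aim to reduce.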
In our experiments, we compare four emotion classifiers: the model of Mohammad and Kiritchenko (2015); a bidirectional LSTM model; a bidirectional LSTM model using an emotion lexicon; and a bidirectional LSTM model using the expanded emotion lexicon derived through label propagation. Our results show that the classifier using the expanded emotion lexicon outperforms the other models on both emotion classification benchmarks used in our evaluation.