Improving Dutch sentiment analysis in Pattern


  • Lorenzo Gatti Universiteit Twente
  • Judith van Stegeren Universiteit Twente


In this paper we investigate methods for improving the sentiment analysis functionality of the Dutch submodule of Pattern, an open-source library for web mining and natural language processing. We discuss the impact on performance of three different potential improvements: extending the module’s internal sentiment lexicon; removing subsets of neutral words from the sentiment lexicon; and improving the algorithm for combining multiple word-level sentiment ratings into a sentence-level sentiment rating. We evaluated the improvements on datasets from the product review domain (books, clothing and music) and a dataset of short emotional stories. The experiments show that lexicon expansion does not lead to better results; new normalization techniques, on the other hand, show a limited but consistent performance increase for sentiment ratings.
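
As a rough illustration of the kind of lexicon-based pipeline discussed above, the following minimal sketch combines word-level ratings into a sentence-level rating and shows how dropping neutral words changes the result. It is not Pattern's implementation: the toy lexicon, the function name `sentence_polarity`, the `drop_neutral` flag, and the use of a plain average are all assumptions made for this example.

```python
from statistics import mean

# Hypothetical toy lexicon mapping Dutch words to a polarity in [-1, 1].
# These entries and values are invented for illustration only.
TOY_LEXICON = {
    "goed": 0.6,      # "good"
    "slecht": -0.7,   # "bad"
    "prachtig": 0.9,  # "beautiful"
    "boek": 0.0,      # "book" (neutral)
}


def sentence_polarity(tokens, lexicon, drop_neutral=False):
    """Average the word-level polarities of the tokens found in the lexicon.

    With drop_neutral=True, words rated exactly 0.0 are ignored, so they
    no longer pull the average towards zero.
    Returns 0.0 when no token contributes a score.
    """
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if drop_neutral:
        scores = [s for s in scores if s != 0.0]
    return mean(scores) if scores else 0.0


tokens = ["een", "prachtig", "boek"]  # "a beautiful book"
with_neutral = sentence_polarity(tokens, TOY_LEXICON)
without_neutral = sentence_polarity(tokens, TOY_LEXICON, drop_neutral=True)
print(with_neutral, without_neutral)
```

In this toy case the neutral word "boek" halves the sentence-level rating when it is included in the average, which is the kind of effect that removing neutral words or changing the normalization step is meant to address.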




How to Cite

Gatti, L., & van Stegeren, J. (2020). Improving Dutch sentiment analysis in Pattern. Computational Linguistics in the Netherlands Journal, 10, 73–89. Retrieved from