Improving Dutch sentiment analysis in Pattern

Authors

Abstract

In this paper we investigate methods for improving the sentiment analysis functionality of Pattern.nl, the Dutch submodule of Pattern, an open-source library for web mining and natural language processing. We assess the impact on performance of three potential improvements: extending the module’s internal sentiment lexicon; removing subsets of neutral words from the sentiment lexicon; and improving the algorithm that combines multiple word-level sentiment ratings into a sentence-level sentiment rating. We evaluate the improvements on datasets from the product review domain (books, clothing and music) and on a dataset of short emotional stories. The experiments show that lexicon expansion does not lead to better results; new normalization techniques, on the other hand, yield a limited but consistent performance increase for sentiment ratings.
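
To make the second and third ideas concrete, the following is a minimal sketch of how word-level ratings might be combined into a sentence-level rating, and of how neutral lexicon entries can affect the result. It is not Pattern.nl's actual algorithm or lexicon: the dictionary TOY_LEXICON, the function sentence_polarity and its skip_neutral parameter are hypothetical names introduced purely for illustration, and a plain arithmetic mean is assumed as the combination rule.

```python
from statistics import mean

# Hypothetical toy lexicon: word -> polarity in [-1.0, 1.0].
# These entries are illustrative only and are not taken from Pattern.nl.
TOY_LEXICON = {
    "goed": 0.6,      # "good"
    "slecht": -0.7,   # "bad"
    "prachtig": 0.9,  # "beautiful"
    "boek": 0.0,      # "book" (neutral entry)
    "verhaal": 0.0,   # "story" (neutral entry)
}


def sentence_polarity(tokens, lexicon, skip_neutral=False):
    """Average the polarity of all tokens that occur in the lexicon.

    With skip_neutral=True, entries whose polarity is exactly 0.0 are
    ignored, so they no longer pull the average toward zero.
    Returns 0.0 when no token contributes a score.
    """
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if skip_neutral:
        scores = [s for s in scores if s != 0.0]
    return mean(scores) if scores else 0.0


# "a beautiful book with a good story"
tokens = "een prachtig boek met een goed verhaal".split()
with_neutral = sentence_polarity(tokens, TOY_LEXICON)
without_neutral = sentence_polarity(tokens, TOY_LEXICON, skip_neutral=True)
print(with_neutral, without_neutral)
```

In this toy setting the two neutral entries halve the sentence score, which illustrates why removing neutral words and changing the normalization step can both shift sentence-level ratings. Whether such a change helps in practice is an empirical question that the paper addresses on its own datasets.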

Published

2020-12-12

Section

Articles

How to Cite

Improving Dutch sentiment analysis in Pattern. (2020). Computational Linguistics in the Netherlands Journal, 10, 73-89. https://www.clinjournal.org/clinj/article/view/105