Automatic detection and correction of context-dependent dt-mistakes using neural networks

Authors

  • Geert Heyman Department of Computer Science KU Leuven
  • Ivan Vulić Language Technology Lab, DTAL, University of Cambridge, UK
  • Yannick Laevaert Department of Computer Science KU Leuven
  • Marie-Francine Moens Department of Computer Science KU Leuven

Abstract

We introduce a novel approach to correcting context-dependent dt-mistakes, one of the most frequent spelling errors in the Dutch language. We show that by using a neural network to estimate the probability distribution of a verb’s suffix conditioned jointly on its stem and context, we obtain large improvements over state-of-the-art spell checkers on three different benchmarking datasets, achieving a perfect score on a verb spelling test from De Standaard, a Flemish newspaper. The method is unsupervised and relies only on basic preprocessing tools to tokenize the text and identify verbs, which enables training on millions of sentences. Furthermore, we propose a method to determine which words in a sentence cause the system to make corrections, which is valuable for providing feedback to the user.

Published

2018-12-01

Section

Articles

How to Cite

Automatic detection and correction of context-dependent dt-mistakes using neural networks. (2018). Computational Linguistics in the Netherlands Journal, 8, 49-65. https://www.clinjournal.org/clinj/article/view/79
