TY - JOUR
AU - de Kok, Daniël
AU - Pütz, Tobias
PY - 2020/12/12
Y2 - 2024/03/29
TI - Self-distillation for German and Dutch dependency parsing
JF - Computational Linguistics in the Netherlands Journal
JA - CLIN Journal
VL - 10
IS - 0
SE - Articles
DO -
UR - https://www.clinjournal.org/clinj/article/view/106
SP - 91-107
AB - In this paper, we explore self-distillation as a means to improve statistical dependency parsing models for Dutch and German over purely supervised training. Self-distillation (Furlanello et al. 2018) trains a new student model on the output of an existing (weaker) teacher model. In contrast to most previous work on self-distillation, we perform distillation using a large, unannotated corpus. We show that in dependency parsing as sequence labeling (Spoustová and Spousta 2010, Strzyz et al. 2019), self-distillation plus finetuning provides large improvements over models that use supervised training. We carry out experiments on the German TüBa-D/Z universal dependency (UD) treebank (Çöltekin et al. 2017) and the UD conversion of the Dutch Lassy Small treebank (Bouma and van Noord 2017). We find that self-distillation improves German parsing accuracy of a bidirectional LSTM parser from 92.23 to 94.33 Labeled Attachment Score (LAS). Similarly, on Dutch we see improvement from 89.89 to 91.84 LAS.
ER -