TY - JOUR
AU - de Kok, Daniël
AU - Pütz, Tobias
PY - 2020/12/12
Y2 - 2024/03/29
TI - Self-distillation for German and Dutch dependency parsing
JF - Computational Linguistics in the Netherlands Journal
JA - CLIN Journal
VL - 10
IS - 0
SE - Articles
DO -
UR - https://www.clinjournal.org/clinj/article/view/106
SP - 91-107
AB - In this paper, we explore self-distillation as a means to improve statistical dependency parsing models for Dutch and German over purely supervised training. Self-distillation (Furlanello et al. 2018) trains a new student model on the output of an existing (weaker) teacher model. In contrast to most previous work on self-distillation, we perform distillation using a large, unannotated corpus. We show that in dependency parsing as sequence labeling (Spoustová and Spousta 2010, Strzyz et al. 2019), self-distillation plus finetuning provides large improvements over models that use supervised training. We carry out experiments on the German TüBa-D/Z universal dependency (UD) treebank (Çöltekin et al. 2017) and the UD conversion of the Dutch Lassy Small treebank (Bouma and van Noord 2017). We find that self-distillation improves German parsing accuracy of a bidirectional LSTM parser from 92.23 to 94.33 Labeled Attachment Score (LAS). Similarly, on Dutch we see improvement from 89.89 to 91.84 LAS.
ER -