Effects of Context and Recency in Scaled Word Completion


  • Antal van den Bosch, Centre for Language Studies, Radboud University Nijmegen


The commonly accepted method for fast and efficient word completion is storage and retrieval of character n-grams in tries. We perform learning curve experiments to measure the scaling performance of the trie approach, and present three extensions. First, we extend the trie to store characters of previous words. Second, we extend the trie to the double task of completing the current word and predicting the next word. Third, we augment the trie with a recent-word buffer to account for the fact that recently used words have a high chance of recurring. Learning curve experiments on English and Dutch newspaper texts show that (1) storing the characters of previous words yields a substantial improvement over the baseline that grows with more training data, and that also holds in comparison with a word-based text completion baseline; (2) simultaneously predicting the next word provides an additional small improvement; and (3) the initially large contribution of a recency model diminishes when the trie is trained on more background training data.
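As an illustration of the baseline idea and the recency extension named in the abstract, the following is a minimal Python sketch, not the implementation used in the article. It assumes a plain character trie in which every node remembers the most frequent training word beneath it, and a small buffer of recently used words that is consulted before the trie. The names `TrieNode`, `CompletionTrie`, `train`, `observe`, `complete`, and the `recent_size` parameter are hypothetical, and the rule "most recent matching word wins" is an assumption made for the example. The previous-word context and next-word prediction extensions are not covered.

```python
from collections import deque


class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.count = 0      # how often a word ending at this node was seen
        self.best = None    # (count, word): most frequent word in this subtree


class CompletionTrie:
    """Hypothetical sketch: character trie plus a recent-word buffer."""

    def __init__(self, recent_size=3):
        self.root = TrieNode()
        # Assumption: a fixed-size buffer; the oldest word drops out first.
        self.recent = deque(maxlen=recent_size)

    def train(self, words):
        """Add background training words and update per-node best guesses."""
        for word in words:
            node = self.root
            path = [node]
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
                path.append(node)
            node.count += 1
            candidate = (node.count, word)
            # Only this word's count changed, so each node on its path keeps
            # its old best unless this word now beats it (or already was it).
            for n in path:
                if n.best is None or n.best[1] == word or candidate[0] > n.best[0]:
                    n.best = candidate

    def observe(self, word):
        """Record a word the user has just typed or accepted."""
        self.recent.append(word)

    def complete(self, prefix):
        """Return a completion for prefix, or None if nothing matches."""
        # Recency first: the most recently used matching word wins.
        for word in reversed(self.recent):
            if word.startswith(prefix):
                return word
        # Otherwise fall back to the most frequent word under the prefix.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.best[1] if node.best else None


trie = CompletionTrie()
trie.train("the cat sat on the mat then the theme ended".split())
print(trie.complete("th"))  # the
trie.observe("theme")
print(trie.complete("th"))  # theme
```

In this toy setting the buffer overrides the frequency-based guess as soon as a matching word has been used recently. That is one simple way to model recurrence; how the two sources should be weighted against each other is a design choice the sketch does not address.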




How to Cite

van den Bosch, A. (2011). Effects of Context and Recency in Scaled Word Completion. Computational Linguistics in the Netherlands Journal, 1, 79–94. Retrieved from https://www.clinjournal.org/clinj/article/view/8