WN-BERT: Integrating WordNet and BERT for Lexical Semantics in Natural Language Understanding
We propose an integration of BERT and WordNet that supplements BERT with explicit semantic knowledge for natural language understanding (NLU). Although BERT has shown its superiority in several NLU tasks, its performance appears relatively limited on higher-level tasks involving abstraction and inference. We argue that the model’s implicit learning from context is not sufficient to infer the relationships required at this level. We represent the semantic knowledge from WordNet as embeddings using path2vec and wnet2vec, and integrate them with BERT in two ways: externally, using a multi-layer perceptron on top of BERT, and internally, building on VGCN-BERT. We evaluate performance on four GLUE tasks. We find that the combined model gives competitive results on sentiment analysis (SST-2) and linguistic acceptability (CoLA), while it does not outperform the BERT-only model on sentence similarity (STS-B) and natural language inference (RTE). Our analysis of self-attention values shows a substantial degree of attention from WordNet embeddings to task-relevant words.
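To make the external integration concrete, the following is a minimal sketch rather than the implementation evaluated here. It assumes that a sentence-level BERT vector is concatenated with a mean-pooled WordNet embedding and passed through a two-layer perceptron. The pooling choice, the concatenation, the layer sizes, and all function names are illustrative assumptions, and random vectors stand in for the outputs of BERT and of path2vec or wnet2vec.

```python
import math
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def linear(x, weights, bias):
    """Dense layer; `weights` holds one row of input weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Randomly initialised weights and zero bias (untrained, for illustration)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def external_fusion(bert_vec, wordnet_vecs, params):
    """Hypothetical external fusion: concatenate, then apply a two-layer MLP."""
    fused = bert_vec + mean_pool(wordnet_vecs)  # list concatenation
    (w1, b1), (w2, b2) = params
    hidden = relu(linear(fused, w1, b1))
    return softmax(linear(hidden, w2, b2))


# Toy dimensions; real BERT and WordNet embeddings are much larger.
BERT_DIM, WN_DIM, HIDDEN, NUM_CLASSES = 8, 4, 6, 2
rng = random.Random(0)

# Stand-ins: one sentence vector from BERT, one WordNet vector per matched word.
bert_vec = [rng.gauss(0, 1) for _ in range(BERT_DIM)]
wordnet_vecs = [[rng.gauss(0, 1) for _ in range(WN_DIM)] for _ in range(3)]

params = (init_layer(rng, BERT_DIM + WN_DIM, HIDDEN),
          init_layer(rng, HIDDEN, NUM_CLASSES))
probs = external_fusion(bert_vec, wordnet_vecs, params)
print(probs)  # class probabilities for a two-class task such as SST-2
```

In this sketch the WordNet signal only enters after BERT has produced its sentence representation. The internal variant instead exposes the WordNet embeddings to BERT's self-attention, which is what makes the attention analysis possible.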