Positional Language Modelling

This was an independent research project through the Institute for Logic, Language, and Computation under the supervision of Dr. Elia Bruni.

In this paper, we propose using a Positional Attention mechanism in an Attentive Language Model architecture. We evaluate it against an LSTM baseline and standard attention, and find that it surpasses standard attention in both validation and test perplexity on the Penn Treebank and WikiText-2 datasets while using fewer parameters. Using the attention distribution vectors, we analyze the differences between the two mechanisms and offer insight into the potential benefits of positional attention.
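To give a rough idea of the contrast, here is a minimal PyTorch sketch, not the paper's actual implementation: it assumes standard attention scores previous hidden states by content (query-key similarity), while positional attention replaces that with a single learnable score per relative position in the window, which is consistent with it using fewer parameters. The class names and shapes below are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StandardAttention(nn.Module):
    """Content-based attention over the previous hidden states."""
    def __init__(self, hidden_size):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)

    def forward(self, h_t, memory):
        # h_t: (batch, hidden), memory: (batch, window, hidden)
        q = self.query(h_t).unsqueeze(1)                  # (batch, 1, hidden)
        k = self.key(memory)                              # (batch, window, hidden)
        scores = torch.bmm(q, k.transpose(1, 2))          # (batch, 1, window)
        weights = F.softmax(scores, dim=-1)
        context = torch.bmm(weights, memory).squeeze(1)   # (batch, hidden)
        return context, weights.squeeze(1)

class PositionalAttention(nn.Module):
    """Attention weights depend only on relative position, not content,
    so the mechanism needs only one learnable score per position."""
    def __init__(self, window):
        super().__init__()
        self.pos_scores = nn.Parameter(torch.zeros(window))

    def forward(self, h_t, memory):
        # memory: (batch, window, hidden); h_t kept only for API parity
        weights = F.softmax(self.pos_scores, dim=-1)           # (window,)
        context = torch.einsum('w,bwh->bh', weights, memory)   # (batch, hidden)
        return context, weights.unsqueeze(0).expand(memory.size(0), -1)

# Hypothetical usage: attend over a window of 5 previous hidden states
h_t = torch.randn(4, 64)          # current hidden state (batch=4, hidden=64)
memory = torch.randn(4, 5, 64)    # previous hidden states within the window
ctx, attn = PositionalAttention(window=5)(h_t, memory)
```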

Figure: Standard Attention

Figure: Positional Attention

You can view the full PDF report here, and the GitHub repo can be found here.

Update (September 2020): This work was included in an official research paper published at ACL 2020.
