Thanks to Stephan Gouws for his help writing and improving this blog post. Transformers have recently become a competitive alternative to RNNs for a range of sequence modeling tasks. They address a significant shortcoming of RNNs, namely their inherently sequential computation, which prevents parallelization across elements of the input sequence, whilst still addressing the […]
I’ve started a four-month internship at Google Research \o/. I am working on Natural Language Generation using Neural Computational Models, together with Aliaksei Severyn, Enrique Alfonseca, and Sascha Rothe.