Transformer - Attention is all you need (NLP)

Giang Tran · December 25, 2019

The Transformer is a breakthrough NLP model that relies entirely on the attention mechanism, eliminating convolutional and recurrent layers. It is the backbone of many state-of-the-art models such as BERT and XLNet, and because it drops recurrence it can exploit parallelism to speed up training.

The key features in Transformer:

  • Self-Attention Layer.
  • Cross-Attention Layer (Encoder-Decoder Attention Layer).
  • Positional Embedding.
  • Layer Normalization.
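The self-attention layer listed above can be sketched as scaled dot-product attention, where queries, keys, and values all come from the same sequence. This is a minimal NumPy sketch; in the full model Q, K, and V are learned linear projections of the input and attention runs over multiple heads:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (n_q, n_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of value vectors

# Self-attention: Q, K, V are all the same sequence of token vectors.
np.random.seed(0)
x = np.random.randn(4, 8)   # 4 tokens, model dimension 8
out = scaled_dot_product_attention(x, x, x)
print(out.shape)            # (4, 8): one output vector per token
```

Each output row is a convex combination of the value vectors, with weights determined by how well each query matches each key; the `1/sqrt(d_k)` scaling keeps the dot products from saturating the softmax.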
