The Transformer is a breakthrough NLP model that relies entirely on the attention mechanism, eliminating convolutional and recurrent neural networks. It is the backbone of many state-of-the-art models such as BERT and XLNet, and it exploits parallelism to speed up training.
The key features of the Transformer:
- Self-Attention Layer (see the first sketch after this list).
- Cross-Attention Layer (Encoder-Decoder Attention Layer).
- Positional Embedding (see the second sketch after this list).
- Layer Normalization.
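
The attention layers above are all built on scaled dot-product attention. Here is a minimal sketch in NumPy; the function name and toy shapes are illustrative, not taken from any reference implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (seq_len, seq_len) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V                                   # weighted sum of value vectors

# Toy usage: 4 tokens, dimension 8. In self-attention, Q, K, and V
# are all projections of the same input sequence; in cross-attention,
# Q comes from the decoder while K and V come from the encoder output.
x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```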

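For positional embedding, the original Transformer paper uses fixed sinusoidal encodings, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A minimal sketch of that scheme (function name and shapes are illustrative):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Fixed sinusoidal encodings: sin on even dims, cos on odd dims."""
    pos = np.arange(seq_len)[:, None]           # (seq_len, 1) token positions
    i = np.arange(0, d_model, 2)[None, :]       # (1, d_model/2) even dimension indices
    angles = pos / np.power(10000, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                # even dimensions
    pe[:, 1::2] = np.cos(angles)                # odd dimensions
    return pe

# These encodings are added to the token embeddings before the first layer,
# giving the model information about token order.
pe = sinusoidal_positional_encoding(seq_len=50, d_model=512)
print(pe.shape)  # (50, 512)
```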