ShivamRajSharma/Transformer-Architectures-From-Scratch


Transformer Architectures From Scratch Using PyTorch

1) TRANSFORMER -

A self-attention based Encoder-Decoder architecture. It is mostly used for

  1. Machine Translation
  2. Document Summarization
  3. Text Extraction

Paper - https://arxiv.org/abs/1706.03762
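The core operation shared by every architecture in this repo is scaled dot-product attention: queries score against keys, the scores are softmax-normalized, and the result weights the values. A minimal PyTorch sketch (illustrative shapes and function name, not the repository's code):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq_len, d_k); mask: broadcastable, 0 = blocked."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (batch, heads, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)             # each row sums to 1
    return weights @ v, weights
```

The n x n `scores` matrix is why vanilla attention is quadratic in sequence length, which is the cost the Performer (below) avoids.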

2) BERT -

A self-attention based Encoder-only architecture. It is mostly used for

  1. Sentiment Classification
  2. Named Entity Recognition
  3. Question Answering
  4. Sentence Embedding Extraction
  5. Document Matching

Paper - https://arxiv.org/abs/1810.04805
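The sentence-embedding use case amounts to running an encoder stack and reading off the first ([CLS]-style) position. A minimal sketch using PyTorch's built-in encoder layers (class name and hyperparameters are illustrative; real BERT also has learned positional and token-type embeddings and masked-language-model pretraining):

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    # Illustrative encoder-only model, not the repository's implementation.
    def __init__(self, vocab_size=100, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, ids):
        h = self.encoder(self.embed(ids))  # (batch, seq, d_model)
        return h[:, 0]                     # first-token sentence embedding
```

For classification or document matching, a small linear head on top of this vector is the usual pattern.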

3) GPT-1 -

A self-attention based, Decoder-only autoregressive model. It is mostly used for

  1. Sentence Completion
  2. Generating Text
  3. Sentiment Classification

Paper - https://paperswithcode.com/method/gpt
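"Autoregressive" means each position may attend only to earlier positions (a causal mask), and text is generated one token at a time by feeding the model's own output back in. A hedged sketch; `model` here is a stand-in for any callable mapping token ids `(B, T)` to logits `(B, T, vocab)`:

```python
import torch

def causal_mask(seq_len):
    # True where attention is allowed: position i attends only to j <= i.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

@torch.no_grad()
def greedy_generate(model, ids, max_new_tokens):
    # Append the argmax next token, then re-run on the extended sequence.
    for _ in range(max_new_tokens):
        logits = model(ids)                               # (B, T, vocab)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

Greedy decoding is the simplest strategy; sampling with temperature or top-k is a drop-in replacement for the `argmax` line.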

4) GPT-2 -

A self-attention based, Decoder-only autoregressive model with minor architectural changes, trained on a larger corpus of text than GPT-1. It is mostly used for

  1. Sentence Completion
  2. Generating Text
  3. Sentiment Classification

Paper - https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
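One of GPT-2's architectural tweaks was moving LayerNorm to the input of each sub-block (pre-LN) instead of after it. A minimal sketch of such a block (dimensions and the class name are illustrative, not the repository's code):

```python
import torch
import torch.nn as nn

class PreLNBlock(nn.Module):
    # GPT-2 style pre-LayerNorm block: normalize before each sub-layer,
    # then add the residual (GPT-1 normalized after the residual add).
    def __init__(self, d_model=64, nhead=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x, attn_mask=None):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=attn_mask)
        x = x + a                      # residual around attention
        return x + self.mlp(self.ln2(x))  # residual around the MLP
```

Pre-LN keeps gradients better behaved in deep stacks, which is part of what let GPT-2 scale up.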

5) ViT -

A state-of-the-art self-attention based Encoder architecture for computer vision applications. It is mostly used for

  1. Image Classification
  2. Image Encoding
  3. Backbone for Object Detection

Paper - https://arxiv.org/abs/2006.03677
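ViT turns an image into a token sequence by cutting it into fixed-size patches and linearly projecting each patch. A common sketch of that patch-embedding step (a strided Conv2d is equivalent to split-then-shared-linear; the sizes are illustrative):

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    # Splits an image into patch_size x patch_size patches and projects
    # each one to a d_model-dim token (illustrative sketch).
    def __init__(self, patch_size=8, in_ch=3, d_model=64):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, d_model,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        # (B, C, H, W) -> (B, num_patches, d_model)
        return self.proj(x).flatten(2).transpose(1, 2)
```

A learnable class token and position embeddings are then prepended/added before the sequence enters a standard Transformer encoder.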

6) PERFORMER -

A self-attention based Encoder-Decoder architecture whose attention runs in linear time in sequence length, unlike the standard Transformer's quadratic time. It is mostly used for

  1. Machine Translation
  2. Document Summarization
  3. Text Extraction

Paper - https://arxiv.org/abs/2009.14794
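The linear-time trick: replace softmax(QK^T)V with a positive feature map phi so attention factors as phi(Q)(phi(K)^T V), never materializing the n x n score matrix. A simplified sketch (the Performer proper uses positive random features, FAVOR+, to approximate the softmax kernel; the elu(x)+1 map here is a simpler stand-in):

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """q, k: (batch, n, d); v: (batch, n, e). O(n) in sequence length."""
    # Positive feature map; FAVOR+ random features would go here instead.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", k, v)   # d x e summary, no n x n matrix
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(1)) + 1e-6)
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)
```

Because `kv` and `k.sum(1)` are computed once and reused for every query, cost grows linearly with n instead of quadratically.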
