Topics Covered:
- Encoder Decoder
- Architecture of Encoder and Decoder
- Encoder Forward Pass
- Decoder Forward Pass
- Improvements to the basic encoder-decoder architecture
- using embeddings
- deep LSTMs
- reversing the input
- Attention Mechanism
- Stating the main Problems with Vanilla Encoder Decoder Architecture
- Deep dive on attention mechanism
- Bahdanau Attention vs. Luong Attention
The main research paper used for this: "Attention Is All You Need" [https://arxiv.org/pdf/1706.03762]
- Self Attention
- Introduction to Self Attention
- Explaining the need for self attention
- converting the embeddings to context aware embeddings
- Query - Key - Value concept
- Parallel Operations
- Explanation behind Scaled Dot Product
- Geometric Intuition
- Why named 'Self Attention'
- The problem with Self Attention - Why do we need multi head attention
- Multi Head Attention
- In-depth explanation with example
- Matrix Calculation
- Problem with Self Attention - in context of sequence
- Positional Encoding
- Gradually explaining positional encoding concept from scratch using examples
- getting positional encoding for a vector solved
- The linear relationship property
- Layer Normalization
- Normalization
- Where to apply?
- Benefits of normalization
- Revisiting Batch Normalization
- Reasoning behind not using batch normalization
- Explaining Layer Normalization
- Normalization
- Transformer Architecture
- Encoder
- Explaining all the parts step by step
- inputs -> tokens -> embeddings -> positional encoding -> multi head attention -> add and normalization -> feed forward network -> add and normalization -> final output
- repeat the block from multi head attention through the second add and normalization 6 times -> final output
- Explaining all the parts step by step
- Masked Self Attention
- During Training
- Sequential (time series)
- Parallel
- During Inference
- During Training
- Cross Attention
- Transformer Decoder Architecture during Training -> Non-Autoregressive
- Transformer Decoder Architecture during Inference -> Autoregressive
- Encoder