CUB Large Scale Deep Learning Models, Fall 2024

The "Large Scale Deep Learning Models" course focuses on the methodologies and techniques used to train large models on extensive datasets across various data domains, including images, text, and audio. The course provides in-depth coverage of self-supervised learning approaches, which have become crucial for leveraging vast amounts of unlabeled data. Topics include data preprocessing and augmentation for different modalities, architectural considerations for scaling deep learning models, and strategies for distributed and parallel training.

Instructors: Alexander Shabalin, Ildus Sadrtdinov, Dmitry Kropotov

Classes: Mondays, in person, in classroom EH-4, in two time slots: 14:15 - 15:30 and 15:45 - 17:00

Telegram chat for questions and discussion: link

Practical assignments: all assignments are given and checked in the corresponding Teams space. If you don't have access to the Teams space, please write directly to one of the instructors or in the course chat.

Course assessment criteria

Assessment Component 1: written examination, Duration: 60 min, Weight: 50 %

Assessment Component 2: programming assignments, Weight: 50 %

Completion: To pass this module, the examination of each module component must be passed with at least 45%.

Lectures

Date	Number	Topic	Materials
09.09.24	01	Introduction to the course. Large models, large datasets and self-supervised learning. What to do with a pretrained model? Linear probing, Fine-tuning, in-distribution (ID) and out-of-distribution (OOD) performance. CLIP model, Zero-shot and WiSE-FT (robust weights ensemble).	Fine-tuning distorts features, Comparing pre-training algorithms, CLIP, WiSE-FT, Do ImageNet Classifiers Generalize to ImageNet?
23.09.24	02	Classical pretext tasks for images: inpainting, colorization, jigsaw puzzles	Exemplar, Context Prediction, Inpainting, Jigsaw Puzzles, Colorization, Rotations, Damaged Jigsaw Puzzles, Task Ensemble
30.09.24	03	Modern architectures for images: ViT, DeiT, MLP Mixer, Swin, ConvNeXt, Neighborhood Attention Transformer (NAT). Efficient training & inference: Automatic Mixed Precision (AMP), Data-Parallel and Model-Parallel training	Big Transfer, ViT, DeiT, MLP Mixer, Swin, ConvNeXt, NAT
07.10.24	04	Contrastive learning for images. Mutual information, SimCLR, MoCo, BYOL, SimSiam, DeepCluster, SwAV. Deriving contrastive loss	SimCLR, MoCo, BYOL, SimSiam, DeepCluster, SwAV
14.10.24	05	Self-supervised learning for ViT. Masked image modeling. DINO, BEiT, MAE, MaskFeat. Different approaches for improving contrastive learning.	DINO, BEiT, MAE, MaskFeat, Dense CL, Supervised CL, DiLo, LooC
21.10.24	06	Mode connectivity and Linear mode connectivity (LMC). Ensembling: Deep Ensemble (DE), SSE, FGE, cSGLD, KFAC-Laplace, SWAG, SPRO, StarSSE. Model averaging: SWA, model soups. Weight averaging in optimization: Lookahead, Lookaround, WATT	LMC, LMC in transfer learning, DE, DE and loss landscape, DE and distribution shifts, SSE, FGE, cSGLD, KFAC-Laplace, SWAG, SPRO, DE Equivalent, StarSSE, SWA, model soups, Lookahead, Lookaround, WATT
28.10.24	07	Modern architectures for texts. Recap of transformers. Modern architectures. Transformer training tricks.	Flash attention, FA blogpost, KV-caching, Multi-Query attention, Relative Position Encodings, RoPE, ALiBi, GLU, Mixture of Experts, Pre-normalization, RMSNorm
04.11.24	08	Pruning, Quantization, Distillation	Pruning, Quantization 1, Quantization 2, Distillation, DistilBERT
11.11.24	09	Parameter-Efficient Fine-tuning. GPT zero-shot, Prompt Tuning, Adapters, LoRA, BitFit	GPT-3, Prompt Tuning, P-Tuning, Adapters, LoRA, BitFit
18.11.24	10	Retrieval Augmented Generation
25.11.24	11	Text Diffusion Models
02.12.24	12	Introduction to audio processing. Text-to-speech (TTS): WaveNet, Tacotron 2, WaveGlow, HiFi-GAN. Automatic Speech Recognition (ASR): CTC Loss, Jasper, Whisper. Self-supervised learning for audio: CPC, Wav2Vec 2.0, HuBERT, Multi-format contrastive learning, BYOL-A, CLAP	WaveNet, Tacotron 2, WaveGlow, HiFi-GAN, CTC Loss, Jasper, Whisper, CPC, Wav2Vec 2.0, HuBERT, Multi-format CL, BYOL-A, CLAP

Home assignments

Number	Release date	Deadline	Topic
01	15.09.24	01.10.24 23:59	Robust fine-tuning of CLIP
02	01.10.24	18.10.24 23:59	Classical pretext tasks
03	18.10.24	06.11.24 23:59	Contrastive learning
