A curated list of references and online resources on data science, machine learning, and deep learning.
- fast.ai (https://www.fast.ai/)
- Walk with fastai (https://walkwithfastai.com/)
- Practical Deep Learning for Coders (https://course.fast.ai/)
- Hugging Face Course (https://huggingface.co/course/chapter1)
- MIT 6.S191: Introduction to Deep Learning (http://introtodeeplearning.com/)
- Stanford University CS231n: Convolutional Neural Networks for Visual Recognition (http://cs231n.stanford.edu/)
- Stanford University CS193p: Developing Applications for iOS using SwiftUI (https://cs193p.sites.stanford.edu/)
- Stanford University CS224N: Natural Language Processing with Deep Learning | Winter 2021 (https://www.youtube.com/playlist?list=PLoROMvodv4rOSH4v6133s9LFPRHjEmbmJ)
- Stanford University CS25: Transformers United (https://web.stanford.edu/class/cs25/index.html)
- Stanford University CS230: Deep Learning (https://cs230.stanford.edu/)
- Yann LeCun's Deep Learning Course at CDS (https://cds.nyu.edu/deep-learning)
- UC Berkeley - Full Stack Deep Learning (https://fullstackdeeplearning.com/)
- New York University - PyTorch Deep Learning (https://atcold.github.io/pytorch-Deep-Learning/)
- University of Amsterdam - UvA Deep Learning Tutorials! (https://uvadlc-notebooks.readthedocs.io/en/latest/) (https://www.youtube.com/playlist?list=PLdlPlO1QhMiAkedeu0aJixfkknLRxk1nA)
- AI Hub (https://aihub.cloud.google.com/)
- Deep Learning with PyTorch: A 60 Minute Blitz (https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)
- Lightning Flash (https://lightning-flash.readthedocs.io/en/latest/index.html)
- TensorFlow Tutorials (https://www.tensorflow.org/tutorials/)
- Keras Guide (https://www.tensorflow.org/guide/keras/sequential_model)
- Machine Learning Crash Course with TensorFlow APIs (https://developers.google.com/machine-learning/crash-course/ml-intro)
- Crash Course - MXNet, Gluon (https://mxnet.apache.org/versions/1.8.0/api/python/docs/tutorials/getting-started/crash-course/index.html)
- GluonCV: a Deep Learning Toolkit for Computer Vision (https://cv.gluon.ai/contents.html)
- AutoGluon: AutoML for Text, Image, and Tabular Data (https://auto.gluon.ai/stable/index.html)
- 10 minutes to pandas (https://pandas.pydata.org/pandas-docs/stable/user_guide/)
- XGBoost (https://xgboost.readthedocs.io/en/latest/index.html)
- XGBoost - Notes on Parameter Tuning (https://xgboost.readthedocs.io/en/latest/tutorials/param_tuning.html)
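  The tuning notes above mostly reduce to two levers: constrain tree complexity and add randomness, then let early stopping pick the number of rounds. A minimal sketch with the core `xgboost` API (the data and hyperparameter values are illustrative, not recommendations):

  ```python
  import numpy as np
  import xgboost as xgb

  # Synthetic binary-classification data, for illustration only.
  rng = np.random.default_rng(0)
  X, y = rng.normal(size=(1000, 20)), rng.integers(0, 2, size=1000)
  dtrain = xgb.DMatrix(X[:800], label=y[:800])
  dval = xgb.DMatrix(X[800:], label=y[800:])

  params = {
      "objective": "binary:logistic",
      "eta": 0.1,               # lower learning rate + more rounds
      "max_depth": 4,           # first knob against overfitting
      "subsample": 0.8,         # row subsampling adds robustness to noise
      "colsample_bytree": 0.8,  # feature subsampling
      "eval_metric": "auc",
  }
  booster = xgb.train(
      params, dtrain, num_boost_round=1000,
      evals=[(dval, "val")], early_stopping_rounds=50, verbose_eval=False,
  )
  print(booster.best_iteration, booster.best_score)
  ```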
- Kaggle (https://www.kaggle.com/learn)
- The Super Duper NLP Repo (https://notebooks.quantumstat.com/)
- Hugging Face Transformers Notebooks (https://huggingface.co/transformers/master/notebooks.html)
- Hugging Face Community Notebooks (https://huggingface.co/transformers/master/community.html)
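  The quickest way into the notebooks above is the `pipeline` API; a minimal sketch (uses the library's default sentiment model, downloaded on first call):

  ```python
  from transformers import pipeline

  classifier = pipeline("sentiment-analysis")  # downloads a default model once
  print(classifier("This resource list is incredibly useful."))
  # -> [{'label': 'POSITIVE', 'score': 0.99...}]
  ```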
- Machine Learning for Beginners (https://github.com/microsoft/ML-For-Beginners)
- BigQuery cookbook (https://support.google.com/analytics/answer/4419694#zippy=%2Cin-this-article)
- PythonAlgos (https://pythonalgos.com/resources/)
- Captum - an open source, extensible library for model interpretability built on PyTorch (https://captum.ai/docs/introduction)
- Pinecone - A managed, cloud-native vector database with a simple API (https://www.pinecone.io/learn/)
- ML YouTube Courses (https://github.com/dair-ai/ML-YouTube-Courses)
- Towhee Codelabs (https://codelabs.towhee.io/)
- Emma Ding - Data Science Resources (https://www.emmading.com/free-data-science-interview-resources)
- NLPlanet - Practical NLP with Python (https://www.nlplanet.org/course-practical-nlp/index.html)
- Haystack (https://haystack.deepset.ai/tutorials)
- Zero to GPT (https://github.com/VikParuchuri/zero_to_gpt)
- Interactive Coding Challenges (https://github.com/donnemartin/interactive-coding-challenges)
- Machine Learning and AI Books (https://mltechniques.com/shop/)
- Google Machine Learning Education (https://developers.google.com/machine-learning)
- Large Language Model Course - by Maxime Labonne (https://github.com/mlabonne/llm-course)
- Parlance Labs - Educational resources on LLMs (https://parlance-labs.com/education/)
- Start Machine Learning in 2024 - Become an expert for free! (https://github.com/louisfb01/start-machine-learning)
- Start with Large Language Models (LLMs) - Become an expert for free! (https://github.com/louisfb01/start-llms)
- Jerry Hargrove - AWS Cloud Diagrams & Notes (https://www.awsgeek.com/)
- Cheat Sheets for Machine Learning and Data Science - by Aqeel Anwar (https://sites.google.com/view/datascience-cheat-sheets)
- Cheat Sheets for Machine Learning Interview Topics - by Aqeel Anwar (https://medium.com/swlh/cheat-sheets-for-machine-learning-interview-topics-51c2bc2bab4f)
- ML Cheatsheet (https://ml-cheatsheet.readthedocs.io/en/latest/index.html)
- Convolutional Neural Networks cheatsheet (https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-convolutional-neural-networks)
- Deep Learning cheatsheet (https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-deep-learning)
- Probability Cheatsheet (http://www.wzchen.com/probability-cheatsheet/)(https://static1.squarespace.com/static/54bf3241e4b0f0d81bf7ff36/t/55e9494fe4b011aed10e48e5/1441352015658/probability_cheatsheet.pdf)
- Bayesian methods notes (Puza 2005) (https://blogs.kent.ac.uk/jonw/files/2015/04/Puza2005.pdf)
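  Both references above build on the same identity; for quick reference, Bayes' theorem for a parameter $\theta$ and data $D$:

  ```latex
  P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}, \qquad
  P(D) = \int P(D \mid \theta)\,P(\theta)\,d\theta
  ```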
- cheatsheets-ai - essential cheat sheets for machine learning and deep learning (https://github.com/kailashahirwar/cheatsheets-ai)
- pip command options (https://manpages.debian.org/stretch/python-pip/pip.1)
- RAG cheatsheet (https://miro.com/app/board/uXjVNvklNmc=/)
- 28 Jupyter Notebook Tips, Tricks, and Shortcuts (https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)
- How do I terminate active resources that I no longer need on my AWS account? (https://aws.amazon.com/premiumsupport/knowledge-center/terminate-resources-account-closure/)
- AWS Resource Hub (https://resources.awscloud.com/)
- AWS Machine Learning University (https://aws.amazon.com/machine-learning/mlu/)
- Boto3 documentation (https://boto3.amazonaws.com/v1/documentation/api/latest/index.html#)
- Amazon SageMaker Python SDK (https://sagemaker.readthedocs.io/en/stable/index.html)
- Learn Python On AWS Workshop (https://learn-to-code.workshop.aws/)
- Sagemaker Immersion Day - Self-Paced Lab (https://sagemaker-immersionday.workshop.aws/en/prerequisites/option2.html)
- Data Engineering Immersion Day (https://catalog.us-east-1.prod.workshops.aws/workshops/976050cc-0606-4b23-b49f-ca7b8ac4b153/en-US)
- QuickSight Workshops (https://catalog.workshops.aws/quicksight/en-US)
- Starter kit for the AWS Deepracer Challenge (https://gitlab.aicrowd.com/deepracer/neurips-2021-aws-deepracer-starter-kit)
- Amazon SageMaker Examples (https://github.com/aws/amazon-sagemaker-examples)
- Amazon SageMaker Examples Notebooks (https://sagemaker-examples.readthedocs.io/en/latest/index.html)
- Amazon SageMaker Course by Chandra Lingam (https://github.com/ChandraLingam/AmazonSageMakerCourse)
- MLOps Workshop with Amazon SageMaker (https://github.com/aws-samples/amazon-sagemaker-mlops-workshop)
- Managed Spot Training and Checkpointing for built-in XGBoost (https://github.com/aws-samples/amazon-sagemaker-managed-spot-training/blob/main/xgboost_built_in_managed_spot_training_checkpointing/xgboost_built_in_managed_spot_training_checkpointing.ipynb)
- The Open Guide to Amazon Web Services (https://github.com/open-guides/og-aws)
- 📺 Amazon SageMaker Technical Deep Dive Series (https://www.youtube.com/watch?v=uQc8Itd4UTs&list=PLhr1KZpdzukcOr_6j_zmSrvYnLUtgqsZz&index=2)
- 📺 Improve Data Science Team Productivity Using Amazon SageMaker Studio - AWS Online Tech Talks (https://www.youtube.com/watch?v=-WzkbdioMJE)
- Optimizing costs for machine learning with Amazon SageMaker (https://aws.amazon.com/blogs/machine-learning/optimizing-costs-for-machine-learning-with-amazon-sagemaker/)
- Choosing the right GPU for deep learning on AWS (https://towardsdatascience.com/choosing-the-right-gpu-for-deep-learning-on-aws-d69c157d8c86)
- Select right ML instances for training and inference jobs (https://pages.awscloud.com/rs/112-TZM-766/images/AL-ML%20for%20Startups%20-%20Select%20the%20Right%20ML%20Instance.pdf)
- Deploy fast and scalable AI with NVIDIA Triton Inference Server in Amazon SageMaker (https://aws.amazon.com/blogs/machine-learning/deploy-fast-and-scalable-ai-with-nvidia-triton-inference-server-in-amazon-sagemaker/)
- Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia (https://huggingface.co/blog/bert-inferentia-sagemaker)
- Static Quantization with Hugging Face optimum for ~3x latency improvements (https://www.philschmid.de/static-quantization-optimum) (https://github.com/philschmid/optimum-static-quantization)
- Hugging Face SageMaker Forum (https://discuss.huggingface.co/c/sagemaker/17)
- Using NGC with AWS Setup Guide (https://docs.nvidia.com/ngc/ngc-aws-setup-guide/)
- kubernetes-sagemaker-demos (https://github.com/shashankprasanna/kubernetes-sagemaker-demos)
- A quick guide to distributed training with TensorFlow and Horovod on Amazon SageMaker (https://towardsdatascience.com/a-quick-guide-to-distributed-training-with-tensorflow-and-horovod-on-amazon-sagemaker-dae18371ef6e)
- AWS EC2 User Guide - Connect to your Linux instance (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AccessingInstances.html)
- How to set up a GPU instance for machine learning on AWS (https://kstathou.medium.com/how-to-set-up-a-gpu-instance-for-machine-learning-on-aws-b4fb8ba51a7c)
- Running Jupyter notebooks with AWS (http://bebi103.caltech.edu.s3-website-us-east-1.amazonaws.com/2020b/content/lessons/lesson_04/aws_usage.html)
- Access remote code in a breeze with JupyterLab via SSH (https://towardsdatascience.com/access-remote-code-in-a-breeze-with-jupyterlab-via-ssh-8c6a9ffaaa8c)
- Setting up Jupyter on the Cloud (https://kiwidamien.github.io/setting-up-jupyter-on-the-cloud.html)
- How to Connect AWS EC2 Instance using Session Manager (https://www.kodyaz.com/aws/connect-aws-ec2-instance-using-session-manager.aspx)
- NVTabular Cloud Integration (https://nvidia-merlin.github.io/NVTabular/main/resources/cloud_integration.html)
- Create real-time clickstream sessions and run analytics with Amazon Kinesis Data Analytics, AWS Glue, and Amazon Athena (https://aws.amazon.com/blogs/big-data/create-real-time-clickstream-sessions-and-run-analytics-with-amazon-kinesis-data-analytics-aws-glue-and-amazon-athena/)
- Identifying and working with sensitive healthcare data with Amazon Comprehend Medical (https://lifesciences-resources.awscloud.com/healthcare-life-sciences-aws-for-industries/identifying-and-working-with-sensitive-healthcare-data-with-amazon-comprehend-medical)
- Amazon Comprehend Medical - Natural Language Processing for Healthcare Customers (https://aws.amazon.com/blogs/aws/amazon-comprehend-medical-natural-language-processing-for-healthcare-customers/)
- Extract and visualize clinical entities using Amazon Comprehend Medical (https://aws.amazon.com/blogs/machine-learning/extract-and-visualize-clinical-entities-using-amazon-comprehend-medical/)
- AWS Simple Workflow vs AWS Step Functions vs Apache Airflow (https://digitalcloud.training/aws-simple-workflow-vs-aws-step-functions-vs-apache-airflow/)
- Is it the end for Apache Airflow? (https://uncledata.medium.com/is-it-the-end-for-apache-airflow-81ef027becf4)
- Scalable data preparation & ML using Apache Spark on AWS (https://github.com/debnsuma/sagemaker-studio-emr-spark)
- Interactively fine-tune Falcon-40B and other LLMs on Amazon SageMaker Studio notebooks using QLoRA (https://aws.amazon.com/blogs/machine-learning/interactively-fine-tune-falcon-40b-and-other-llms-on-amazon-sagemaker-studio-notebooks-using-qlora/)
- Boost your Resume with these Five AWS Projects: Easy, Intermediate, and Expert Levels with Repository Links (https://towardsaws.com/boost-your-resume-with-this-five-aws-projects-easy-intermediate-and-expert-levels-with-6224eef9e2ae)
- Building End-to-End Machine Learning Pipelines with Amazon SageMaker: A Step-by-Step Guide (https://medium.com/anolytics/building-end-to-end-machine-learning-pipelines-with-amazon-sagemaker-a-step-by-step-guide-8531f73b38cd)
- How to optimize AWS Lambda & Kinesis to process 5 million records per minute (https://towardsaws.com/how-to-optimize-aws-lambda-kinesis-to-process-5-million-messages-c3ed5a143c2d)
- Deploying a Trained CTGAN Model on an EC2 Instance: A Step-by-Step Guide (https://tutorialsdojo.com/deploying-a-trained-ctgan-model-on-an-ec2-instance-a-step-by-step-guide/)
- Serverless Model Deployment in AWS: Streamlining with Lambda, Docker, and S3 (https://tutorialsdojo.com/serverless-model-deployment-in-aws-streamlining-with-lambda-docker-and-s3/)
- Demystifying AWS Storage: S3, EBS, and EFS (https://www.linkedin.com/comm/pulse/demystifying-aws-storage-s3-ebs-efs-neal-k-davis-zj84e)
- Amazon AI Fairness and Explainability with Amazon SageMaker Clarify (https://www.linkedin.com/comm/pulse/amazon-ai-fairness-explainability-sagemaker-clarify-jon-bonso-vnfzc)
- Introducing Amazon Kinesis Data Analytics Studio - Quickly Interact with Streaming Data Using SQL, Python, or Scala (https://aws.amazon.com/blogs/aws/introducing-amazon-kinesis-data-analytics-studio-quickly-interact-with-streaming-data-using-sql-python-or-scala/)
- amazon-kinesis-data-generator (https://awslabs.github.io/amazon-kinesis-data-generator/) (https://awslabs.github.io/amazon-kinesis-data-generator/web/help.html)
- Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker (https://www.philschmid.de/sagemaker-evaluate-llm-lighteval)
- StatQuest (https://statquest.org/video-index/)
- Making Friends with Machine Learning (https://decision.substack.com/p/making-friends-with-machine-learning)
- Artificial Intelligence - All in One (https://www.youtube.com/c/ArtificialIntelligenceAllinOne)
- DeepLearningAI (https://www.youtube.com/c/Deeplearningai/playlists)
- Deep Learning for Computer Vision (Andrej Karpathy, OpenAI) (https://www.youtube.com/watch?v=u6aEYuemt0M)
- Deep Visualization Toolbox (https://www.youtube.com/watch?v=AgkfIQ4IGaM)
- Nuts and Bolts of Applying Deep Learning (Andrew Ng) (https://www.youtube.com/watch?v=F1ka6a13S9I)
- NIPS 2016 tutorial: "Nuts and bolts of building AI applications using Deep Learning" by Andrew Ng (https://www.youtube.com/watch?v=wjqaz6m42wU)
- 3Blue1Brown (https://www.3blue1brown.com/)
- Intellipaat - Data Science Online Course (https://www.youtube.com/watch?v=82pV44hr7kQ)
- Jay Alammar (https://www.youtube.com/channel/UCmOwsoHty5PrmE-3QhUBfPQ/videos)
- Abhishek Thakur (https://www.youtube.com/c/AbhishekThakurAbhi/playlists)
- PyTorch Performance Tuning Guide - Szymon Migacz, NVIDIA (https://www.youtube.com/watch?v=9mS1fIYj1So)
- NVIDIA Grandmaster Series (https://www.youtube.com/watch?v=bHuww-l_Sq0&list=PL5B692fm6--uXbxtmPJz5nu3Xmc1JUm3F)
- Data Professor (https://www.youtube.com/c/DataProfessor/playlists)
- Rasa Algorithm Whiteboard - Transformers & Attention 1: Self Attention (https://www.youtube.com/watch?v=yGTUuEx3GkA)
- Smart Home (https://www.youtube.com/c/AlexTeo/featured)
- Elliot Waite - Machine Learning, Coding, Math Animations (https://www.youtube.com/c/elliotwaite/videos)
- James Briggs - NLP semantic search, vector similarity search (https://www.youtube.com/c/JamesBriggs/playlists)
- Dataquest (https://www.youtube.com/channel/UC_lePY0Lm0E2-_IkYUWpI5A)
- ByteByteGo (https://www.youtube.com/c/ByteByteGo/playlists)
- Nicholas Renotte - Learn Machine Learning (https://www.youtube.com/@NicholasRenotte/playlists)
- Andrej Karpathy - Neural Networks: Zero to Hero (https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ) (https://github.com/0ssamaak0/Karpathy-Neural-Networks-Zero-to-Hero)
- NLP Summit (https://www.nlpsummit.org/)
- The Hundred-Page Machine Learning Book (http://themlbook.com/wiki/doku.php)
- Machine Learning Engineering (http://www.mlebook.com/wiki/doku.php)
- Approaching (Almost) Any Machine Learning Problem (https://github.com/abhishekkrthakur/approachingalmost/blob/master/AAAMLP.pdf)
- The fastai book (https://github.com/fastai/fastbook)
- AdaptiveConcatPool2d vs AdaptiveAvgPool2d (Google Books excerpt) (https://books.google.com.sg/books?id=yATuDwAAQBAJ&pg=PA470&lpg=PA470&dq=AdaptiveConcatPool2d+vs+AdaptiveAvgPool2d&source=bl&ots=NKltks4CYL&sig=ACfU3U2xJo3iFtgSSLpQoUGEFYzrouhYzQ&hl=en&sa=X&ved=2ahUKEwiy29KGzNLwAhUUVH0KHTqSC3YQ6AEwCXoECAYQAw#v=onepage&q&f=false)
- Introduction to Probability for Data Science (https://probability4datascience.com/index.html)
- Probabilistic Machine Learning: An Introduction (https://probml.github.io/pml-book/book1.html)
- Dive into Deep Learning (https://d2l.ai/)
- Personalized Machine Learning by Julian McAuley (https://cseweb.ucsd.edu/~jmcauley/pml/pml_book.pdf)
- Machine Learning for Credit Card Fraud detection - Practical handbook (https://fraud-detection-handbook.github.io/fraud-detection-handbook/Foreword.html)
- Deep Learning (https://www.deeplearningbook.org/)
- Efficient Python Tricks and Tools for Data Scientists (https://khuyentran1401.github.io/Efficient_Python_tricks_and_tools_for_data_scientists/README.html)
- Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI (https://github.com/BoltzmannEntropy/interviews.ai)
- Data Distribution Shifts and Monitoring (https://huyenchip.com/2022/02/07/data-distribution-shifts-and-monitoring.html)
- Competitive Programmer's Handbook (https://cses.fi/book/book.pdf)
- Free and/or open source books on machine learning, statistics, data mining, etc (https://github.com/josephmisiti/awesome-machine-learning/blob/master/books.md)
- Lucene in Action - Second Edition (https://livebook.manning.com/book/lucene-in-action-second-edition/appendix-b/)
- Build a Large Language Model (From Scratch) - by Sebastian Raschka (https://github.com/rasbt/LLMs-from-scratch)
- Machine Learning Q and AI book - by Sebastian Raschka (https://github.com/rasbt/MachineLearning-QandAI-book)
- Hands-On Large Language Models - by Jay Alammar and Maarten Grootendorst (https://github.com/HandsOnLLM/Hands-On-Large-Language-Models)
- 2015 Cyclical Learning Rates for Training Neural Networks (https://arxiv.org/abs/1506.01186)
- 2017 Decoupled Weight Decay Regularization (https://arxiv.org/abs/1711.05101)
- 2018 Mixed Precision Training (https://arxiv.org/pdf/1710.03740.pdf)
- 2020 ReadNet: A Hierarchical Transformer Framework for Web Article Readability Analysis (https://link.springer.com/chapter/10.1007/978-3-030-45439-5_3)
- 2021 A Survey of Transformers (https://arxiv.org/pdf/2106.04554.pdf)
- 2022 Formal Algorithms for Transformers (https://arxiv.org/pdf/2207.09238.pdf)
- 2023 Transformer models: an introduction and catalog (https://arxiv.org/pdf/2302.07730.pdf) (http://bit.ly/3YFqRn9)
- Applied Deep Learning - Part 4: Convolutional Neural Networks (https://towardsdatascience.com/applied-deep-learning-part-4-convolutional-neural-networks-584bc134c1e2)
- CNNs from different viewpoints (https://medium.com/impactai/cnns-from-different-viewpoints-fab7f52d159c)
- Image Kernels - Explained Visually (https://setosa.io/ev/image-kernels/)
- Increase the Accuracy of Your CNN by Following These 5 Tips I Learned From the Kaggle Community (https://towardsdatascience.com/increase-the-accuracy-of-your-cnn-by-following-these-5-tips-i-learned-from-the-kaggle-community-27227ad39554)
- A Beginner's Guide To Understanding Convolutional Neural Networks (https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/)
- A Beginner's Guide To Understanding Convolutional Neural Networks Part 2 (https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks-Part-2/)
- The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3) (https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html)
- CNN Explainer (https://poloclub.github.io/cnn-explainer/)
- Understanding LSTM Networks (http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
- A Visual Guide to Recurrent Layers in Keras (https://amitness.com/2020/04/recurrent-layers-keras/)
- A Visual Guide to FastText Word Embeddings (https://amitness.com/2020/06/fasttext-embeddings/)
- The Illustrated Word2vec (https://jalammar.github.io/illustrated-word2vec/)
- An Intuitive Understanding of Word Embeddings: From Count Vectors to Word2Vec (https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/)
- Intuitive Understanding of Seq2seq model & Attention Mechanism in Deep Learning (https://medium.com/analytics-vidhya/intuitive-understanding-of-seq2seq-model-attention-mechanism-in-deep-learning-1c1c24aace1e)
- How to Develop Word Embeddings in Python with Gensim (https://machinelearningmastery.com/develop-word-embeddings-python-gensim/)
- Interactive Analysis of Sentence Embeddings (https://amitness.com/interactive-sentence-embeddings/)
- Cosine Similarity for Vector Space Models (Part III) (https://blog.christianperone.com/2013/09/machine-learning-cosine-similarity-for-vector-space-models-part-iii/)
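  Tying together the Gensim and cosine-similarity entries above, a minimal sketch that trains a toy Word2Vec model and compares word vectors (corpus and hyperparameters are made up for illustration):

  ```python
  import numpy as np
  from gensim.models import Word2Vec

  sentences = [
      ["machine", "learning", "is", "fun"],
      ["deep", "learning", "is", "machine", "learning"],
      ["word", "embeddings", "capture", "meaning"],
  ] * 100  # repeat the tiny corpus so training has something to learn from

  model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=5)

  def cosine(u, v):
      return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

  print(cosine(model.wv["machine"], model.wv["learning"]))
  print(model.wv.most_similar("learning", topn=3))  # built-in cosine ranking
  ```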
- Recommender Systems competitions solutions (https://github.com/NVIDIA-Merlin/competitions)
- NVIDIA at RecSys 2022 (https://www.nvidia.com/en-us/events/recsys/)
- Tutorials (https://recsys.acm.org/recsys22/tutorials/)
- Tutorial Hands on Explainable Recommender Systems with Knowledge Graphs (https://explainablerecsys.github.io/recsys2022/)
- Neural Re-ranking Tutorial (RecSys 22) (https://librerank-community.github.io/)
- PIRS - Psychology-informed Recommender Systems (https://socialcomplab.github.io/pirs-psychology-informed-recsys/)
- NVIDIA Merlin (https://medium.com/nvidia-merlin)
- Merlin Jupyter Notebook Examples (https://catalog.ngc.nvidia.com/orgs/nvidia/resources/merlin_notebooks)
- Merlin Models Example Notebooks (https://github.com/NVIDIA-Merlin/models/tree/main/examples)
- NVIDIA Deep Learning Examples for Tensor Cores (https://github.com/NVIDIA/DeepLearningExamples)
- Merlin Systems Example Notebook (https://github.com/NVIDIA-Merlin/systems/tree/main/examples)
- Building a Four-Stage Recommender Pipeline (https://github.com/NVIDIA-Merlin/systems#building-a-four-stage-recommender-pipeline)
- Exploring Production Ready Recommender Systems with Merlin (https://medium.com/nvidia-merlin/exploring-production-ready-recommender-systems-with-merlin-66bba65d18f2)
- Recommender Models: Reducing Friction with Merlin Models (https://medium.com/nvidia-merlin/recommender-models-reducing-friction-with-merlin-models-4ea799fc3d89)
- Scale faster with less code using Two Tower with Merlin (https://medium.com/nvidia-merlin/scale-faster-with-less-code-using-two-tower-with-merlin-c16f32aafa9f)
- Transformers4Rec: Building Session-Based Recommendations with an NVIDIA Merlin Library (https://developer.nvidia.com/blog/transformers4rec-building-session-based-recommendations-with-an-nvidia-merlin-library/)
- HugeCTR, a GPU-accelerated recommender framework (https://github.com/NVIDIA-Merlin/HugeCTR)
- Recommender Systems at NVIDIA on Demand (https://www.nvidia.com/en-us/on-demand/search/?facet.mimetype[]=event%20session&layout=list&ncid=so-medi-419714&page=1&q=recommender%20systems&sort=date)
- Recommender Systems Best Practices (https://resources.nvidia.com/en-us-recsys-white-paper/merlin-technical-ove)
- Training a Recommender System on DGX A100 with 100B+ Parameters in TensorFlow 2 (https://developer.nvidia.com/blog/training-a-recommender-system-on-dgx-a100-with-100b-parameters-in-tensorflow-2/)
- Recommender Systems, Not Just Recommender Models (2022-04-15)(https://medium.com/nvidia-merlin/recommender-systems-not-just-recommender-models-485c161c755e)
- How NVIDIA Supports Recommender Systems feat. Even Oldridge | Stanford MLSys Seminar Episode 27 (https://www.youtube.com/watch?v=wPso35VkuCs)
- 📺 Building and Deploying a Multi-Stage Recommender System with NVIDIA Merlin (https://www.youtube.com/watch?v=BQC-SGdIdD8)
- Building and Deploying a Multi-Stage Recommender System with Merlin (https://resources.nvidia.com/en-us-merlin/bad-a-multi-stage-recommender?)
- How to Build a Winning Deep Learning Powered Recommender System-Part 3 (https://developer.nvidia.com/blog/how-to-build-a-winning-deep-learning-powered-recommender-system-part-3/) (https://github.com/NVIDIA-Merlin/competitions/tree/main/WSDM_WebTour2021_Challenge)
- 📺 Mastering Multilingual Recommender Systems | Grandmaster Series E9 | Winning Amazon's 2023 KDD Cup (https://www.youtube.com/watch?v=IECznzY_Ko4)
- Microsoft - Recommenders - examples and best practices for building recommendation systems (https://github.com/microsoft/recommenders)
- DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference (https://github.com/harvard-acc/DeepRecSys)
- Google Courses - Recommendation Systems (https://developers.google.com/machine-learning/recommendation)
- Dive Into Deep Learning - Chapter on Recommender Systems (http://d2l.ai/chapter_recommender-systems/index.html)
- Reference End to End Architectures:
- Alibaba Cloud - Recommender System: Ranking Algorithms and Training Architectures (https://www.alibabacloud.com/blog/recommender-system-ranking-algorithms-and-training-architectures_596643)
- Basics of Recommender Systems (https://towardsdatascience.com/basics-of-recommender-systems-6f0fba58d8a)
- Understanding Matrix Factorization for recommender systems (https://towardsdatascience.com/understanding-matrix-factorization-for-recommender-systems-4d3c5e67f2c9)
- Building a Music Recommendation Engine with Probabilistic Matrix Factorization in PyTorch (https://towardsdatascience.com/building-a-music-recommendation-engine-with-probabilistic-matrix-factorization-in-pytorch-7d2934067d4a)
- Customer Segmentation in Online Retail (https://towardsdatascience.com/customer-segmentation-in-online-retail-1fc707a6f9e6)
- Sparse Matrices (https://www.youtube.com/watch?v=Lhef_jxzqCg)
- scipy.sparse.csr_matrix example (https://stackoverflow.com/questions/53254104/cant-understand-scipy-sparse-csr-matrix-example)
- Understanding min_df and max_df in scikit CountVectorizer (https://stackoverflow.com/questions/27697766/understanding-min-df-and-max-df-in-scikit-countvectorizer)
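  The two Q&A threads above usually come up together: `CountVectorizer` prunes terms by document frequency via `min_df`/`max_df` and returns a SciPy CSR matrix. A minimal sketch:

  ```python
  from scipy.sparse import csr_matrix
  from sklearn.feature_extraction.text import CountVectorizer

  docs = [
      "the cat sat on the mat",
      "the dog sat on the log",
      "cats and dogs play",
  ]
  # min_df=2: keep terms that appear in at least 2 documents;
  # max_df=0.9: drop terms that appear in more than 90% of documents.
  vec = CountVectorizer(min_df=2, max_df=0.9)
  X = vec.fit_transform(docs)        # scipy.sparse CSR matrix
  print(vec.get_feature_names_out())
  print(X.toarray())                 # dense view, fine only for toy data

  # The same CSR layout built directly: (data, (row_indices, col_indices)).
  m = csr_matrix(([1, 2, 3], ([0, 0, 1], [0, 2, 1])), shape=(2, 3))
  print(m.toarray())                 # [[1 0 2], [0 3 0]]
  ```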
- Build a Recommendation Engine With Collaborative Filtering (https://realpython.com/build-recommendation-engine-collaborative-filtering/)
- Collaborative Filtering Recommendation with Co-Occurrence Algorithm (https://songxia-sophia.medium.com/collaborative-filtering-recommendation-with-co-occurrence-algorithm-dea583e12e2a)
- Recommendation System Series (https://towardsdatascience.com/recommendation-system-series-part-1-an-executive-guide-to-building-recommendation-system-608f83e2630a)
- Wayfair Tech Blog (https://www.aboutwayfair.com/careers/tech-blog?q=&s=0&f0=0000017b-63b5-d47e-adff-f7bda4220000)
- TensorFlow Recommenders Tutorial (https://www.tensorflow.org/recommenders/examples/basic_retrieval)
- Eugene Yan
- System Design for Recommendations and Search (2021-06-27)(https://eugeneyan.com/writing/system-design-for-discovery/)
- Patterns for Personalization in Recommendations and Search (2021-06-13)(https://eugeneyan.com/writing/patterns-for-personalization/)
- Real-time Machine Learning For Recommendations (2021-01-10)(https://eugeneyan.com/writing/real-time-recommendations/)
- Beating the Baseline Recommender with Graph & NLP in Pytorch (2020-01-13)(https://eugeneyan.com/writing/recommender-systems-graph-and-nlp-pytorch/#natural-language-processing-nlp-and-graphs)
- Search, Rank, and Recommendations (https://medium.com/mlearning-ai/search-rank-and-recommendations-35cc717772cb)(https://www.kaggle.com/code/sbrvrm/search-ranking/notebook)
- Vector representation of products Prod2Vec: How to get rid of a lot of embeddings (https://towardsdatascience.com/vector-representation-of-products-prod2vec-how-to-get-rid-of-a-lot-of-embeddings-26265361457c)
- Deep Recommender Systems at Facebook feat. Carole-Jean Wu | Stanford MLSys Seminar Episode 24 (https://www.youtube.com/watch?v=5xcd0V9m6Xs)
- Twitter's Recommendation Algorithm (https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm) (https://github.com/twitter/the-algorithm) (https://github.com/twitter/the-algorithm-ml)
- Personalized recommendations articles by Gaurav Chakravorty (https://www.linkedin.com/today/author/gauravchak?trk=article-ssr-frontend-pulse_more-articles)
- Accelerating AI: Implementing Multi-GPU Distributed Training for Personalized Recommendations (https://multithreaded.stitchfix.com/blog/2023/06/08/distributed-model-training/)
- How Instacart Uses Machine Learning-Driven Autocomplete to Help People Fill Their Carts (https://tech.instacart.com/how-instacart-uses-machine-learning-driven-autocomplete-to-help-people-fill-their-carts-9bc56d22bafb)
- Is this the ChatGPT moment for recommendation systems? (https://www.shaped.ai/blog/is-this-the-chatgpt-moment-for-recommendation-systems)
- Getting started with Vector DBs in Python (https://code.dblock.org/2023/06/16/getting-started-with-vector-dbs-in-python.html) (https://github.com/dblock/vectordb-hello-world/)
- Pinecone - A managed, cloud-native vector database with a simple API (https://www.pinecone.io/learn/) (https://docs.pinecone.io/docs/examples)
- Weaviate (https://weaviate.io/blog.html)
- Billion-scale Approximate Nearest Neighbor Search (https://matsui528.github.io/cvpr2020_tutorial_retrieval/)
- PQk-means is a Python library for efficient clustering of large-scale data (https://github.com/DwangoMediaVillage/pqkmeans)
- Nanopq (https://nanopq.readthedocs.io/en/latest/source/tutorial.html)
- Faiss - Facebook AI Similarity Search (https://github.com/facebookresearch/faiss/wiki)
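  A minimal Faiss sketch: exact L2 search with `IndexFlatL2` (at scale you would swap in an approximate index such as `IndexIVFFlat` or `IndexHNSWFlat`; the vectors here are random):

  ```python
  import numpy as np
  import faiss

  d = 64                                                 # vector dimensionality
  xb = np.random.random((10_000, d)).astype("float32")   # database vectors
  xq = np.random.random((5, d)).astype("float32")        # query vectors

  index = faiss.IndexFlatL2(d)   # exact (brute-force) L2 index
  index.add(xb)
  D, I = index.search(xq, 4)     # distances and ids of the 4 nearest neighbors
  print(I[0], D[0])
  ```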
- NGT - Neighborhood Graph and Tree for Indexing High-dimensional Data (https://github.com/yahoojapan/NGT)
- Milvus Bootcamp Solutions (https://github.com/milvus-io/bootcamp)
- Add Similarity Search to DynamoDB with Faiss (https://medium.com/swlh/add-similarity-search-to-dynamodb-with-faiss-c68eb6a48b08)(https://github.com/ioannist/dynamodb-faiss-builder)
- BERT models with Solr and Elasticsearch (https://github.com/DmitryKey/bert-solr-search)
- Generative Feedback Loops with LLMs for Vector Databases (https://weaviate.io/blog/generative-feedback-loops-with-llms)
- From zero to semantic search embedding model (https://blog.metarank.ai/from-zero-to-semantic-search-embedding-model-592e16d94b61)
- Example of using factory pattern for your vectorstore implementation (https://github.com/trancethehuman/factory-pattern-vectorstore-interface) (https://www.youtube.com/watch?v=v1LyUJ5NFFU)
- Accelerating Vector Search: Fine-Tuning GPU Index Algorithms (https://developer.nvidia.com/blog/accelerating-vector-search-fine-tuning-gpu-index-algorithms/) (https://github.com/rapidsai/raft/blob/HEAD/notebooks/VectorSearch_QuestionRetrieval.ipynb)
- RAFT: Reusable Accelerated Functions and Tools for Vector Search and More (https://github.com/rapidsai/raft)
- RAFT IVF-PQ tutorial (https://github.com/rapidsai/raft/blob/28b789404bedfa8dd82675fc4221f6db927c0422/notebooks/tutorial_ivf_pq.ipynb)
- CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs (Cuda Anns GRAph-based) (https://arxiv.org/pdf/2308.15136) (https://docs.rapids.ai/api/raft/nightly/pylibraft_api/neighbors/#cagra)
- Geospatial Vector Search: Building an AI-Powered Geo-Aware News Search (https://levelup.gitconnected.com/geospatial-vector-search-building-an-ai-powered-geo-aware-news-search-6cbda8919465)
- MyScale - A Deep Dive into SQL Vector Databases (https://myscale.com/blog/what-is-sql-vector-databases/)
- Vector DB Comparison (https://superlinked.com/vector-db-comparison/)
- How to build a real-time News Search Engine using Vector DBs - implementing a live news aggregating streaming pipeline with NewsAPI, NewsData, Apache Kafka, Bytewax, and Upstash Vector Database (https://medium.com/decodingml/how-to-build-a-real-time-news-search-engine-using-serverless-upstash-kafka-and-vector-db-6ba393e55024) (https://decodingml.substack.com/p/how-to-build-a-real-time-news-search) (https://github.com/decodingml/articles-code/tree/main/articles/ml_system_design/real_time_news_search_with_upstash_kafka_and_vector_db)
- A brief history of code search at GitHub (https://github.blog/2021-12-15-a-brief-history-of-code-search-at-github/)
- The technology behind GitHub's new code search (https://github.blog/2023-02-06-the-technology-behind-githubs-new-code-search/)
- Regular Expression Matching with a Trigram Index or How Google Code Search Worked (https://swtch.com/~rsc/regexp/regexp4.html)
- PROBABILISTIC DATA STRUCTURES FOR WEB ANALYTICS AND DATA MINING (https://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/)
- Big Data Counting: How To Count A Billion Distinct Objects Using Only 1.5KB Of Memory (http://highscalability.com/blog/2012/4/5/big-data-counting-how-to-count-a-billion-distinct-objects-us.html)
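  The 1.5 KB trick in the article above is HyperLogLog: hash every item, route it to a register by its first b bits, and keep the maximum leading-zero rank seen in the remaining bits. A compact sketch of the core estimator (it omits the full bias corrections from the paper):

  ```python
  import hashlib
  import math

  b = 10
  m = 1 << b                          # 1024 registers of small-int state
  registers = [0] * m
  alpha = 0.7213 / (1 + 1.079 / m)    # bias-correction constant (m >= 128)

  def add(item: str) -> None:
      h = int.from_bytes(hashlib.blake2b(item.encode(), digest_size=8).digest(), "big")
      idx = h >> (64 - b)                     # first b bits pick a register
      w = h & ((1 << (64 - b)) - 1)           # remaining 54 bits
      rank = (64 - b) - w.bit_length() + 1    # 1-based position of leftmost 1-bit
      registers[idx] = max(registers[idx], rank)

  def estimate() -> float:
      e = alpha * m * m / sum(2.0 ** -r for r in registers)
      zeros = registers.count(0)
      if e <= 2.5 * m and zeros:              # linear-counting correction for small n
          e = m * math.log(m / zeros)
      return e

  for i in range(100_000):
      add(f"user-{i}")
  print(f"~{estimate():,.0f} distinct (true: 100,000)")
  ```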
- Roaring Bitmap (https://pypi.org/project/roaringbitmap/0.1/) (https://github.com/RoaringBitmap/RoaringBitmap)
- A primer on Roaring bitmaps: what they are and how they work (https://vikramoberoi.com/a-primer-on-roaring-bitmaps-what-they-are-and-how-they-work/) (https://news.ycombinator.com/item?id=32692012)
- Elias-Fano: quasi-succinct compression of sorted integers in C# (2016) (https://wolfgarbe.medium.com/elias-fano-quasi-succinct-compression-of-sorted-integers-in-c-89f92a8c9986)
- Using Bitmaps to Perform Range Queries (https://www.featurebase.com/blog/range-encoded-bitmaps)
- The anatomy of a Druid segment file (https://medium.com/engineers-optimizely/the-anatomy-of-a-druid-segment-file-bed89a93af1e#.46tincja7)
- Information Retrieval resources (https://cwiki.apache.org/confluence/display/LUCENE/InformationRetrieval)
- Information Retrieval - Lectures by Paolo Ferragina (http://didawiki.di.unipi.it/doku.php/magistraleinformatica/ir/ir22/start)
- Numeric Range Queries (slides by Vadim Kirilchuk) (https://www.slideshare.net/VadimKirilchuk/numeric-rangequeries)
- NYU - Introduction to Data Compression - Web Search Engines (http://engineering.nyu.edu/~suel/cs6913/lec4-compress.pdf)
- Decoding Billions of Integers Per Second Through Vectorization (https://people.csail.mit.edu/jshun/6886-s19/lectures/lecture19-1.pdf)
- Smart way of storing data (https://towardsdatascience.com/smart-way-of-storing-data-d22dd5077340)
- Google - Challenges in Building Large-Scale Information Retrieval Systems (http://static.googleusercontent.com/media/research.google.com/en//people/jeff/WSDM09-keynote.pdf)
- A guide to Google Search ranking systems (https://developers.google.com/search/docs/appearance/ranking-systems-guide)
- How to ARCHITECT a search engine like Google Search (https://newsletter.theaiedge.io/p/how-to-architect-a-search-engine)
- Evaluation Metrics for Search and Recommendation Systems (https://weaviate.io/blog/retrieval-evaluation-metrics)
- What AI Engineers Should Know about Search (https://softwaredoug.com/blog/2024/06/25/what-ai-engineers-need-to-know-search)
- Building a full-text search engine in 150 lines of Python code (https://bart.degoe.de/building-a-full-text-search-engine-150-lines-of-code/) (https://github.com/bartdegoede/python-searchengine)
- A search engine in 80 lines of Python (https://www.alexmolas.com/2024/02/05/a-search-engine-in-80-lines.html) (https://github.com/alexmolas/microsearch)
- Using Cross-Encoders as reranker in multistage vector search (https://weaviate.io/blog/cross-encoders-as-reranker)
- Bi-Encoder vs. Cross-Encoder (https://www.sbert.net/examples/applications/cross-encoder/#bi-encoder-vs-cross-encoder)
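  A minimal sketch of the bi-encoder vs. cross-encoder trade-off from the SBERT docs; the checkpoint names are common public defaults, not ones prescribed by the articles above:

  ```python
  from sentence_transformers import SentenceTransformer, CrossEncoder, util

  query = "How do I speed up vector search?"
  docs = [
      "Approximate nearest neighbor indexes trade recall for speed.",
      "Cats sleep for most of the day.",
  ]

  # Bi-encoder: embed query and docs independently, compare with cosine.
  # Fast, and document embeddings can be indexed ahead of time.
  bi = SentenceTransformer("all-MiniLM-L6-v2")
  q_emb, d_emb = bi.encode(query), bi.encode(docs)
  print(util.cos_sim(q_emb, d_emb))

  # Cross-encoder: score each (query, doc) pair jointly. Slower but more
  # accurate, so it is typically used to rerank the bi-encoder's top hits.
  ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
  print(ce.predict([(query, d) for d in docs]))
  ```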
- Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model (https://www.elastic.co/search-labs/may-2023-launch-information-retrieval-elasticsearch-ai-model)
- Hybrid Search: SPLADE (Sparse Encoder) (https://medium.com/@sowmiyajaganathan/hybrid-search-splade-sparse-encoder-neural-retrieval-models-d092e5f46913)
- What is a Sparse Vector? How to Achieve Vector-based Hybrid Search - and using SPLADE (https://qdrant.tech/articles/sparse-vectors/)
- SPLADERunner (https://github.com/PrithivirajDamodaran/SPLADERunner) (https://huggingface.co/prithivida/Splade_PP_en_v1)
- State-of-the-art MS MARCO Models (https://twitter.com/Nils_Reimers/status/1435544757388857345)
- ranx ([raŋks]) is a library of fast ranking evaluation metrics (https://amenra.github.io/ranx/)
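  A minimal ranx sketch, assuming the dict-based API from its documentation (judgments and scores are made up):

  ```python
  from ranx import Qrels, Run, evaluate

  # Graded relevance judgments per query.
  qrels = Qrels({"q_1": {"doc_a": 1, "doc_b": 2}})
  # System scores per query, highest = ranked first.
  run = Run({"q_1": {"doc_b": 0.9, "doc_a": 0.7, "doc_c": 0.4}})

  print(evaluate(qrels, run, ["ndcg@3", "map", "mrr"]))
  ```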
- How Google Search ranking works (https://searchengineland.com/how-google-search-ranking-works-445141)
- FAQ: All about the Google RankBrain algorithm (https://searchengineland.com/faq-all-about-the-new-google-rankbrain-algorithm-234440)
- A guide to Google: Origins, history and key moments in search (https://searchengineland.com/guide/google)
- Grokking Solr Trie Fields (http://mentaldetritus.blogspot.com/2013/01/grokking-solr-trie-fields.html)
- Mastering ElasticSearch (https://hoclaptrinhdanang.com/downloads/pdf/elasticsearch/Mastering%20ElasticSearch-PDF.pdf)
- Elasticsearch Kernel Analysis - Data Model (https://zhuanlan.zhihu.com/p/34680841)
- Elasticsearch-Principle of Big Data (https://zhuanlan.zhihu.com/p/83961549)
- Scaling Lucene and Solr (https://lucidworks.com/post/scaling-lucene-and-solr/)
- Lucene Papers (https://cwiki.apache.org/confluence/display/LUCENE/LucenePapers)
- Lucene, talk by Doug Cutting (https://lucene.sourceforge.net/talks/pisa/)
- Lucene: The Good Parts (https://blog.parse.ly/lucene/)
- Zhaofeng Zhou (Muluo) - Analysis of Lucene - Basic Concepts (https://alibaba-cloud.medium.com/analysis-of-lucene-basic-concepts-5ff5d8b90a53)
- Zhaofeng Zhou (Muluo) - Lucene IndexWriter: An In-Depth Introduction (https://www.alibabacloud.com/blog/lucene-indexwriter-an-in-depth-introduction_594673)
- What is term vector in Lucene (http://makble.com/what-is-term-vector-in-lucene)
- What is Trie Data Structure in Lucene numeric range query (http://makble.com/what-is-trie-data-structure-in-lucene-numeric-range-query) (https://issues.apache.org/jira/browse/LUCENE-1673)
- Lucene Performance (http://philosophyforprogrammers.blogspot.com/2010/09/lucene-performance.html)
- Frame of Reference and Roaring Bitmaps (https://www.elastic.co/blog/frame-of-reference-and-roaring-bitmaps) (https://juejin.cn/post/7085352076595134494)
- Class RoaringDocIdSet (https://lucene.apache.org/core/6_0_0/core/org/apache/lucene/util/RoaringDocIdSet.html) (https://issues.apache.org/jira/browse/LUCENE-5983)
- Class Lucene50PostingsFormat (https://lucene.apache.org/core/7_2_1/core/org/apache/lucene/codecs/lucene50/Lucene50PostingsFormat.html)
- Changing Bits - Lucene's PulsingCodec on "Primary Key" Fields (https://blog.mikemccandless.com/2010/06/lucenes-pulsingcodec-on-primary-key.html)
- Changing Bits - Lucene performance with the PForDelta codec (https://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html)
- Changing Bits - Lucene's new BlockPostingsFormat (https://blog.mikemccandless.com/2012/08/lucenes-new-blockpostingsformat-thanks.html)
- Changing Bits - Using Finite State Transducers in Lucene (https://blog.mikemccandless.com/2010/12/using-finite-state-transducers-in.html)
- Apache Lucene - Index File Formats (https://lucene.apache.org/core/3_0_3/fileformats.html) (https://stackoverflow.com/questions/71816188/what-is-elasticsearch-index-lucene-index-and-inverted-index)
- Lucene-file format (https://zhuanlan.zhihu.com/p/354105864)
- Apache Lucene - Scoring (https://lucene.apache.org/core/3_0_3/scoring.html)
- Lucene JIRA:
- More fine-grained control over the packed integer implementation that is chosen (https://issues.apache.org/jira/browse/LUCENE-4062)
- Reduce reads for sparse DocValues (https://issues.apache.org/jira/browse/LUCENE-8374)
- Simple9 (de)compression (https://issues.apache.org/jira/browse/LUCENE-2189)
- Lucene index modeling - Why are skiplists used instead of btree? (https://stackoverflow.com/questions/66804510/lucene-index-modeling-why-are-skiplists-used-instead-of-btree)
- How does lucene index documents? (https://stackoverflow.com/questions/2602253/how-does-lucene-index-documents#answer-43203339)
- Skip List vs. Binary Search Tree (https://stackoverflow.com/questions/256511/skip-list-vs-binary-search-tree/28270537#28270537)
- Uncle Cang - Lucene Source Code Series (https://juejin.cn/user/2559318800998141)
- Chris - Articles on Lucene Index (https://www.amazingkoala.com.cn/Lucene/Index/)
- Chris - Articles on Lucene Index File (https://www.amazingkoala.com.cn/Lucene/suoyinwenjian/)
- Chris - Articles on Lucene Compressed Storage (https://www.amazingkoala.com.cn/Lucene/yasuocunchu/)
- Chris - Articles on Lucene Tools (https://www.amazingkoala.com.cn/Lucene/gongjulei/)
- Chris - Articles on Lucene Search (https://www.amazingkoala.com.cn/Lucene/Search/)
- LuXugang/Lucene-7.x-9.x/tree/master/blog (https://github.com/LuXugang/Lucene-7.x-9.x/tree/master/blog)
- Lucene Learning Summary (https://blog.csdn.net/jinhong_lu)
- Lucene Underlying Principles and Optimization Experience Sharing (https://blog.csdn.net/njpjsoftdev)
- Lucene PostingsFormat At-a-Glance (https://github.com/mocobeta/lucene-postings-format) (https://github.com/mocobeta/lucene-postings-format/blob/main/indexing_chain.md)
- A Simple Tutorial of Lucene's Indexing and Search Systems (https://github.com/jiepujiang/LuceneTutorial)
- 📺 What is in a Lucene index? Adrien Grand, Software Engineer, Elasticsearch (https://www.youtube.com/watch?v=T5RmMNDR5XI)
- 📺 Conference talks from Plain Schwarz (https://www.youtube.com/@PlainSchwarzUG):
- Berlin Buzzwords 2015: Adrien Grand - Algorithms & data-structures that power Lucene & ElasticSearch (https://www.youtube.com/watch?v=eQ-rXP-D80U)
- Berlin Buzzwords 2015: Ryan Ernst - Compression in Lucene (https://www.youtube.com/watch?v=kCQbFxqusN4&list=PLq-odUc2x7i-_qWWixXHZ6w-MxyLxEC7s&index=21)
- Berlin Buzzwords 2015: Ivan Mamontov - Fast Decompression Lucene Codec (https://www.youtube.com/watch?v=2HQdbpgHfnQ&list=PLq-odUc2x7i-_qWWixXHZ6w-MxyLxEC7s&index=17)
- Berlin Buzzwords 2017: Alan Woodward - How does a Lucene Query actually work? (https://www.youtube.com/watch?v=Z-yG-KvIuD8&list=PLq-odUc2x7i-9Nijx-WfoRMoAfHC9XzTt&index=3)
- Berlin Buzzwords 2017: Adrien Grand - Running slow queries with Lucene (https://www.youtube.com/watch?v=p51vIDWHWqk&list=PLq-odUc2x7i-9Nijx-WfoRMoAfHC9XzTt&index=16)
- (2020) Bruno Roustant - A Journey to Write a New Lucene PostingsFormat (https://www.youtube.com/watch?v=av0yQY3pklA&list=PLq-odUc2x7i_YTCOTQ6p3m-kqpvEXGvbT&index=44)
- (2020) Uwe Schindler - Ask Me Anything: Lucene 9 (https://www.youtube.com/watch?v=RvoH_pVvXz0&list=PLq-odUc2x7i_YTCOTQ6p3m-kqpvEXGvbT&index=10)
- BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡ (https://huggingface.co/blog/xhluca/bm25s) (https://bm25s.github.io/)
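  A minimal sketch following the quickstart in the BM25S post (corpus and query are illustrative):

  ```python
  import bm25s

  corpus = [
      "a cat is a feline and likes to purr",
      "a dog is the human's best friend and loves to play",
      "a fish is a creature that lives in water and swims",
  ]

  retriever = bm25s.BM25()
  retriever.index(bm25s.tokenize(corpus))

  # Returns top-k document indices and their BM25 scores.
  results, scores = retriever.retrieve(
      bm25s.tokenize("does the fish purr like a cat?"), k=2
  )
  print(results[0], scores[0])
  ```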
- A Recipe for Training Neural Networks (http://karpathy.github.io/2019/04/25/recipe/)
- The best machine learning and deep learning libraries (https://morioh.com/p/73998ba2a04e)
- Neural Style Transfer with tf.keras (https://aihub.cloud.google.com/p/products%2F7f7495dd-6f66-4f8a-8c30-15f211ad6957)
- Stock Prices Prediction Using Machine Learning and Deep Learning Techniques (with Python codes) (https://www.analyticsvidhya.com/blog/2018/10/predicting-stock-price-machine-learningnd-deep-learning-techniques-python)
- I trained a model. What is next? (https://ternaus.blog/tutorial/2020/08/28/Trained-model-what-is-next.html)
- How To Build and Deploy a Serverless Machine Learning App on AWS (https://towardsdatascience.com/how-to-build-and-deploy-a-serverless-machine-learning-app-on-aws-1468cf7ef5cb) $$
- Applied Deep Learning - Part 3: Autoencoders (https://towardsdatascience.com/applied-deep-learning-part-3-autoencoders-1c083af4d798)
- Approaching (Almost) Any Machine Learning Problem (https://www.linkedin.com/pulse/approaching-almost-any-machine-learning-problem-abhishek-thakur)
- KAGGLE ENSEMBLING GUIDE (https://mlwave.com/kaggle-ensembling-guide/)
- Reading Larger than Memory CSVs with RAPIDS and Dask (https://medium.com/rapids-ai/reading-larger-than-memory-csvs-with-rapids-and-dask-e6e27dfa6c0f)
- 28 Weekly Machine Learning Tricks And Resources That Are Pure Gems #1 (https://ibexorigin.medium.com/28-weekly-machine-learning-tricks-and-resources-that-are-pure-gems-1-8e5259a93c94)
- 26 Weekly ML Tricks And Resources That Are Pure Gems, #2 (https://ibexorigin.medium.com/26-weekly-ml-tricks-and-resources-that-are-pure-gems-2-3be56841b1d9)
- Hadoop vs. Spark vs. Kafka - How to Structure Modern Big Data Architecture? (https://nexocode.com/blog/posts/hadoop-spark-kafka-modern-big-data-architecture/)
- Is it Better to Save Models Using Joblib or Pickle? (https://medium.com/nlplanet/is-it-better-to-save-models-using-joblib-or-pickle-776722b5a095)
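  The article's comparison in code: both libraries round-trip a fitted model, while joblib adds convenient on-disk compression and handles the large NumPy arrays inside estimators efficiently. A minimal sketch:

  ```python
  import pickle
  import joblib
  from sklearn.datasets import load_iris
  from sklearn.ensemble import RandomForestClassifier

  X, y = load_iris(return_X_y=True)
  model = RandomForestClassifier(n_estimators=50).fit(X, y)

  joblib.dump(model, "model.joblib", compress=3)   # compressed, numpy-friendly
  model_j = joblib.load("model.joblib")

  with open("model.pkl", "wb") as f:               # plain pickle works too
      pickle.dump(model, f)
  with open("model.pkl", "rb") as f:
      model_p = pickle.load(f)

  assert (model_j.predict(X) == model_p.predict(X)).all()
  ```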
- How to Measure Drift in ML Embeddings (https://towardsdatascience.com/how-to-measure-drift-in-ml-embeddings-ee8adfe1e55e) (https://www.evidentlyai.com/blog/embedding-drift-detection)
- Large Language Models in Molecular Biology (https://towardsdatascience.com/large-language-models-in-molecular-biology-9eb6b65d8a30)
- Categorically: Don't explode - encode! (https://github.com/PilCAki/machine-learning-tips/blob/main/Don't%20Explode%20-%20Encode!.ipynb)
- Out-of-bag validation for random forests (https://medium.com/data-science-at-microsoft/out-of-bag-validation-for-random-forests-378f2b292560) (https://github.com/PilCAki/machine-learning-tips/blob/main/Out%20Of%20Bag%20Validation%20for%20Random%20Forests.ipynb)
- Exploring Location Data Using a Hexagon Grid (https://towardsdatascience.com/exploring-location-data-using-a-hexagon-grid-3509b68b04a2) (https://github.com/sktahtin4/Helsinki-city-bikes)
- H3: Uber's Hexagonal Hierarchical Spatial Index (https://www.uber.com/en-FI/blog/h3/) (https://h3geo.org/docs/)
- H3 hexagon data viewer (https://wolf-h3-viewer.glitch.me/)
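  A minimal sketch with the `h3` Python package, assuming the v4 function names (v3 used `geo_to_h3`/`k_ring` instead):

  ```python
  import h3

  lat, lng = 60.1699, 24.9384             # Helsinki city center
  cell = h3.latlng_to_cell(lat, lng, 8)   # resolution-8 hexagon id
  print(cell)
  print(h3.cell_to_latlng(cell))          # hexagon center point
  print(len(h3.grid_disk(cell, 1)))       # the cell plus its 6 neighbors -> 7
  ```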
- How to use PostgreSQL for (military) geoanalytics tasks (https://klioba.com/how-to-use-postgresql-for-military-geoanalytics-tasks) (https://klioba.com/public/presentations/PostGIS_Warfare_Export.pdf) (http://download.geofabrik.de/osm-data-in-gis-formats-free.pdf) (https://download.geofabrik.de/)
- Writing fast string ufuncs for NumPy 2.0 (https://labs.quansight.org/blog/numpy-string-ufuncs)
- Yes you should understand backprop (https://karpathy.medium.com/yes-you-should-understand-backprop-e2f06eab496b) (https://www.youtube.com/watch?v=i94OvYb6noo)
- An overview of gradient descent optimization algorithms (https://ruder.io/optimizing-gradient-descent/)
- Understanding Gradient Clipping (and How It Can Fix Exploding Gradients Problem) (https://neptune.ai/blog/understanding-gradient-clipping-and-how-it-can-fix-exploding-gradients-problem)
- Vanishing and Exploding Gradients in Neural Network Models: Debugging, Monitoring, and Fixing (https://neptune.ai/blog/vanishing-and-exploding-gradients-debugging-monitoring-fixing)
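  In PyTorch, the remedy both posts discuss is a single call between `backward()` and `step()`; a minimal sketch:

  ```python
  import torch
  import torch.nn as nn

  model = nn.Linear(10, 1)
  opt = torch.optim.SGD(model.parameters(), lr=0.1)

  loss = nn.functional.mse_loss(model(torch.randn(32, 10)), torch.randn(32, 1))
  loss.backward()
  # Rescale all gradients so their global L2 norm is at most 1.0.
  torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
  opt.step()
  opt.zero_grad()
  ```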
- Why Momentum Really Works (https://distill.pub/2017/momentum/)
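  The update analyzed in the Distill article is gradient descent with momentum (the heavy-ball method), with step size $\alpha$ and momentum $\beta$:

  ```latex
  v_{t+1} = \beta\, v_t + \nabla f(w_t), \qquad
  w_{t+1} = w_t - \alpha\, v_{t+1}
  ```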
- Nuts and Bolts of Optimization (https://www.linkedin.com/pulse/nuts-bolts-optimization-chandra-mohan-lingam)
- Finding Good Learning Rate and The One Cycle Policy (https://towardsdatascience.com/finding-good-learning-rate-and-the-one-cycle-policy-7159fe1db5d6)
- Understanding Fastai's fit_one_cycle method (https://iconof.com/1cycle-learning-rate-policy/)
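  Outside fastai, the same schedule is available as PyTorch's `OneCycleLR`; a minimal sketch (model, data, and hyperparameters are placeholders):

  ```python
  import torch
  import torch.nn as nn

  model = nn.Linear(10, 1)
  opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
  epochs, steps_per_epoch = 3, 100
  sched = torch.optim.lr_scheduler.OneCycleLR(
      opt, max_lr=0.1, epochs=epochs, steps_per_epoch=steps_per_epoch
  )

  for _ in range(epochs * steps_per_epoch):
      loss = nn.functional.mse_loss(model(torch.randn(32, 10)), torch.randn(32, 1))
      opt.zero_grad()
      loss.backward()
      opt.step()
      sched.step()  # LR warms up to max_lr, then anneals back down
  ```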
- How to Choose an Activation Function for Deep Learning (https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/)
- Dealing with Outliers Using Three Robust Linear Regression Models - Huber, RANSAC, Theil-Sen (https://developer.nvidia.com/blog/dealing-with-outliers-using-three-robust-linear-regression-models/)
- Fighting Overfitting With L1 or L2 Regularization: Which One Is Better? (https://neptune.ai/blog/fighting-overfitting-with-l1-or-l2-regularization)
- imbalanced-learn - tools dealing with classification with imbalanced classes (https://imbalanced-learn.org/stable/index.html)
- Five mistakes to avoid when modeling with imbalanced datasets (https://medium.com/data-science-at-microsoft/five-mistakes-to-avoid-when-modeling-with-imbalanced-datasets-d58a8c09929c) (https://github.com/PilCAki/imbalanced-dataset-common-errors/blob/main/Imbalanced%20Dataset%20Examples.ipynb)
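  A minimal imbalanced-learn sketch; note the resampling is applied to the training split only, since oversampling before the split leaks synthetic points into evaluation (a classic pitfall with imbalanced data):

  ```python
  from collections import Counter

  from imblearn.over_sampling import SMOTE
  from sklearn.datasets import make_classification
  from sklearn.model_selection import train_test_split

  X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

  # Synthesize minority-class points by interpolating between neighbors.
  X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
  print(Counter(y_tr), "->", Counter(y_res))
  ```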
- A Comprehensive Overview of Regression Evaluation Metrics (https://developer.nvidia.com/blog/a-comprehensive-overview-of-regression-evaluation-metrics/)
- A Comprehensive Guide to Interaction Terms in Linear Regression (https://developer.nvidia.com/blog/a-comprehensive-guide-to-interaction-terms-in-linear-regression/)
- Predicting Credit Defaults Using Time-Series Models with Recursive Neural Networks and XGBoost (https://developer.nvidia.com/blog/predicting-credit-defaults-using-time-series-models-with-recursive-neural-networks-and-xgboost/) (https://github.com/daxiongshu/triton_amex)
- Three Approaches to Encoding Time Information as Features for ML Models (https://developer.nvidia.com/blog/three-approaches-to-encoding-time-information-as-features-for-ml-models/)
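  One of the approaches covered in the NVIDIA post is cyclical sine/cosine encoding, which keeps hour 23 adjacent to hour 0 instead of 23 units away; a minimal sketch:

  ```python
  import numpy as np
  import pandas as pd

  ts = pd.date_range("2024-01-01", periods=48, freq="h")
  df = pd.DataFrame({"hour": ts.hour})

  # Map the 24-hour cycle onto the unit circle: 23:00 and 00:00 end up close.
  df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
  df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)
  print(df.head())
  ```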
- A Comprehensive Guide on Interaction Terms in Time Series Forecasting (https://developer.nvidia.com/blog/a-comprehensive-guide-on-interaction-terms-in-time-series-forecasting/)
- Skforecast - works with any regressor compatible with the scikit-learn API, including LightGBM, XGBoost, CatBoost, Keras, etc (https://skforecast.org/0.13.0/index.html) (https://github.com/skforecast/skforecast)
- Plotly Fundamentals (https://plotly.com/python/plotly-fundamentals/)
- CSS Color (https://developer.mozilla.org/en-US/docs/Web/CSS/color_value)
- Flowing Data (https://flowingdata.com/)
- Automatic EDA Libraries Comparisson (https://www.kaggle.com/andreshg/automatic-eda-libraries-comparisson/)
- [TPS-Jun] This is Original EDA & VIZ (https://www.kaggle.com/subinium/tps-jun-this-is-original-eda-viz/)
- Netflix Data Visualization (https://www.kaggle.com/joshuaswords/netflix-data-visualization?scriptVersionId=58425238)(https://www.kaggle.com/joshuaswords)
- Custom Word Cloud (https://www.kaggle.com/tarzon/custom-word-cloud/notebook)
- RAPIDS cuXfilter - Build a Fully Interactive Dashboard in a Few Lines of Python (https://medium.com/rapids-ai/build-a-fully-interactive-dashboard-in-a-few-lines-of-python-49959fb55fff)
- Facets - visualizations for understanding and analyzing machine learning datasets (https://github.com/PAIR-code/facets)
- Alternatives to box plots: Using beeswarm and raincloud plots to summarise ecological data (https://labs.ala.org.au/posts/2023-08-28_alternatives-to-box-plots/post.html)
- How to Create a Beautiful Polar Histogram With Python and Matplotlib (https://dev.to/oscarleo/how-to-create-a-beautiful-polar-histogram-with-python-and-matplotlib-400l)
- Data Visualisation Guide (https://data.europa.eu/apps/data-visualisation-guide/)
- What is Gephi? Meet this useful network analysis tool (https://medium.com/@vespinozag/what-is-gephi-meet-this-useful-network-analysis-tool-628a1b42428c) (https://github.com/gephi/gephi)
- The Perfect Way to Smooth Your Noisy Data - using Whittaker-Eilers Smoothing (https://towardsdatascience.com/the-perfect-way-to-smooth-your-noisy-data-4f3fe6b44440)
- Scikit-learn Visualization Guide: Making Models Speak (https://www.dataleadsfuture.com/scikit-learn-visualization-guide-making-models-speak/)
- 6 python libraries to make beautiful maps (https://medium.com/@alexroz/6-python-libraries-to-make-beautiful-maps-9fb9edb28b27)
- Exploring ExplainerDashBoard, the easiest way to Develop Interactive DashBoards (https://towardsdatascience.com/build-dashboards-in-less-than-10-lines-of-code-835e9abeae4b) $$
- What I've Learned Building Interactive Embedding Visualizations (https://cprimozic.net/blog/building-embedding-visualizations-from-user-profiles/) (https://github.com/Ameobea/osu-beatmap-atlas/blob/main/notebooks/README.md)
- Announcing Data Wrangler: Code-centric viewing and cleaning of tabular data in Visual Studio Code (https://devblogs.microsoft.com/python/announcing-data-wrangler-code-centric-viewing-and-cleaning-of-tabular-data-in-visual-studio-code/)
- Google Colab Tips for Power Users (https://amitness.com/2020/06/google-colaboratory-tips/)
- Configuring Google Colab Like A Pro (https://medium.com/@robertbracco1/configuring-google-colab-like-a-pro-d61c253f7573)
- Data manipulation with Python (https://www.mit.edu/~amidi/teaching/data-science-tools/study-guide/data-manipulation-with-python/)
- Data Preprocessing Concepts with Python (https://pub.towardsai.net/data-preprocessing-concepts-with-python-b93c63f14bb6)
- 7 Actionable Tips on How to Use Python to Become a Finance Guru (https://sdsclub.com/7-actionable-tips-on-how-to-use-python-to-become-a-finance-guru/)
- 7 Cool Python Packages Kagglers Are Using Without Telling You (https://towardsdatascience.com/7-cool-python-packages-kagglers-are-using-without-telling-you-e83298781cf4)
- Fastest Way to Read Excel in Python (https://hakibenita.com/fast-excel-python)
- Efficiently iterating over rows in a Pandas DataFrame (https://towardsdatascience.com/efficiently-iterating-over-rows-in-a-pandas-dataframe-7dd5f9992c01)
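  The article's core point in one example: prefer a vectorized column expression to a Python-level row loop; a minimal sketch:

  ```python
  import numpy as np
  import pandas as pd

  df = pd.DataFrame(np.random.rand(100_000, 2), columns=["a", "b"])

  # Slow: Python-level loop over rows.
  total = 0.0
  for _, row in df.iterrows():
      total += row["a"] * row["b"]

  # Fast: one vectorized expression over whole columns.
  total_vec = (df["a"] * df["b"]).sum()

  assert np.isclose(total, total_vec)
  ```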
- Ten Python datetime pitfalls, and what libraries are (not) doing about it (https://dev.arie.bovenberg.net/blog/python-datetime-pitfalls/)
- Whenever - Typed and DST-safe datetimes for Python (https://github.com/ariebovenberg/whenever) (https://whenever.readthedocs.io/en/latest/)
- Blogs about streaming database (https://medium.com/@yingjunwu)
- Rethinking Stream Processing and Streaming Databases (https://betterprogramming.pub/rethinking-stream-processing-and-streaming-databases-21076aaec375)
- Why You Shouldn't Invest In Vector Databases? (https://medium.com/data-engineer-things/why-you-shouldnt-invest-in-vector-databases-c0cd3f59d23c)
- Building and operating a pretty big storage system called S3 (https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html)
- Building a weather data warehouse part I: Loading a trillion rows of weather data into TimescaleDB (https://aliramadhan.me/2024/03/31/trillion-rows.html)
- Develop Your First AI Agent: Deep Q-Learning (https://towardsdatascience.com/develop-your-first-ai-agent-deep-q-learning-375876ee2472)
- A Visual Guide to Regular Expression (https://amitness.com/regex/)
- Machine Learning Glossary (https://developers.google.com/machine-learning/glossary/)
- ConvNetJS - Deep Learning in your browser (https://cs.stanford.edu/people/karpathy/convnetjs/)
- Classifier Playground (https://www.ccom.ucsd.edu/~cdeotte/programs/classify.html)
- YOLO: Real-Time Object Detection (https://pjreddie.com/darknet/yolo/)
- Coder One - A virtual playground to practice, compete, and experiment with machine learning (https://www.gocoder.one/)
- Interactive demonstrations for ML courses (http://arogozhnikov.github.io/2016/04/28/demonstrations-for-ml-courses.html)
- Tinker With a Neural Network (http://playground.tensorflow.org/)
- Understanding ROC curves (http://www.navan.name/roc/)
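  The same FPR/TPR trade-off the demo visualizes can be computed with scikit-learn's `roc_curve`; a minimal sketch on synthetic data:

  ```python
  from sklearn.datasets import make_classification
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import roc_curve, roc_auc_score
  from sklearn.model_selection import train_test_split

  X, y = make_classification(n_samples=1000, random_state=0)
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

  scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
  fpr, tpr, thresholds = roc_curve(y_te, scores)  # one point per threshold
  print("AUC:", roc_auc_score(y_te, scores))
  ```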
- Algorithm Visualizer (https://algorithm-visualizer.org/)
- Sort Visualizer (https://adwait-algorithm-visualizer.netlify.app/)
- VisuAlgo (https://visualgo.net/en)
- Doodles-as-A-Service Repo (https://github.com/girliemac/a-picture-is-worth-a-1000-words)
- AI art generator (https://app.wombo.art/)
- Seeing Theory - Visual introduction to probability and statistics (https://seeing-theory.brown.edu/index.html#firstPage)
- Answer Chat AI (https://www.answerchatai.com/)
- AI Explorables (https://pair.withgoogle.com/explorables/)
- Do Machine Learning Models Memorize or Generalize? (https://pair.withgoogle.com/explorables/grokking/)
- Tensor Puzzles (https://github.com/srush/Tensor-Puzzles)
- GPU Puzzles (https://github.com/srush/GPU-Puzzles)
- Insanely Useful Websites (https://insanelyusefulwebsites.com)
- The Markov-chain Monte Carlo Interactive Gallery (https://chi-feng.github.io/mcmc-demo/app.html) (https://chi-feng.github.io/mcmc-demo/) (https://github.com/chi-feng/mcmc-demo)
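  The galleries above animate samplers much like this bare-bones random-walk Metropolis sketch, which targets an unnormalized two-mode density (step size and target are illustrative):

  ```python
  import numpy as np

  def log_target(x):
      # Unnormalized log-density: equal mixture of N(-2, 1) and N(+2, 1).
      return np.logaddexp(-0.5 * (x - 2) ** 2, -0.5 * (x + 2) ** 2)

  rng = np.random.default_rng(0)
  x, samples = 0.0, []
  for _ in range(10_000):
      proposal = x + rng.normal(scale=1.0)   # symmetric random-walk proposal
      if np.log(rng.random()) < log_target(proposal) - log_target(x):
          x = proposal                       # accept; otherwise keep current x
      samples.append(x)

  print(np.mean(samples), np.std(samples))   # mean near 0, std near sqrt(5)
  ```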
- Interactive Gaussian process regression demo (https://chi-feng.github.io/gp-demo/) (https://github.com/chi-feng/gp-demo)
- Visualizing {dplyr}โs mutate(), summarize(), group_by(), and ungroup() with animations (https://www.andrewheiss.com/blog/2024/04/04/group_by-summarize-ungroup-animations/)
- Tidy Animated Verbs - inner_join(), left_join(), right_join(), full_join(), semi_join(), anti_join(), union(), union_all(), intersect(), setdiff(), pivot_wider(), pivot_longer(), spread(), gather() (https://www.garrickadenbuie.com/project/tidyexplain/)
- PyTorch Image Models (https://github.com/rwightman/pytorch-image-models)
- EfficientNet PyTorch (https://github.com/lukemelas/EfficientNet-PyTorch)
- blurr - A library that integrates huggingface transformers with version 2 of the fastai framework (https://ohmeow.github.io/blurr/)
- MIT 6.S191: Introduction to Deep Learning - Labs (https://github.com/aamini/introtodeeplearning/)
- GitHub Profile README Generator (https://github.com/rahuldkjain/github-profile-readme-generator)
- Applied ML (https://github.com/eugeneyan/applied-ml)
- A detailed example of how to generate your data in parallel with PyTorch (https://github.com/shervinea/pytorch-data-generator)
- Keras vs. PyTorch: Alien vs. Predator recognition with transfer learning (https://github.com/deepsense-ai/Keras-PyTorch-AvP-transfer-learning)
- Deep Learning with Catalyst (https://github.com/catalyst-team/dl-course)
- sgugger - Deep Learning Notebooks (https://github.com/sgugger/Deep-Learning)
- Yang Zhang - fast.ai machine learning course notes (https://gist.github.com/yang-zhang/7ce6e6e7174c35ae26b7ce0dba1999d2)
- ML Course Notes (https://github.com/dair-ai/ML-Course-Notes)
- Tez: a simple pytorch trainer (https://github.com/abhi1thakur/tez)
- Kaggle-Ensemble-Guide (https://github.com/MLWave/Kaggle-Ensemble-Guide)
- Collection of useful data science topics along with code and articles (https://github.com/khuyentran1401/Data-science)
- Data Science Articles (https://github.com/parulnith/Data-Science-Articles/blob/main/README.md)
- James Le (https://github.com/khanhnamle1994)
- Donne Martin (https://github.com/donnemartin)
- Fraud-Detection-Handbook (https://github.com/Fraud-Detection-Handbook)
- VADER-Sentiment-Analysis (https://github.com/cjhutto/vaderSentiment)
- ECCO - Interfaces for Explaining Transformer (https://github.com/jalammar/ecco)
- MLBoy (https://github.com/john-rocky)
- Criteo (https://github.com/criteo)
- NVIDIA Deep Learning Examples for Tensor Cores (https://github.com/NVIDIA/DeepLearningExamples)
- Pinecone (https://github.com/pinecone-io/examples)
- Surprise - a Python scikit for recommender systems that deal with explicit rating data (https://github.com/NicolasHug/Surprise)
- PQk-means is a Python library for efficient clustering of large-scale data (https://github.com/DwangoMediaVillage/pqkmeans)
- GraphEmbedding (https://github.com/shenweichen/GraphEmbedding)
- Dataquest - Project Walkthroughs (https://github.com/dataquestio/project-walkthroughs)
- Retinal Vessel Segmentation with data augmentation and Keras (https://github.com/onurboyar/Retinal-Vessel-Segmentation)
- Visual Search with MXNet Gluon and HNSW (https://github.com/ThomasDelteil/VisualSearch_MXNet)
- DAIR.AI - Democratizing Artificial Intelligence Research, Education, and Technologies (https://github.com/dair-ai)
- ML Visuals (https://github.com/dair-ai/ml-visuals)
- ML Notebooks - examples for all sorts of machine learning tasks and applications (https://github.com/dair-ai/ML-Notebooks)
- ML YouTube Courses (https://github.com/dair-ai/ML-YouTube-Courses)
- Transformer Recipe (https://github.com/dair-ai/Transformers-Recipe)
- Graph Neural Networks (GNNs Recipe) (https://github.com/dair-ai/GNNs-Recipe)
- NannyML estimates performance with Confidence-based Performance estimation (CBPE) - Predict Your Model's Performance (Without Waiting for the Control Group) (https://towardsdatascience.com/predict-your-models-performance-without-waiting-for-the-control-group-3f5c9363a7da) (https://github.com/NannyML/nannyml)
- Obsei (https://github.com/obsei/obsei)
- nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPTs (https://github.com/karpathy/nanoGPT)
- Trax - Deep Learning with Clear Code and Speed (https://github.com/google/trax/tree/af3a38917bd1bc69cf5d25ce007e16185f22f050)
- Lightning-Hydra-Template (https://github.com/ashleve/lightning-hydra-template)
- ChanCheeKean - DataScience (https://github.com/ChanCheeKean/DataScience/)
- Tracking Progress in Natural Language Processing (https://github.com/sebastianruder/NLP-progress/tree/master)
- Kaggle Solutions (https://farid.one/kaggle-solutions/)
- Bex Tuychiev - Kaggle Notebooks (https://www.kaggle.com/bextuychiev/notebooks)
- Kaggle utils (https://github.com/daxiongshu/kaggle_utils)
- Competition and Community Insights from NVIDIA's Kaggle Grandmasters (https://developer.nvidia.com/blog/competition-and-community-insights-from-nvidias-kaggle-grandmasters/)
- Abid (https://deepnote.com/@abid)
- Machine Learning Mastery (https://machinelearningmastery.com/start-here/)
- Analytics Vidhya (https://www.analyticsvidhya.com/blog/)
- Cassie Kozyrkov - Decision Intelligence (https://decision.substack.com/)
- Kaggle Winner's Blog (https://medium.com/kaggle-blog)
- Terence Shin (https://terenceshin.com/)
- Sylvain Gugger (https://sgugger.github.io/)
- Jason Yosinski (https://yosinski.com/)
- Lankinen (Fast.ai Lesson notes) (https://lankinen.medium.com/)
- Zachary Mueller (https://muellerzr.github.io/fastblog/)
- Jay Alammar (http://jalammar.github.io/)
- colah's blog - Christopher Olah (https://colah.github.io/)
- DataMuni (https://www.datamuni.com/)
- Chris McCormick (https://mccormickml.com/)
- Sebastian Ruder (https://ruder.io/)
- Gilbert Tanner (https://ml-explained.com/)
- Andrey Lukyanenko - Paper Review (https://andlukyane.com/blog/)
- Jeremy Jordan (https://www.jeremyjordan.me/data-science/)
- Mario - Kaggle Grandmaster (https://forecastegy.com/)
- Chip Huyen (https://huyenchip.com/blog/)
- Daniel Tunkelang - Search Fundamentals (https://dtunkelang.medium.com/)
- Daniel Lemire - crazily fast code (https://lemire.me/en/)
- NLPlanet (https://www.nlplanet.org/blog/index.html)
- Uncle Cang decoding - Lucene source code series (https://juejin.cn/user/2559318800998141)
- Philschmid's blog on Transformers & SageMaker (by Philipp Schmid) (https://www.philschmid.de/)
- The AiEdge Newsletter (by Damien Benveniste) (https://newsletter.theaiedge.io/archive)
- Ahead of AI (by Sebastian Raschka) (https://magazine.sebastianraschka.com/archive) (https://sebastianraschka.com/blog/)
- The Kaitchup โ AI on a Budget - by Benjamin Marie (https://kaitchup.substack.com/archive)
- CheeKean (https://kean-chan.medium.com/)
- Ryan O'Connor (https://www.assemblyai.com/blog/author/ryan/)
- Norsbook's KDP Journey (https://medium.com/norsbooks-kdp-journey)
- Data.gov.sg Blog (https://blog.data.gov.sg/)
- GovTech's Data Science and Artificial Intelligence Division (DSAID) (https://medium.com/dsaid-govtech)
- Flipkart Tech Blog (https://tech.flipkart.com/)
- RAPIDS AI (https://medium.com/rapids-ai) (https://github.com/rapidsai-community/notebooks-contrib/tree/main/getting_started_materials)
- Pinterest Engineering (https://medium.com/pinterest-engineering)
- Netflix Tech Blog (https://netflixtechblog.com/)
- Uber Engineering (https://eng.uber.com/)
- Meta AI (https://ai.facebook.com/blog)
- Twitter Engineering (https://blog.twitter.com/engineering/en_us)
- DoorDash Engineering Blog (https://doordash.engineering/blog/)
- The Airbnb Tech Blog (https://medium.com/airbnb-engineering)
- Zalando Engineering Blog (https://engineering.zalando.com/)
- Linkedin Engineering Blog (https://engineering.linkedin.com/blog)
- Microsoft Experimentation Platform (https://www.microsoft.com/en-us/research/group/experimentation-platform-exp/)
- Etsy (https://codeascraft.com/)
- The Unofficial Google Data Science Blog (https://www.unofficialgoogledatascience.com/)
- Stitch Fix (https://multithreaded.stitchfix.com/blog/)
- Lyft (https://eng.lyft.com/tagged/data-science)
- Booking.com (https://booking.ai/)
- Yelp Engineering Blog (https://engineeringblog.yelp.com/)
- Spotify (https://engineering.atspotify.com/category/data-science/)
- EXP (https://exp-platform.com/encyclopediamldm/)
- Capital One (https://www.capitalone.com/tech/machine-learning/)
- Google AI Blog (https://ai.googleblog.com/)
- NVIDIA Merlin (https://medium.com/nvidia-merlin)
- Criteo Tech Blog (https://medium.com/criteo-engineering) (https://labs.criteo.com/engineering-blog/)
- Grab (https://engineering.grab.com/categories/data-science/)
- Elucidata (https://medium.com/elucidata/tagged/technology)
- Zilliz (https://zilliz.com/learn)
- Neptune (https://neptune.ai/blog)
- Maths is Fun (https://www.mathsisfun.com/)
- Matrix Multiplication (http://matrixmultiplication.xyz/)
- The Matrix Calculus You Need For Deep Learning (https://explained.ai/matrix-calculus/)
- What's behind matrix multiplication? (https://www.tivadardanka.com/blog/behind-matrix-multiplication)
- Descriptive Matrix Operations with Einops - example with multi-query attention (https://www.kolaayonrinde.com/blog/2024/01/08/einops.html)
- Geometric Mean (https://www.mathsisfun.com/numbers/geometric-mean.html)
- Dot Product (https://www.mathsisfun.com/algebra/vectors-dot-product.html)
- Two things that confused me about cross-entropy (https://chris-said.io/2020/12/26/two-things-that-confused-me-about-cross-entropy/)
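As a numeric companion to the cross-entropy article above, a minimal NumPy sketch of H(p, q) = -Σ p(x) log q(x) for a one-hot target:

```python
# Cross-entropy H(p, q) = -sum(p * log(q)); toy values for illustration
import numpy as np

q = np.array([0.7, 0.2, 0.1])  # predicted class probabilities
p = np.array([1.0, 0.0, 0.0])  # one-hot true distribution (class 0)

ce = -np.sum(p * np.log(q))    # reduces to -log(q[true_class])
print(ce)                      # ~0.357, i.e. -log(0.7)
```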
- Mathematics for Machine Learning (https://github.com/dair-ai/Mathematics-for-ML)
- Seeing Theory - Visual introduction to probability and statistics (https://seeing-theory.brown.edu/index.html#firstPage)
- Calculus and Differentiation Primer (https://sebastianraschka.com/pdf/books/dlb/appendix_d_calculus.pdf)
- Understanding Automatic Differentiation in 30 lines of Python (https://vmartin.fr/understanding-automatic-differentiation-in-30-lines-of-python.html)
- What Are Floating-point Numbers? (https://www.baseclass.io/newsletter/floating-point-numbers)
- Monte Carlo Simulation - a practical guide (https://towardsdatascience.com/monte-carlo-simulation-a-practical-guide-85da45597f0e)
- Gradient Descent Algorithm - a deep dive (https://towardsdatascience.com/gradient-descent-algorithm-a-deep-dive-cf04e8115f21)
- New Breakthrough Brings Matrix Multiplication Closer to Ideal (https://www.quantamagazine.org/new-breakthrough-brings-matrix-multiplication-closer-to-ideal-20240307/)
- 24 Free Datasets for Building an Irresistible Portfolio (2022) (https://www.dataquest.io/blog/free-datasets-for-projects/)
- Best Public Datasets for Machine Learning and Data Science (https://pub.towardsai.net/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f)
- Amazon Review Data (2018) (https://nijianmo.github.io/amazon/index.html)
- Recommender Systems and Personalization Datasets (https://cseweb.ucsd.edu/~jmcauley/datasets.html)
- Amazon product data (outdated?) (https://jmcauley.ucsd.edu/data/amazon/index.html)
- Amazon Customer Reviews Dataset (https://s3.amazonaws.com/amazon-reviews-pds/readme.html)
- Papers With Code (https://paperswithcode.com/datasets)
- Google Research - tools and datasets (https://research.google/tools/)
- Criteo - Terabyte Click Logs (https://labs.criteo.com/2013/12/download-terabyte-click-logs/)
- Components (https://components.one/datasets)
- Texmex (http://www.irisa.fr/texmex/ressources/index_en.php)
- The MOVI Image Base (http://www.irisa.fr/texmex/ressources/bases/base_images_movi/index.html)
- INRIA Holidays dataset for evaluation of image search & Copydays dataset for evaluation of copy detection (http://lear.inrialpes.fr/people/jegou/data.php#holidays)
- Datasets for approximate nearest neighbor search (http://corpus-texmex.irisa.fr/)
- Semantic Text Similarity Dataset Hub (https://github.com/brmson/dataset-sts)
- GroupLens (https://grouplens.org/datasets/movielens/)
- Phishing Detection (Designing a New Net for Phishing Detection with NVIDIA Morpheus) (https://developer.nvidia.com/blog/designing-a-new-net-for-phishing-detection-with-nvidia-morpheus/)
- SPAM_ASSASSIN dataset (https://spamassassin.apache.org/old/publiccorpus/)
- Enron Emails dataset (https://www.cs.cmu.edu/~enron/)
- Clair dataset (https://www.kaggle.com/datasets/rtatman/fraudulent-email-corpus)
- CSE-CIC-IDS2018 on AWS - Dataset for Network Intrusion Detection (https://www.unb.ca/cic/datasets/ids-2018.html)
- Microsoft News Dataset (MIND) (https://docs.microsoft.com/en-us/azure/open-datasets/dataset-microsoft-news?tabs=azureml-opendatasets) (https://www.kaggle.com/datasets/arashnic/mind-news-dataset)
- 190k+ Medium Articles (https://www.kaggle.com/datasets/fabiochiusano/medium-articles)
- Unsplash (https://unsplash.com/data)
- Randomizing Very Large Datasets (https://towardsdatascience.com/randomizing-very-large-datasets-e2b14e507725)
- Singapore open datasets - data.gov.sg (https://data.gov.sg/datasets)
- EarthData - NASA Earth Observation Data (https://www.earthdata.nasa.gov/)
- MOSTLY AI, the #1 Synthetic Data Platform (https://mostly.ai/)
- No data? No problem! Generating synthetic training data at scale for NLP tasks using T0PP (https://medium.com/criteo-engineering/no-data-no-problem-generating-synthetic-training-data-at-scale-for-nlp-tasks-using-t0pp-198581643c5b)
- Generating Synthetic Data with Transformers (https://developer.nvidia.com/blog/generating-synthetic-data-with-transformers-a-solution-for-enterprise-data-challenges/)
- NVIDIA NeMo - Megatron Synthetic Tabular Data Generation notebook (https://github.com/NVIDIA/NeMo/blob/r1.8.0/tutorials/nlp/Megatron_Synthetic_Tabular_Data_Generation.ipynb)
- Faiss SyntheticDataset (https://github.com/facebookresearch/faiss/blob/main/contrib/datasets.py#L72) (https://gist.github.com/mdouze/551ef6fa0722f2acf58fa2c6fce732d6#file-demo_pytorch_knn-ipynb)
- YData Synthetic - generate synthetic tabular and time-series data (https://github.com/ydataai/ydata-synthetic)
- Generate a synthetic domain-specific Q&A dataset in <30 minutes (https://decodingml.substack.com/p/problems-deploying-your-ml-models)
- How to Generate and Use Synthetic Data for Finetuning (https://eugeneyan.com/writing/synthetic/)
- Future Tools (https://www.futuretools.io/)
- Scanner App for Math and Science (https://mathpix.com/)
- Readability (https://pypi.org/project/readability/)
- ipdb.set_trace() Commands (https://xxx-cook-book.gitbooks.io/python-cook-book/content/Debug/ipdb.html)
- Writing Math Equations in Jupyter Notebook: A Naive Introduction (https://medium.com/analytics-vidhya/writing-math-equations-in-jupyter-notebook-a-naive-introduction-a5ce87b9a214)
- Carbon - Create and share beautiful images of your source code (https://carbon.now.sh/)
- Manim Community - How to Create Mathematical Animations (https://docs.manim.community/en/stable/tutorials.html)
- diagrams.net (https://www.diagrams.net/)
- PlotNeuralNet: Use LaTex for making neural networks diagrams (https://github.com/HarisIqbal88/PlotNeuralNet)
- Keyword Tool (https://keywordtool.io/)
- Octoparse web scraping (https://www.octoparse.com/blog/10-myths-about-web-scraping)
- Capitalize My Title (https://capitalizemytitle.com/)
- Free IEEE Citation Generator (https://www.citethisforme.com/citation-generator/ieee)
- Shorten URLs (https://bitly.com/)
- Profile Pic Maker (https://pfpmaker.com/)
- Pictory - Online Video Creator (https://pictory.ai/)
- CapCut - Free Video Editor (https://www.capcut.com/)
- Video Creation (https://invideo.io/)
- Free Video Recording & Live Streaming (https://obsproject.com/)
- Boost your YouTube views (https://vidiq.com/)
- Human X AI Generative Music (https://mubert.com/)
- AI Speech Software (https://beta.elevenlabs.io/)
- Digital People Text-to-Video (https://www.d-id.com/)
- BlobCity AI Seed Projects (https://github.com/blobcity/ai-seed)
- Meta AI Frameworks, Tools, Libraries, Models (https://ai.facebook.com/tools)
- Resume Worded (https://resumeworded.com/resume-bullet-points)
- Mockaroo - Random Data Generator (https://mockaroo.com/)
- Flourish - Beautiful and easy data visualization and storytelling (https://flourish.studio/)
- Flameshot - screenshot software (https://github.com/flameshot-org/flameshot)
- Facets - visualizations for understanding and analyzing machine learning datasets (https://github.com/PAIR-code/facets)
- Google - People + AI Research (PAIR) (https://research.google/teams/brain/pair/)
- Google Research resources (https://research.google/tools/)
- Frederik Brasz - Voronoi generator (https://cfbrasz.github.io/programs.html)
- handcalcs - Python calculations in Jupyter (https://github.com/connorferster/handcalcs)
- Towhee - open-source ML pipeline to encode unstructured data into embeddings (https://towhee.io/)
- Gradio - Build & Share Delightful Machine Learning Apps (https://gradio.app/)
- 🤗 Gradio example (https://huggingface.co/spaces/gradio/xgboost-income-prediction-with-explainability)
- 🤗 Gradio example (https://huggingface.co/spaces/gradio/chatbot_dialogpt/blob/main/run.py)
- 🤗 Gradio example (https://huggingface.co/spaces/gradio/chatbot/blob/main/app.py)
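The Gradio examples above all follow the same basic pattern: wrap a Python function in an `Interface` and launch it. A minimal sketch, where the `greet` function is a made-up placeholder for a real model:

```python
# Minimal Gradio app: turn a Python function into a shareable web UI
import gradio as gr

def greet(name):
    # placeholder standing in for a real model's predict function
    return f"Hello, {name}!"

demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()  # launch(share=True) also gives a temporary public link
```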
- CurlWget (https://www.analyticsvidhya.com/blog/2021/08/load-dataset-directly-into-colab-from-anywhere-on-the-browser-using-curlwget-extension/)
- GLTR - detect automatically generated text (http://gltr.io/)
- Keenious - research explorer (https://keenious.com/) (https://medium.com/keenious/knowledge-graph-search-of-60-million-vectors-with-weaviate-7964657ec911)
- AutoRegex - Effortless conversions between English and RegEx (https://www.autoregex.xyz/)
- Stable Diffusion (https://huggingface.co/spaces/stabilityai/stable-diffusion)
- Quarto - open-source scientific and technical publishing system (https://quarto.org/)
- JSON Crack - Visualize JSON with graphs (https://jsoncrack.com/)
- markmap (markdown + mindmap) (https://markmap.js.org/repl)
- Newspaper3k: Article scraping & curation (https://github.com/codelucas/newspaper)
- Trafilatura - Python package and command-line tool to gather text on the Web (https://trafilatura.readthedocs.io/en/latest/)
- GoogleNews (https://github.com/Iceloof/GoogleNews)
- Interactive network visualizations (https://pyvis.readthedocs.io/en/latest/index.html)
- markdownify - Convert HTML to markdown (https://pypi.org/project/markdownify/)
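For reference, markdownify's basic usage is a single call (snippet adapted from its PyPI examples):

```python
# Convert an HTML fragment to Markdown with markdownify
from markdownify import markdownify as md

html = '<b>Yay</b> <a href="http://github.com">GitHub</a>'
print(md(html))  # -> **Yay** [GitHub](http://github.com)
```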
- ResumeWizard (https://resume-wizard.vercel.app/)
- 12 AI Copywriting Tools to Improve Efficiency (https://ahrefs.com/blog/ai-copywriting/)
- Icons, backgrounds, templates, graphics, etc for presentations:
- News API - Search worldwide news with code (https://newsapi.org/)
- MkDocs: static site generator that's geared towards building project documentation (https://www.mkdocs.org/) (https://github.com/mkdocs/mkdocs)
- Material for MkDocs (https://www.youtube.com/watch?v=Q-YA_dA8C20) (https://squidfunk.github.io/mkdocs-material/) (https://github.com/squidfunk/mkdocs-material) (https://www.stevemar.net/five-things-about-mkdocs/) (https://docs.markhh.com/pages/tools/mkdocs_demo/) (https://www.starfallprojects.co.uk/blog/mkdocs-material-blog-cover-image/) (https://www.codeinsideout.com/blog/site-setup/create-site-project/) (https://blog.ktz.me/making-mkdocs-tables-look-like-github-markdown-tables/)
- data load tool - dlt (https://dlthub.com/docs/intro)
- cleanlab automatically detects data and label issues in your ML datasets (https://docs.cleanlab.ai/stable/index.html)
- Public APIs (https://github.com/public-apis/public-apis)
- How to configure VS Code for AI, ML and MLOps development in Python 🛠️ (https://mlops.community/how-to-configure-vs-code-for-ai-ml-and-mlops-development-in-python-%F0%9F%9B%A0%ef%b8%8f%ef%b8%8f/)
- Great Tables - Absolutely Delightful Table-making in Python (https://posit-dev.github.io/great-tables/articles/intro.html)
- The Design Philosophy of Great Tables (https://posit-dev.github.io/great-tables/blog/design-philosophy/)
- Chunk visualizer (https://huggingface.co/spaces/m-ric/chunk_visualizer)
- Bytewax - Stream processing purely in Python (https://bytewax.io/)
- Unstructured - Preprocess and structure unstructured text documents (such as PDFs, XML and HTML) for use in downstream machine learning tasks (https://unstructured-io.github.io/unstructured/core/cleaning.html)
- Upstash Serverless Kafka & Vector Database (https://upstash.com/)
- Meet the NiceGUI: Your Soon-to-be Favorite Python UI Library (https://towardsdatascience.com/meet-the-nicegui-your-soon-to-be-favorite-python-ui-library-fb69f14bb0ac) (https://nicegui.io/documentation)
- imodels - Interpretable ML package (https://github.com/csinva/imodels)
- Video generation (https://klingai.com/)
- Emergent Mind - the next generation of AI assistants for learning and research (https://www.emergentmind.com)
- Semantic Scholar - free, AI-powered research tool for scientific literature (https://www.semanticscholar.org)
- Perplexity AI (https://www.perplexity.ai/)
- Overleaf online LaTeX editor (https://www.overleaf.com/)
- Instaloader - download pictures (or videos) along with their captions and other metadata from Instagram (https://instaloader.github.io/index.html)
- High Emotion Words (https://thepersuasionrevolution.com/380-high-emotion-persuasive-words/)
- Interview Query - Questions and Blogs (https://www.interviewquery.com/p/data-science-interview-questions) (https://www.interviewquery.com/articles) (https://www.interviewquery.com/blog)
- Machine Learning FAQ (https://sebastianraschka.com/faq/)
- Mastering the Deep Learning Interview: Top 35 Questions and Expert Answers (https://medium.com/@riteshgupta.ai/mastering-the-deep-learning-interview-top-35-questions-and-expert-answers-aabb701f6e45)
- Machine Learning Interviews Book (https://huyenchip.com/ml-interviews-book/)
- Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI (https://github.com/BoltzmannEntropy/interviews.ai)
- Machine Learning Interviews (https://github.com/khangich/machine-learning-interview/)
- NLP Interview Question and Answers in 2022 (https://www.mygreatlearning.com/blog/nlp-interview-questions/)
- Machine Learning Interview Question and Answers in 2022 (https://www.mygreatlearning.com/blog/machine-learning-interview-questions/)
- Machine learning algorithms interview - tips & resources (https://workera.ai/resources/machine-learning-algorithms-interview/)
- Crack Data Science Interviews: Essential Machine Learning Concepts (https://towardsdatascience.com/crack-data-science-interviews-essential-machine-learning-concepts-afd6a0a6d1aa)
- Top 20 AB Test Interview Questions And Answers (https://grabngoinfo.com/top-20-ab-test-interview-questions-and-answers/)
- AB Testing 101 (https://medium.com/jonathans-musings/ab-testing-101-5576de6466b)
- How to use Causal Inference when A/B testing is not available (https://towardsdatascience.com/how-to-use-causal-inference-when-a-b-testing-is-not-possible-c87c1252724a)
- Glassdoor machine learning interview questions (https://www.glassdoor.sg/Interview/machine-learning-interview-questions-SRCH_KO0,16.htm?countryRedirect=true)
- subreddit - ML question (https://www.reddit.com/r/MLQuestions/)
- subreddit - Learning machine learning (https://www.reddit.com/r/learnmachinelearning/)
- Ten Advanced SQL Concepts You Should Know for Data Science Interviews (https://towardsdatascience.com/ten-advanced-sql-concepts-you-should-know-for-data-science-interviews-4d7015ec74b0)
- SQL Group By and Partition By Scenarios: When and How to Combine Data in Data Science (https://www.kdnuggets.com/sql-group-by-and-partition-by-scenarios-when-and-how-to-combine-data-in-data-science)
- Creating a Data Science Portfolio (https://towardsdatascience.com/creating-a-data-science-portfolio-bd485382f49)
- Data Science Portfolio (https://github.com/MaartenGr/projects)
- Github digital cv example (https://github.com/March-08/digital-cv)
- How to Build a Data Science Portfolio Website using Python (https://towardsdatascience.com/how-to-build-a-data-science-portfolio-website-using-python-79531426fde5)
- How to create a Medium-like personal blog for free in a day (https://medium.com/geekculture/how-to-create-a-medium-like-personal-blog-for-free-in-a-day-55ebd9551d9c)
- How to Swiftly Launch a Free Website With GitHub Pages (https://www.stephenvinouze.com/how-to-swiftly-launch-a-free-website-with-github-pages/)
- TYJ (https://tengyeejing.com/)
- The Portfolio that Got Me a Data Scientist Job (https://towardsdatascience.com/the-portfolio-that-got-me-a-data-scientist-job-513cc821bfe4)
- Set Up Your Portfolio Website in Less Than 10 Minutes with Github Pages (https://medium.com/@evanca/set-up-your-portfolio-website-in-less-than-10-minutes-with-github-pages-d0efa8ff56fd) (https://github.com/evanca/quick-portfolio)
- Use Python and NLP to Boost Your Resume (https://medium.com/data-marketing-philosophy/use-python-and-nlp-to-boost-your-resume-e4691a58bcc9)
- Facebook Field Guide to Machine Learning (https://research.facebook.com/blog/2018/05/the-facebook-field-guide-to-machine-learning-video-series/)
- A Guide to Production Level Deep Learning (https://github.com/alirezadir/Production-Level-Deep-Learning)
- Rules of ML (https://developers.google.com/machine-learning/guides/rules-of-ml)
- Impactful and widely cited papers and literature on ML/DL/RL/AI (https://github.com/tirthajyoti/Papers-Literature-ML-DL-RL-AI)
- Definitive Interview prep ROADMAP (https://www.codinginterview.com/interview-roadmap)
- interviewing.io - Watch technical mock interviews (https://interviewing.io/recordings)
- Interview Cake (https://www.interviewcake.com/)
- Interview Questions (https://www.tryexponent.com/questions)
- Free Interview Practice (https://www.pramp.com/)
- Big-O Cheat Sheet (https://www.bigocheatsheet.com/)
- Leetcode list by topics (more comprehensive) (https://protegejj.gitbook.io/oj-practices/chapter1/dynamic-programming)
- Leetcode Company Tag (https://github.com/xizhengszhang/Leetcode_company_frequency)
- Python_LeetCode_Coding (https://github.com/LeihuaYe/Python_LeetCode_Coding)
- Software Engineering Interview Preparation (GitBook) (https://orrsella.gitbooks.io/soft-eng-interview-prep/content/)
- 4 questions to ask in interviews to assess codebase health (https://www.educative.io/blog/questions-assess-codebase-health-interviews)
- How to Tackle The 7 Most Common Types of Interview Questions (https://www.youtube.com/watch?v=vFdkhsN1PJo&t=137s)
- Instamentor's Knowledge Center (https://instamentor.com/articles/)
- Experimentation is a major focus of Data Science across Netflix (https://netflixtechblog.com/experimentation-is-a-major-focus-of-data-science-across-netflix-f67923f8e985)
- Fighting Overfitting With L1 or L2 Regularization: Which One Is Better? (https://neptune.ai/blog/fighting-overfitting-with-l1-or-l2-regularization)
- Machine Learning Projects You NEVER Knew Existed (https://www.youtube.com/watch?v=sw3o0rAazMg&list=RDCMUCHXa4OpASJEwrHrLeIzw7Yg&index=6)
- Resumaker.AI - resume templates (https://resumaker.ai/app/build/templates/)
- Scarlet Ink by Dave Anderson - Interview Advice (https://www.scarletink.com/tag/interviewing/)
- How can I recover a job offer I rejected, or a job I quit? (https://www.scarletink.com/questions-answers-reddit-cscareerquestions-experiment/)
- Writing and Speaking Clearly and Concisely (https://www.scarletink.com/writing-speaking-clearly-concisely/)
- The Top 3 Resume Mistakes Costing You the Job (https://blog.bytebytego.com/p/the-top-3-resume-mistakes-costing)
- The Amazon 'secret' of controllable inputs for your career (https://levelupwithethanevans.substack.com/p/the-amazon-secret-of-controllable)
- ML Systems Design Interview Guide (http://patrickhalina.com/posts/ml-systems-design-interview-guide/)
- Conquering the 2024 Job Market: My Journey to Multiple DS/MLE Offers I - Job Search Summary and Strategy (https://bertmclee.medium.com/conquering-the-2024-job-market-my-journey-to-multiple-ds-mle-offers-i-job-search-summary-and-92bd41cdd7c8)
- levels.fyi (https://www.levels.fyi/Salaries/Data-Scientist/Singapore/)
- Ten Rules for Negotiating a Job Offer (https://haseebq.com/my-ten-rules-for-negotiating-a-job-offer/)
- HOW TO NEGOTIATE SALARY: 9 TIPS FROM A PRO SALARY NEGOTIATOR (https://fearlesssalarynegotiation.com/salary-negotiation-guide/)
- HOW TO WRITE A SALARY NEGOTIATION EMAIL (WITH 11 PROVEN TEMPLATES AND A SAMPLE) (https://fearlesssalarynegotiation.com/salary-negotiation-email-sample/#ask-for-time-template)
- Salary Negotiation: Make More Money, Be More Valued (https://www.kalzumeus.com/2012/01/23/salary-negotiation/)
- ByteByteGo - System Design 101 (https://blog.bytebytego.com/) (https://github.com/ByteByteGoHq/system-design-101)
- System Design Interview (https://systeminterview.com/scale-from-zero-to-millions-of-users.php)
- SYSTEM DESIGN INTERVIEW PREPARATION SERIES (https://www.codekarle.com/)
- The complete guide to system design in 2022 (https://www.educative.io/blog/complete-guide-to-system-design#filestorage)
- Systems Design Crash Course for ML Engineers (https://towardsdatascience.com/systems-design-crash-course-for-ml-engineers-aafae1cf1890)
- The System Design Primer (https://github.com/donnemartin/system-design-primer)
- awesome-scalability (https://github.com/binhnguyennus/awesome-scalability)
- System Architecture (https://orrsella.gitbooks.io/soft-eng-interview-prep/content/topics/system-architecture.html)
- Dynamo: Amazon's Highly Available Key-value Store (https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)
- System Design Series by Sanil Khurana (https://medium.com/@sanilkhurana7)
- System Design Series: The Ultimate Guide for Building High-Performance Data Streaming Systems from Scratch! (https://towardsdatascience.com/system-design-series-0-to-100-guide-to-data-streaming-systems-3dd584bd28fa)
- ML Systems Design Interview Guide (http://patrickhalina.com/posts/ml-systems-design-interview-guide/)
- Dynamic Programming Patterns (https://leetcode.com/discuss/general-discussion/458695/dynamic-programming-patterns)
- Binary Search Template (https://leetcode.com/discuss/general-discussion/786126/Python-Powerful-Ultimate-Binary-Search-Template.-Solved-many-problems)
- Binary Search (https://www.youtube.com/watch?v=tgVSkMA8joQ)
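The binary-search template linked above reduces to a lower-bound search over a monotonic condition; a minimal sketch:

```python
# Generic lower-bound binary search: smallest index where condition(i) is True.
# Assumes condition is monotonic over [lo, hi): False ... False True ... True.
def binary_search(lo, hi, condition):
    while lo < hi:
        mid = (lo + hi) // 2
        if condition(mid):
            hi = mid      # mid could be the answer; keep it in the range
        else:
            lo = mid + 1  # mid is too small; discard it
    return lo

# Example: first index in a sorted list with value >= target
nums = [1, 3, 3, 5, 8]
print(binary_search(0, len(nums), lambda i: nums[i] >= 3))  # -> 1
```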
- Partition subset problem - all approaches explained (https://leetcode.com/problems/partition-equal-subset-sum/solutions/462699/Whiteboard-Editorial.-All-Approaches-explained)
- Neet Code (https://www.youtube.com/c/NeetCode/playlists)
- MIT 6.006 Introduction to Algorithms, Fall 2011 (https://www.youtube.com/playlist?list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb)
- Vivekanand Khyade - Algorithm Every Day (https://www.youtube.com/user/vivekanandkhyade/playlists)
- Gaurav Sen (https://www.youtube.com/channel/UCRPMAqdtSgd0Ipeef7iFsKw)
- Coding Interview Solutions (https://www.youtube.com/playlist?list=PLot-Xpze53leF0FeHz2X0aG3zd0mr1AW_)
- Kevin Naughton Jr. (https://www.youtube.com/c/KevinNaughtonJr/playlists)
- Sai Anish Malla (https://www.youtube.com/channel/UCFBorf0jHu-1WNHGJZDNtew/playlists)
- Back To Back SWE (https://www.youtube.com/c/BackToBackSWE/playlists)
- Tech Dummies Narendra L (https://www.youtube.com/c/TechDummiesNarendraL/videos)
- Data Interview Pro - Emma (https://www.youtube.com/c/DataInterviewPro/playlists)
- Bit Hacks (https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/lecture-slides/MIT6_172F18_lec3.pdf)
- Bit Twiddling Hacks (http://graphics.stanford.edu/~seander/bithacks.html)
- Introduction to Low Level Bit Hacks (https://catonmat.net/low-level-bit-hacks)
- Bitwise operations cheat sheet ()
- Bits, Bytes, Building With Binary (https://medium.com/basecs/bits-bytes-building-with-binary-13cb4289aafa)
- The Binary Cheatsheet (https://timseverien.github.io/binary-cheatsheet/)
- Binary cheatsheet for coding interviews (https://www.techinterviewhandbook.org/algorithms/binary/)
- Bitwise Hacks for Competitive Programming (https://www.geeksforgeeks.org/bitwise-hacks-for-competitive-programming/)
- Bitwise Operators in Python (https://realpython.com/python-bitwise-operators/)
- Signed number representations (https://en.wikipedia.org/wiki/Signed_number_representations)
- Pack and Unpack data (Masking) (https://www.youtube.com/watch?v=MWFwM9-2nlI)
- Bit Packing or How to Love AND, OR or XOR (https://learnmongodbthehardway.com/article/bitflipping/)
- Bit Packing (https://www.cs.waikato.ac.nz/~tcs/COMP317/bitpacking.html)
- https://codeforces.com/blog/entry/50841
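A few of the classic tricks from the bit-hack resources above, sketched in Python:

```python
# Classic bit tricks from the resources above
n = 0b10110100

is_pow2 = n != 0 and (n & (n - 1)) == 0  # power of two <=> exactly one bit set
lowest = n & -n                          # isolate the lowest set bit
cleared = n & (n - 1)                    # clear the lowest set bit
popcount = bin(n).count("1")             # number of set bits

# Bit packing: store two 16-bit values in a single 32-bit integer
a, b = 0x1234, 0xBEEF
packed = (a << 16) | b
assert (packed >> 16) == a and (packed & 0xFFFF) == b
```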
- Foobar Challenge: Google's Secret Hiring Process (https://towardsdatascience.com/how-to-get-hired-by-google-b19806ad3c62)
- Google Has a Secret Hiring Challenge Called Foobar (https://betterprogramming.pub/google-has-a-secret-hiring-challenge-called-foobar-14625bfcea7a) (https://github.com/FBosler/GoogleFoobar)
- Dodge The Lasers - Fantastic Question From Google's hiring challenge (https://towardsdatascience.com/dodge-the-lasers-fantastic-question-from-googles-hiring-challenge-72363d95fec)
- PyTorch 2 Internals (https://blog.christianperone.com/2023/12/pytorch-2-internals-talk/)
- Torch Tensor Operations (https://jhui.github.io/2018/02/09/PyTorch-Basic-operations/)
- Optimize PyTorch Performance for Speed and Memory Efficiency (18 Tips) (https://towardsdatascience.com/optimize-pytorch-performance-for-speed-and-memory-efficiency-2022-84f453916ea6)
- Faster Deep Learning Training with PyTorch - a 2021 Guide (https://efficientdl.com/faster-deep-learning-in-pytorch-a-guide/)
- How to fine tune VERY large model if it doesn't fit on your GPU (https://bestasoff.medium.com/how-to-fine-tune-very-large-model-if-it-doesnt-fit-on-your-gpu-3561e50859af)
- Finetuning Torchvision Models (https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html)
- Transfer Learning with ResNet in PyTorch (https://www.pluralsight.com/guides/introduction-to-resnet)
- Notes in pytorch to deal with ConvNets (https://github.com/mortezamg63/Accessing-and-modifying-different-layers-of-a-pretrained-model-in-pytorch/blob/master/README.md)
- Some important Pytorch tasks - A concise summary from a vision researcher (https://spandan-madan.github.io/A-Collection-of-important-tasks-in-pytorch/)
- PyTorch layer dimensions: what size and why? (https://towardsdatascience.com/pytorch-layer-dimensions-what-sizes-should-they-be-and-why-4265a41e01fd)
- PyTorch :: Understanding Tensors (Part 1) (https://dev.to/tbhaxor/pytorch-understanding-tensors-part-1-od8)
- VGG16 Transfer Learning - Pytorch (https://www.kaggle.com/carloalbertobarbano/vgg16-transfer-learning-pytorch)
- Pytorch to fastai, Bridging the Gap (https://muellerzr.github.io/fastblog/2021/02/14/Pytorchtofastai.html)
- A detailed example of how to generate your data in parallel with PyTorch (https://stanford.edu/~shervine/blog/pytorch-how-to-generate-data-parallel)
- Cyclic Learning Rates and One Cycle Policy (https://github.com/nachiket273/One_Cycle_Policy)
- Cyclical LR and momentums (https://github.com/sgugger/Deep-Learning/blob/master/Cyclical%20LR%20and%20momentums.ipynb)
- Pytorch Loss Functions in Plain Python (https://zhang-yang.medium.com/pytorch-loss-funtions-in-plain-python-b79c05f8b53f)
- Automatic Mixed Precision Training for Deep Learning using PyTorch (https://debuggercafe.com/automatic-mixed-precision-training-for-deep-learning-using-pytorch/)
- A developer-friendly guide to mixed precision training with PyTorch (https://spell.ml/blog/mixed-precision-training-with-pytorch-Xuk7YBEAACAASJam)
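The mixed-precision guides above revolve around a handful of PyTorch calls; a minimal training-step sketch, assuming a CUDA device (the model and data are placeholders):

```python
# Minimal automatic mixed precision (AMP) training step in PyTorch
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 2).cuda()  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()   # scales loss to avoid fp16 underflow

x = torch.randn(8, 10, device="cuda")
y = torch.randint(0, 2, (8,), device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():        # forward pass in mixed precision
    loss = F.cross_entropy(model(x), y)
scaler.scale(loss).backward()          # backward on the scaled loss
scaler.step(optimizer)                 # unscales grads, then steps
scaler.update()
```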
- Making Pytorch Transformer Twice as Fast on Sequence Generation (https://scale.com/blog/pytorch-improvements)
- Transformer Details Not Described in The Paper (https://tunz.kr/post/4)
- A collection of tips for speeding up learning and reasoning with PyTorch (https://qiita.com/sugulu_Ogawa_ISID/items/62f5f7adee083d96a587)
- Implement Early Stopping in PyTorch (https://qiita.com/ku_a_i/items/ba33c9ce3449da23b503)
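PyTorch itself ships no early-stopping callback, so articles like the one above typically roll a small helper; a minimal sketch:

```python
# Minimal early stopping: halt when val loss hasn't improved for `patience` epochs
class EarlyStopping:
    def __init__(self, patience=3, min_delta=0.0):
        self.patience, self.min_delta = patience, min_delta
        self.best, self.counter = float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best, self.counter = val_loss, 0  # improvement: reset counter
        else:
            self.counter += 1                      # no improvement this epoch
        return self.counter >= self.patience       # True -> stop training

stopper = EarlyStopping(patience=3)
for epoch, val_loss in enumerate([0.9, 0.8, 0.81, 0.82, 0.83]):
    if stopper.step(val_loss):
        print(f"stopping at epoch {epoch}")
        break
```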
- BERT Fine-Tuning Tutorial with PyTorch (https://mccormickml.com/2019/07/22/BERT-fine-tuning/)
- Utilizing Transformer Representations Efficiently (https://www.kaggle.com/rhtsingh/utilizing-transformer-representations-efficiently)
- Visualize BERT sequence embeddings: An unseen way (https://towardsdatascience.com/visualize-bert-sequence-embeddings-an-unseen-way-1d6a351e4568)
- PyTorch Tutorial: Paddy Disease Identification (https://www.kaggle.com/code/manabendrarout/pytorch-tutorial-paddy-disease-identification)
- PYTORCH LIGHTNING TUTORIALS (https://pytorch-lightning.readthedocs.io/en/stable/tutorials.html)
- TUTORIAL 3: INITIALIZATION AND OPTIMIZATION (https://pytorch-lightning.readthedocs.io/en/stable/notebooks/course_UvA-DL/03-initialization-and-optimization.html)
- TUTORIAL 5: TRANSFORMERS AND MULTI-HEAD ATTENTION (https://lightning.ai/docs/pytorch/stable/notebooks/course_UvA-DL/05-transformers-and-MH-attention.html)
- University of Amsterdam - UvA Deep Learning Tutorials (https://uvadlc-notebooks.readthedocs.io/en/latest/) (https://www.youtube.com/playlist?list=PLdlPlO1QhMiAkedeu0aJixfkknLRxk1nA)
- Debugging in PyTorch (https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/guide3/Debugging_PyTorch.html)
- TorchMetrics - How do we use it, and what's the difference between .update() and .forward()? (https://sebastianraschka.com/blog/2022/torchmetrics.html) (https://github.com/rasbt/torchmetrics-blog/blob/main/torchmetrics-update-forward.ipynb)
- PyTorch training codes with AverageMeter & ProgressMeter (https://docs.openvino.ai/2023.0/notebooks/302-pytorch-quantization-aware-training-with-output.html)
- Hooks: the one PyTorch trick you must know (https://tivadardanka.com/blog/hooks-the-one-pytorch-trick-you-must-know)
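The hook trick from the post above, in its simplest form: registering a forward hook to capture an intermediate activation (the model and layer choice here are toy examples):

```python
# Capture an intermediate activation with a forward hook
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()  # stash this layer's output
    return hook

handle = model[0].register_forward_hook(save_activation("fc1"))
model(torch.randn(1, 4))         # the forward pass triggers the hook
print(activations["fc1"].shape)  # -> torch.Size([1, 8])
handle.remove()                  # clean up when done
```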
- How to modify a pretrained model (https://discuss.pytorch.org/t/how-to-modify-a-pretrained-model/60509)
- `Module.children()` vs `Module.modules()` (https://discuss.pytorch.org/t/module-children-vs-module-modules/4551)
- What is the distinct usage of the `AdaptiveConcatPool2d` layer? (https://forums.fast.ai/t/what-is-the-distinct-usage-of-the-adaptiveconcatpool2d-layer/7600)
- Splitting a pretrained model in groups of layers (https://forums.fast.ai/t/splitting-a-pretrained-model-in-groups-of-layers/33012/2)
- Is `x.data` still useful in PyTorch? (https://stackoverflow.com/questions/51743214/is-data-still-useful-in-pytorch)
- `model.eval()` vs `with torch.no_grad()` (https://discuss.pytorch.org/t/model-eval-vs-with-torch-no-grad/19615)
- How do we resume training by using the last LR? (https://www.kaggle.com/c/seti-breakthrough-listen/discussion/247574)
- Tensor view vs. permute (https://stackoverflow.com/questions/51143206/difference-between-tensor-permute-and-tensor-view-in-pytorch)
- Torch `stack()` vs `cat()` (https://stackoverflow.com/questions/54307225/whats-the-difference-between-torch-stack-and-torch-cat-functions/54307331)
- Why do transformers use layer norm instead of batch norm? (https://stats.stackexchange.com/questions/474440/why-do-transformers-use-layer-norm-instead-of-batch-norm)
- Deep Learning normalization methods (https://tungmphung.com/deep-learning-normalization-methods/)
- PyTorch vs TensorFlow in 2022 (https://www.assemblyai.com/blog/pytorch-vs-tensorflow-in-2022/)
- What is the difference between `register_buffer` and `register_parameter` of `nn.Module`? (https://discuss.pytorch.org/t/what-is-the-difference-between-register-buffer-and-register-parameter-of-nn-module/32723) (https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.register_buffer)
- `Model.named_parameters()` will lose some layer modules (https://discuss.pytorch.org/t/model-named-parameters-will-lose-some-layer-modules/14588)
- `model.parameters()`, `model.named_parameters()`, `model.state_dict()` (https://blog.csdn.net/qq_36429555/article/details/118609604)
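To make the `register_buffer` vs `register_parameter` distinction above concrete: both end up in `state_dict()` and move with `.to(device)`, but only parameters receive gradients. A small sketch:

```python
# Buffers vs parameters: both persist in state_dict(), only parameters train
import torch
import torch.nn as nn

class Scaler(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_parameter("weight", nn.Parameter(torch.ones(3)))  # trainable
        self.register_buffer("running_mean", torch.zeros(3))            # not trainable

    def forward(self, x):
        return (x - self.running_mean) * self.weight

m = Scaler()
print([name for name, _ in m.named_parameters()])  # ['weight']
print(list(m.state_dict().keys()))                 # ['weight', 'running_mean']
```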
- Jeremy Howard - Kaggle Notebooks (https://www.kaggle.com/jhoward/notebooks)
- fastkaggle (https://fastai.github.io/fastkaggle/)
- A Beginner's Guide to Large Language Models (https://resources.nvidia.com/en-us-large-language-model-ebooks)
- End-To-End Speech AI Pipelines (https://resources.nvidia.com/en-us-speech-ai-ebooks-nurture/part-2)
- AI Canon - a curated list of resources (https://a16z.com/2023/05/25/ai-canon/)
- A Very Gentle Introduction to Large Language Models without the Hype - 38 min read (https://mark-riedl.medium.com/a-very-gentle-introduction-to-large-language-models-without-the-hype-5f67941fa59e)
- The Animated Transformer (https://prvnsmpth.github.io/animated-transformer/)
- Transformer Anatomy guide (https://www.kaggle.com/code/pastorsoto/transformer-anatomy-guide)
- Notebooks for the book: Natural Language Processing with Transformers (https://github.com/nlp-with-transformers/notebooks)
- The Annotated Transformer - with PyTorch code (https://nlp.seas.harvard.edu/annotated-transformer/)
- Transformers: How Do They Transform Your Data? - explanation with codes (https://towardsdatascience.com/transformers-how-do-they-transform-your-data-72d69e383e0d) (https://github.com/maxime7770/Transformers-Insights)
- Attention? Attention! (https://lilianweng.github.io/posts/2018-06-24-attention/)
- The Transformer Family (https://lilianweng.github.io/posts/2020-04-07-the-transformer-family/)
- The Transformer Family Version 2.0 (https://lilianweng.github.io/posts/2023-01-27-the-transformer-family-v2/)
- Beautifully Illustrated: NLP Models from RNN to Transformer (https://towardsdatascience.com/beautifully-illustrated-nlp-models-from-rnn-to-transformer-80d69faf2109)
- Understanding Transformers: A Step-by-Step Math Example โ Part 1 (https://medium.com/@fareedkhandev/understanding-transformers-a-step-by-step-math-example-part-1-a7809015150a)
- Solving Transformer by Hand: A Step-by-Step Math Example (https://levelup.gitconnected.com/understanding-transformers-from-start-to-end-a-step-by-step-math-example-16d4e64e6eb1)
- Building a Million-Parameter LLM from Scratch Using Python - A Step-by-Step Guide to Replicating LLaMA Architecture (https://levelup.gitconnected.com/building-a-million-parameter-llm-from-scratch-using-python-f612398f06c2)
- Understanding and Coding Self-Attention, Multi-Head Attention, Cross-Attention, and Causal-Attention in LLMs (https://magazine.sebastianraschka.com/p/understanding-and-coding-self-attention)
- Concept of self-attention (https://www.linkedin.com/posts/ugcPost-6882314741088321536-PNqG) (https://www.linkedin.com/feed/update/urn:li:activity:6879010048421445633)
- Transformers from Scratch - great explanation on dot products and matrix multiplication (https://e2eml.school/transformers.html)
- TRANSFORMERS FROM SCRATCH (https://peterbloem.nl/blog/transformers)
- Transformers From Scratch (https://blog.matdmiller.com/posts/2023-06-10_transformers/notebook.html)
- How Transformers work in deep learning and NLP: an intuitive introduction (https://theaisummer.com/transformer/)
- Getting Meaning from Text: Self-attention Step-by-step Video (https://pub.towardsai.net/getting-meaning-from-text-self-attention-step-by-step-video-7d8f49694f89) (https://www.youtube.com/watch?v=-9vVhYEXeyQ&t=570s)
- Transformer Architecture: The Positional Encoding (https://kazemnejad.com/blog/transformer_architecture_positional_encoding/)
- Explained: Multi-head Attention (Part 1) (https://storrs.io/attention/)
- Explained: Multi-head Attention (Part 2) (https://storrs.io/multihead-attention/)
- Deep learning explainer: a simple single cell classification model (https://storrs.io/sc-deep-learning-explainer/)
- Accelerating Large Language Models with Accelerated Transformers (https://pytorch.org/blog/accelerating-large-language-models/)
- Step-by-Step Illustrated Explanations of Transformer (https://medium.com/@yulemoon/detailed-explanations-of-transformer-step-by-step-dc32d90b3a98)
- An In-Depth Look at the Transformer Based Models (https://medium.com/@yulemoon/an-in-depth-look-at-the-transformer-based-models-22e5f5d17b6b)
- Anti-hype LLM reading list (https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e)
- GPT in 60 Lines of NumPy (https://jaykmody.com/blog/gpt-from-scratch/)
- x-transformers - A concise but fully-featured transformer, complete with a set of promising experimental features from various papers (https://github.com/lucidrains/x-transformers/tree/main)
- Transformer Taxonomy (https://kipp.ly/transformer-taxonomy/)
- Large Language Models: SBERT - Sentence-BERT (https://towardsdatascience.com/sbert-deb3d4aef8a4)
- How context sizes of 100k tokens and longer are achieved (https://www.linkedin.com/posts/andriyburkov_in-case-you-were-wondering-how-context-sizes-activity-7154922110354604032-Tx1a)
- KV caching (https://medium.com/@plienhar/llm-inference-series-3-kv-caching-unveiled-048152e461c8) (https://medium.com/@plienhar/llm-inference-series-4-kv-caching-a-deeper-look-4ba9a77746c8)
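A bare-bones illustration of the KV-caching idea from the series above: each decoding step computes the key/value for the new token only and appends them to the cache, so past keys/values are reused rather than recomputed (single head, identity projections, toy values):

```python
# Toy single-head attention with a KV cache
import torch

def attend(q, k, v):
    scores = (q @ k.T) / k.shape[-1] ** 0.5   # (1, t): new query vs all cached keys
    return torch.softmax(scores, dim=-1) @ v  # (1, d): weighted sum of values

d = 8
k_cache = torch.empty(0, d)
v_cache = torch.empty(0, d)

for step in range(5):
    x = torch.randn(1, d)                     # hidden state of the newest token
    q = k = v = x                             # identity "projections" for the sketch
    k_cache = torch.cat([k_cache, k], dim=0)  # cache grows by one row per step
    v_cache = torch.cat([v_cache, v], dim=0)
    out = attend(q, k_cache, v_cache)         # past K/V reused, not recomputed
```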
- BertViz - tool for visualizing attention in Transformer model (https://github.com/jessevig/bertviz)
- LLM Visualization (https://bbycroft.net/llm)
- AttentionViz: A Global View of Transformer Attention (https://catherinesyeh.github.io/attn-docs/)
- Inside the Matrix: Visualizing Matrix Multiplication, Attention and Beyond (https://pytorch.org/blog/inside-the-matrix/)
- Explainable AI: Visualizing Attention in Transformers (https://generativeai.pub/explainable-ai-visualizing-attention-in-transformers-4eb931a2c0f8) (https://www.topbots.com/deconstructing-bert-part-1/) (https://www.topbots.com/deconstructing-bert-part-2/) (https://www.topbots.com/openai-gpt-2-visualization/)
- Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention (https://towardsdatascience.com/deconstructing-bert-part-2-visualizing-the-inner-workings-of-attention-60a16d86b5c1)
- Transformers Explained Visually (Part 3): Multi-head Attention, deep dive (https://towardsdatascience.com/transformers-explained-visually-part-3-multi-head-attention-deep-dive-1c1ff1024853)
- A Visual Guide to Vision Transformers (https://blog.mdturp.ch/posts/2024-04-05-visual_guide_to_vision_transformer.html)
- llama3 implemented from scratch (https://github.com/naklecha/llama3-from-scratch/blob/main/README.md)
- The Illustrated AlphaFold (https://elanapearl.github.io/blog/2024/the-illustrated-alphafold/)
- A Visual Guide to Quantization (https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization)
- A Visual Guide to Mamba and State Space Models (https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mamba-and-state)
- Transformer Explainer (https://poloclub.github.io/transformer-explainer/)
- Stable Diffusion Explained Step-by-Step with Visualization (https://medium.com/polo-club-of-data-science/stable-diffusion-explained-for-everyone-77b53f4f1c4) (https://poloclub.github.io/diffusion-explainer/)
- A Visual Guide to Mixture of Experts (MoE) (https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts)
- Numbers every LLM Developer should know (https://github.com/ray-project/llm-numbers#1-mb-gpu-memory-required-for-1-token-of-output-with-a-13b-parameter-model)
- Transformer Math 101 (https://blog.eleuther.ai/transformer-math/)
- LLM Parameter Counting (https://kipp.ly/transformer-param-count/)
- Transformer Inference Arithmetic (https://kipp.ly/transformer-inference-arithmetic/)
- Calculating GPU memory for serving LLMs (https://www.substratus.ai/blog/calculating-gpu-memory-for-llm)
- Can you run it? LLM version (https://huggingface.co/spaces/Vokturz/can-it-run-llm)
- 🤗 Model Memory Calculator (https://huggingface.co/spaces/hf-accelerate/model-memory-usage)
- VRAM Estimator - Estimate GPU VRAM usage of transformer-based models (https://vram.asmirnov.xyz/)
- Model training anatomy (https://huggingface.co/docs/transformers/model_memory_anatomy)
- Estimate the Memory Consumption of LLMs for Inference and Fine-tuning (https://kaitchup.substack.com/p/estimate-the-memory-consumption-of) (https://colab.research.google.com/drive/1J_r9gB849RL4R4PXC8o05KkNeUXhBXoO?usp=sharing)
- LLM Explorer (https://llm.extractum.io/)
- Memory Requirements for LLM Training and Inference (https://medium.com/@manuelescobar-dev/memory-requirements-for-llm-training-and-inference-97e4ab08091b)
- Memory-efficient Model Weight Loading (https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/08_memory_efficient_weight_loading/memory-efficient-state-dict.ipynb)
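The rule of thumb behind most of the calculators above: weight memory is roughly parameter count times bytes per parameter, plus headroom for KV cache, activations, and fragmentation. A sketch (the 1.2x overhead factor is a rough assumption, not from any one source):

```python
# Back-of-the-envelope GPU memory estimate for LLM inference
def inference_memory_gb(params_billion, bits_per_param, overhead=1.2):
    # params (billions) * bytes per param; the 1e9 factors cancel
    weights_gb = params_billion * (bits_per_param / 8)
    return weights_gb * overhead  # headroom for KV cache, activations, etc.

for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit: ~{inference_memory_gb(7, bits):.1f} GB")
# -> ~16.8, ~8.4, ~4.2 GB
```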
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration [Paper] (https://github.com/mit-han-lab/llm-awq)
- AutoAWQ (https://github.com/casper-hansen/AutoAWQ)
- Microsoft DeepSpeed - Deep learning optimization software suite for both training and inference (https://github.com/microsoft/DeepSpeed) (https://www.deepspeed.ai/)
- DeepSpeed Chat - Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales (https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat) (https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat)
- DeepSpeed's Bag of Tricks for Speed & Scale (https://www.kolaayonrinde.com/blog/2023/07/14/deepspeed-train.html)
- 🤗 PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware (https://huggingface.co/blog/peft) (https://github.com/huggingface/peft)
- 🤗 PEFT Documentation (https://huggingface.co/docs/peft/index)
- 🤗 PEFT Examples (https://github.com/huggingface/peft/tree/main/examples)
- 🤗 PEFT Patch Release (https://github.com/huggingface/peft/releases)
- 🤗 TRL - Transformer Reinforcement Learning (https://github.com/huggingface/trl) (https://huggingface.co/docs/trl/index)
- bitsandbytes - 8-bit optimizers, matrix multiplication (LLM.int8()), and quantization functions (https://github.com/TimDettmers/bitsandbytes)
- GPTQ: Accurate Post-training Compression for Generative Pretrained Transformers (https://github.com/IST-DASLab/gptq)
- AutoGPTQ: LLMs quantization package with user-friendly apis, based on GPTQ algorithm (https://github.com/PanQiWei/AutoGPTQ)
- Quantize 🤗 Transformers models (https://huggingface.co/docs/transformers/main_classes/quantization)
- Which Quantization Method is Right for You? (GPTQ vs. GGUF vs. AWQ) (https://maartengrootendorst.substack.com/p/which-quantization-method-is-right)
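Whichever method wins for your use case, the 🤗 Transformers entry point for bitsandbytes-style 4-bit loading looks roughly like this (the checkpoint name is just an example):

```python
# Load a causal LM in 4-bit NF4 via transformers + bitsandbytes
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",                    # example checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
```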
- Optimum-Benchmark 🏋️ (https://github.com/huggingface/optimum-benchmark)
- NEFTune - add random noise to the embedding vectors of the training data during the forward pass of fine-tuning (https://github.com/neelsjain/NEFTune)
- LoRA+: Efficient Low Rank Adaptation of Large Models (https://github.com/nikhil-ghosh-berkeley/loraplus)
- DoRA: Weight-Decomposed Low-Rank Adaptation (https://github.com/catid/dora/tree/main)
- `tensor_parallel` - much faster than Hugging Face's `device_map` and more lightweight than vLLM? (https://github.com/BlackSamorez/tensor_parallel)
- Nanotron - Minimalistic large language model 3D-parallelism training (https://github.com/huggingface/nanotron/)
- FastChat - platform for training, serving, and evaluating large language model based chatbots (https://github.com/lm-sys/FastChat)
- Half-Quadratic Quantization (HQQ) (https://mobiusml.github.io/hqq_blog/) (https://github.com/mobiusml/hqq)
- AQLM - Extreme Compression of Large Language Models via Additive Quantization (https://github.com/Vahe1994/AQLM) (https://towardsdatascience.com/the-aqlm-quantization-algorithm-explained-8cf33e4a783e)
- torchtune - A Native-PyTorch Library for LLM Fine-tuning (https://github.com/pytorch/torchtune)
- torchao: PyTorch Architecture Optimization (https://github.com/pytorch/ao/)
- PyTorch Native Architecture Optimization: torchao (https://pytorch.org/blog/pytorch-native-architecture-optimization/)
- Prefect - Modern workflow orchestration for data and ML engineers (https://www.prefect.io/)
- Modal - serverless platform to run generative AI models, large-scale batch jobs, job queues, etc (https://modal.com/)
- Beating Proprietary Models with a Quick Fine-Tune - Finetuning Quora Embeddings (https://modal.com/blog/fine-tuning-embeddings) (https://github.com/567-labs/fastllm/blob/main/applications/finetune-quora-embeddings/Readme.md)
- Fine-tune an LLM in minutes (ft. Llama 2, CodeLlama, Mistral, etc.) (https://modal.com/docs/examples/llm-finetuning) (https://github.com/modal-labs/llm-finetuning)
- Model Explorer - a powerful graph visualization tool that helps one understand, debug, and optimize ML models (https://ai.google.dev/edge/model-explorer) (https://research.google/blog/model-explorer/)
- Intel AutoRound - weight-only quantization algorithm designed specifically for low-bit LLM inference (https://github.com/intel/auto-round) (https://medium.com/intel-analytics-software/autoround-sota-weight-only-quantization-algorithm-for-llms-across-hardware-platforms-99fe6eac2861)
- 🤗 The Large Language Model Training Handbook (https://github.com/huggingface/llm_training_handbook)
- 🤗 The Large Language Model Training Playbook (https://github.com/huggingface/large_language_model_training_playbook)
- 🤗 Dataset map method - how to pass argument to the function (https://discuss.huggingface.co/t/dataset-map-method-how-to-pass-argument-to-the-function/16274)
- 🤗 Quantization (https://huggingface.co/docs/transformers/quantization)
- 🤗 Generation with LLMs - Common pitfalls, etc (https://huggingface.co/docs/transformers/main/llm_tutorial)
- 🤗 Getting the most out of LLMs - Optimizing LLMs for Speed and Memory (https://huggingface.co/docs/transformers/main/llm_tutorial_optimization)
- 🤗 LLM prompting guide (https://huggingface.co/docs/transformers/main/tasks/prompting)
- 🤗 Templates for Chat Models (https://huggingface.co/docs/transformers/main/chat_templating)
- 🤗 Text generation strategies & Decoding strategies (https://huggingface.co/docs/transformers/main/generation_strategies)
- Efficient Training Techniques (https://huggingface.co/docs/transformers/perf_train_gpu_one)
- Multimodal Toolkit - Transformers with Tabular Data (https://github.com/georgian-io/Multimodal-Toolkit)
- LiGO - Learning to grow machine-learning models - New LiGO technique accelerates training of large machine-learning models (https://news.mit.edu/2023/new-technique-machine-learning-models-0322)
- Finetuning Large Language Models (https://magazine.sebastianraschka.com/p/finetuning-large-language-models)
- Adapters (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v1.10.0/core/adapters/intro.html) (https://colab.research.google.com/github/NVIDIA/NeMo/blob/stable/tutorials/02_NeMo_Adapters.ipynb)
- P-tuning and prompt tuning (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html) (https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Multitask_Prompt_and_PTuning.ipynb)
- Difference between P-tuning and Prefix Tuning (https://www.reddit.com/r/MachineLearning/comments/14pkibg/comment/jqkdam8/)
- RLHF: Reinforcement Learning from Human Feedback (https://huyenchip.com/2023/05/02/rlhf.html)
- Building LLM applications for production (https://huyenchip.com/2023/04/11/llm-engineering.html)
- Instruction tuning datasets to train (text and multi-modal) chat-based LLMs (GPT-4, ChatGPT, LLaMA, Alpaca) (https://github.com/yaodongC/awesome-instruction-dataset)
- A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes (https://huggingface.co/blog/hf-bitsandbytes-integration)
- Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA (https://huggingface.co/blog/4bit-transformers-bitsandbytes)
- Making LLMs lighter with AutoGPTQ and transformers (https://huggingface.co/blog/gptq-integration) (https://arxiv.org/pdf/2210.17323.pdf)
- Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳 (https://huggingface.co/blog/merve/quantization)
- LLM.int8() and Emergent Features (https://timdettmers.com/2022/08/17/llm-int8-and-emergent-features/)
- 📺 State of GPT - Learn about the training pipeline of GPT assistants (https://www.youtube.com/watch?v=bZQun8Y4L2A&t=1s)
- Google "We Have No Moat, And Neither Does OpenAI" (https://www.semianalysis.com/p/google-we-have-no-moat-and-neither)
- Emerging Architectures for LLM Applications (https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/)
- The Secret Sauce behind 100K context window in LLMs: all tricks in one place (https://blog.gopenai.com/how-to-speed-up-llms-and-use-100k-context-window-all-tricks-in-one-place-ffd40577b4c)
- All You Need to Know to Build Your First LLM App (https://towardsdatascience.com/all-you-need-to-know-to-build-your-first-llm-app-eb982c78ffac) $$
- Neural Networks: Zero to Hero - A course by Andrej Karpathy - Syllabus & links to videos (https://karpathy.ai/zero-to-hero.html)
- 📺 Let's build GPT: from scratch, in code, spelled out (https://www.youtube.com/watch?v=kCc8FmEb1nY)
- 📺 Let's reproduce GPT-2 (124M) (https://www.youtube.com/watch?v=l8pRSuU81PU)
- Decoding Strategies in Large Language Models (https://towardsdatascience.com/decoding-strategies-in-large-language-models-9733a8f70539) $$
- Interactively fine-tune Falcon-40B and other LLMs on Amazon SageMaker Studio notebooks using QLoRA (https://aws.amazon.com/blogs/machine-learning/interactively-fine-tune-falcon-40b-and-other-llms-on-amazon-sagemaker-studio-notebooks-using-qlora/) (https://github.com/aws-samples/amazon-sagemaker-generativeai/blob/main/studio-notebook-fine-tuning/falcon-40b-qlora-finetune-summarize.ipynb)
- [Codes] CVPR 2023 - Scaling PyTorch Model Training With Minimal Code Changes (https://github.com/rasbt/cvpr2023)
- [Codes] LLM-finetuning-scripts (https://github.com/rasbt/LLM-finetuning-scripts/)
- [Codes] LoRA (https://github.com/rasbt/low-rank-adaptation-blog) (https://lightning.ai/lightning-ai/studios/code-lora-from-scratch)
- [Codes] Gradient Accumulation (https://github.com/rasbt/gradient-accumulation-blog/)
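
  The pattern behind the gradient-accumulation code above, as a minimal self-contained sketch (toy model, toy data, assumed settings):

  ```python
  import torch
  from torch import nn

  model = nn.Linear(10, 2)                       # toy model
  optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
  data = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(8)]

  accum_steps = 4                                # effective batch = 4 micro-batches
  optimizer.zero_grad()
  for step, (x, y) in enumerate(data):
      loss = nn.functional.cross_entropy(model(x), y)
      (loss / accum_steps).backward()            # scale so gradients average over micro-batches
      if (step + 1) % accum_steps == 0:
          optimizer.step()
          optimizer.zero_grad()
  ```
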
- [Codes] Optimizing PyTorch Memory Usage (https://github.com/rasbt/pytorch-memory-optim/)
- Patterns for Building LLM-based Systems & Products (https://eugeneyan.com/writing/llm-patterns/)
- Fine-tuning Alpaca and LLaMA: Training on a Custom Dataset - for sentiment analysis, using ๐ค PEFT LoRA (https://www.mlexpert.io/machine-learning/tutorials/alpaca-fine-tuning) (https://colab.research.google.com/drive/1X85FLniXx_NyDsh_F_aphoIAy63DKQ7d?usp=sharing)
- Extended Guide: Instruction-tune Llama 2 - focus on creating the instruction dataset (https://www.philschmid.de/instruction-tune-llama-2)
- Fine-tuning LLMs - simple notes, covering ROUGE and BLEU metrics (https://teetracker.medium.com/fine-tuning-llms-9fe553a514d0)
- Fine-tune Llama 2 with DPO - Direct Preference Optimization (https://huggingface.co/blog/dpo-trl)
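
  A minimal sketch of the DPO recipe from the blog post, using TRL. Note this follows the blog-era constructor (newer TRL releases move `beta` into a `DPOConfig`); the checkpoint and toy preference triples are placeholders:

  ```python
  from datasets import Dataset
  from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
  from trl import DPOTrainer

  model_id = "gpt2"  # stand-in; the blog post fine-tunes Llama 2
  model = AutoModelForCausalLM.from_pretrained(model_id)
  ref_model = AutoModelForCausalLM.from_pretrained(model_id)  # frozen reference policy
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  tokenizer.pad_token = tokenizer.eos_token

  # DPO expects (prompt, chosen, rejected) triples
  train_ds = Dataset.from_dict({
      "prompt":   ["What is DPO?"],
      "chosen":   ["A method that optimizes on preference pairs directly, without a reward model."],
      "rejected": ["No idea."],
  })

  trainer = DPOTrainer(
      model,
      ref_model,
      beta=0.1,  # strength of the implicit KL penalty against the reference
      train_dataset=train_ds,
      tokenizer=tokenizer,
      args=TrainingArguments(output_dir="dpo-out", per_device_train_batch_size=1,
                             max_steps=1, report_to="none"),
  )
  trainer.train()
  ```
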
- Fine-Tuning Llama 2.0 with Single GPU Magic (https://ai.plainenglish.io/fine-tuning-llama2-0-with-qloras-single-gpu-magic-1b6a6679d436)
- Understanding Llama2: KV Cache, Grouped Query Attention, Rotary Embedding and More (https://ai.plainenglish.io/understanding-llama2-kv-cache-grouped-query-attention-rotary-embedding-and-more-c17e5f49a6d7)
- Mastering BERT Model: Building it from Scratch with Pytorch (https://medium.com/data-and-beyond/complete-guide-to-building-bert-model-from-sratch-3e6562228891)
- Exploring Dolly 2.0: Fine Tuning Your Own ChatGPT-like Model (https://ai.plainenglish.io/exploring-dolly-2-0-a-guide-to-training-your-own-chatgpt-like-model-dd9b785ff1df)
- GPT Model Behind the Scene: Exploring it from scratch with Pytorch (https://ai.plainenglish.io/creating-and-exploring-gpt-from-scratch-ffe84ac415a9)
- Fine-Tune Your Own Llama 2 Model in a Colab Notebook (https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32) $$ (https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing) (https://mlabonne.github.io/blog/posts/Fine_Tune_Your_Own_Llama_2_Model_in_a_Colab_Notebook.html)
- Fine tune Llama v2 models on Guanaco Dataset (https://gist.github.com/younesbelkada/9f7f75c94bdc1981c8ca5cc937d4a4da)
- Fine-tune Llama 2 with dolly-15k on a free Google Colab instance (https://colab.research.google.com/drive/134o_cXcMe_lsvl15ZE_4Y75Kstepsntu?usp=sharing)
- Fine-tune Llama 2 with SFT and DPO (https://medium.com/@anchen.li/fine-tune-llama-2-with-sft-and-dpo-8b57cf3ec69)
- Fine-Tuning LLaMA 2 Models using a single GPU, QLoRA and AI Notebooks (https://blog.ovhcloud.com/fine-tuning-llama-2-models-using-a-single-gpu-qlora-and-ai-notebooks/)
- Fine-Tuning Embedding Model with PEFT and LoRA (https://medium.com/@kelvin.lu.au/fine-tuning-embedding-model-with-peft-and-lora-3b6f08987c24)
- [Video] Jeremy Howard: A Hackers' Guide to Language Models (https://www.youtube.com/watch?v=jkrNMKz9pWU)
- 🤗 The Alignment Handbook - Robust recipes to align language models with human and AI preferences (https://github.com/huggingface/alignment-handbook/tree/main) (https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_what-if-we-could-distill-the-alignment-from-activity-7123567109128806400-jgEg) (https://www.linkedin.com/posts/thom-wolf_ai-knowledgesharing-opensource-activity-7123588019839713280-EqhV) (https://arxiv.org/pdf/2310.16944.pdf)
- Efficient Fine-Tuning with LoRA: A Guide to Optimal Parameter Selection for Large Language Models (https://www.databricks.com/blog/efficient-fine-tuning-lora-guide-llms)
- LoRA - Intuitively and Exhaustively Explained (https://towardsdatascience.com/lora-intuitively-and-exhaustively-explained-e944a6bff46b) $$ (https://github.com/DanielWarfield1/MLWritingAndResearch/blob/main/LoRA.ipynb)
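
  The core idea from the explainer, as a from-scratch sketch: freeze the pretrained weight and learn a low-rank update scaled by alpha/r (shapes and hyperparameters are illustrative):

  ```python
  import torch
  from torch import nn

  class LoRALinear(nn.Module):
      def __init__(self, linear: nn.Linear, r: int = 8, alpha: int = 16):
          super().__init__()
          self.linear = linear
          self.linear.weight.requires_grad_(False)  # freeze the pretrained weight
          self.A = nn.Parameter(torch.randn(r, linear.in_features) * 0.01)
          self.B = nn.Parameter(torch.zeros(linear.out_features, r))  # zero init: no change at start
          self.scaling = alpha / r

      def forward(self, x):
          # W x + (alpha/r) * B A x
          return self.linear(x) + self.scaling * (x @ self.A.T @ self.B.T)

  layer = LoRALinear(nn.Linear(768, 768))
  out = layer(torch.randn(2, 768))  # only A and B receive gradients
  ```
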
- Fine-Tuning LLMs: LoRA or Full-Parameter? An in-depth Analysis with Llama 2 (https://www.anyscale.com/blog/fine-tuning-llms-lora-or-full-parameter-an-in-depth-analysis-with-llama-2)
- Out-of-Domain Finetuning to Bootstrap Hallucination Detection (https://eugeneyan.com/writing/finetuning/) (https://github.com/eugeneyan/visualizing-finetunes)
- Faster debug and development with tiny models, tokenizers and datasets (https://github.com/stas00/ml-engineering/blob/master/transformers/make-tiny-models.md) (https://huggingface.co/stas/tiny-random-llama-2) (https://huggingface.co/stas/tiny-random-llama-2/blob/main/make_tiny_model.py)
- Mastering LLM Techniques: Inference Optimization (https://developer.nvidia.com/blog/mastering-llm-techniques-inference-optimization/)
- The Novice's LLM Training Guide (https://rentry.org/llm-training)
- Fast Llama 2 on CPUs With Sparse Fine-Tuning and DeepSparse (https://neuralmagic.com/blog/fast-llama-2-on-cpus-with-sparse-fine-tuning-and-deepsparse/)
- LLM Distillation Playbook (by Predibase) - Practical best practices for distilling large language models (https://github.com/predibase/llm_distillation_playbook)
- Preference Tuning LLMs with Direct Preference Optimization Methods - evaluation of Direct Preference Optimization (DPO), Identity Preference Optimisation (IPO) and Kahneman-Tversky Optimisation (KTO) (https://huggingface.co/blog/pref-tuning)
- Merge Large Language Models with mergekit (https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54)
- 🤗 How to Fine-Tune LLMs in 2024 with Hugging Face (https://www.philschmid.de/fine-tune-llms-in-2024-with-trl)
- 🤗 makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch (https://huggingface.co/blog/AviSoori1x/makemoe-from-scratch)
- Function Calling Datasets, Training and Inference (https://www.youtube.com/watch?v=hHn_cV5WUDI)
- 📺 Fine tuning Optimizations - DoRA, NEFT, LoRA+, Unsloth (https://www.youtube.com/watch?v=ae2lbmtTY5A)
- DoRA Demystified: Visualising Weight-Decomposed Low-Rank Adaptation (https://shreyassk.substack.com/p/visualising-dora-weight-decomposed) (https://github.com/shreyassks/DoRA/)
- GaLore: Advancing Large Model Training on Consumer-grade Hardware (https://huggingface.co/blog/galore) (https://github.com/jiaweizzhao/galore)
- Memory-efficient LLM Training with GaLore (https://medium.com/@geronimo7/llm-training-on-consumer-gpus-with-galore-d25075143cfb) (https://github.com/geronimi73/3090_shorts/blob/main/nb_galore_llama2-7b.ipynb)
- An Overview of the LoRA Family (https://towardsdatascience.com/an-overview-of-the-lora-family-515d81134725)
- Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora (https://www.philschmid.de/fsdp-qlora-llama3) (https://github.com/philschmid/deep-learning-pytorch-huggingface/blob/main/training/scripts/run_fsdp_qlora.py)
- FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention (https://pytorch.org/blog/flexattention/)
- Supported attention variants: Full Attention, Standard Causal Masking, Sliding Window Attention, Prefix LM (Bidirectional + Causal), Document Masking, Stand-Alone Self-Attention Masking, NATTEN Masking, Alibi Bias, Tanh Soft-Capping, Nested Jagged Tensor, Flamingo Cross Attention
- Attention Gym - Helpful tools and examples for working with flex-attention (https://github.com/pytorch-labs/attention-gym)
- Quantization-Aware Training for Large Language Models with PyTorch (https://pytorch.org/blog/quantization-aware-training/) (https://pytorch.org/torchtune/main/tutorials/qat_finetune.html)
- 🤗 Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques (https://huggingface.co/blog/Isayoften/optimization-rush)
- FLUTE: Flexible Lookup Table Engine for LUT-quantized LLMs (https://github.com/HanGuo97/flute)
- 🤗 Fine-tuning LLMs to 1.58bit: extreme quantization made easy - with BitNet architecture (https://huggingface.co/blog/1_58_llm_extreme_quantization)
- A Guide on 12 Tuning Strategies for Production-Ready RAG Applications (https://towardsdatascience.com/a-guide-on-12-tuning-strategies-for-production-ready-rag-applications-7ca646833439#341d)
- Advanced RAG Techniques: an Illustrated Overview (https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6)
- Advanced RAG Techniques (https://www.pinecone.io/learn/advanced-rag-techniques/)
- A Cheat Sheet and Some Recipes For Building Advanced RAG (https://blog.llamaindex.ai/a-cheat-sheet-and-some-recipes-for-building-advanced-rag-803a9d94c41b)
- RAG cheatsheet (https://miro.com/app/board/uXjVNvklNmc=/)
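
  The retrieval core shared by the RAG guides above, as a minimal sketch with sentence-transformers (the checkpoint and toy chunks are examples):

  ```python
  import numpy as np
  from sentence_transformers import SentenceTransformer

  model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model
  chunks = [
      "LoRA adds trainable low-rank matrices to frozen weights.",
      "QLoRA combines 4-bit quantization with LoRA adapters.",
      "RAG retrieves documents and conditions generation on them.",
  ]
  doc_emb = model.encode(chunks, normalize_embeddings=True)  # unit vectors: dot = cosine

  query = "How does retrieval-augmented generation work?"
  q_emb = model.encode([query], normalize_embeddings=True)[0]
  top_k = np.argsort(doc_emb @ q_emb)[::-1][:2]              # top-2 most similar chunks

  prompt = ("Answer using only this context:\n"
            + "\n".join(chunks[i] for i in top_k)
            + f"\n\nQuestion: {query}")
  print(prompt)  # feed this grounded prompt to the generator LLM
  ```
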
- From paper to prod! A guide to improving your semantic search with HyDE (https://aimodels.substack.com/p/from-paper-to-prod-a-guide-to-improving)
- Improving the Semantic Search Tool (https://puddles-of-water.medium.com/improving-the-semantic-search-tool-ef0442f7e972)
- Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval (https://huggingface.co/blog/embedding-quantization)
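
  Binary quantization as described in the post, sketched with NumPy: keep only the sign of each dimension and pack 8 dimensions per byte (32x smaller than fp32), then compare vectors by Hamming distance (toy data):

  ```python
  import numpy as np

  emb = np.random.randn(1000, 384).astype(np.float32)  # example float embeddings
  binary = emb > 0                                     # 1 bit per dimension
  packed = np.packbits(binary, axis=-1)                # uint8, shape (1000, 384 // 8)

  def hamming(a, b):
      # number of differing bits: a fast similarity proxy for packed binary vectors
      return np.unpackbits(a ^ b).sum()

  print(packed.shape, hamming(packed[0], packed[1]))
  ```
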
- 100x Faster - Scaling Your RAG App for Billions of Embeddings - Computing Cosine Similarity in parallel - using Chunkdot (https://medium.com/gitconnected/100x-faster-scaling-your-rag-app-for-billions-of-embeddings-ded34fccd16a) (https://github.com/rragundez/chunkdot/)
- Advanced Retrieval for AI with Chroma (https://www.deeplearning.ai/short-courses/advanced-retrieval-for-ai/)
- Chat with your code: RAG with Weaviate and LlamaIndex (https://lightning.ai/weaviate/studios/chat-with-your-code-rag-with-weaviate-and-llamaindex)
- The 4 Advanced RAG Algorithms You Must Know to Implement (https://medium.com/decodingml/the-4-advanced-rag-algorithms-you-must-know-to-implement-5d0c7f1199d2)
- FlashRAG: A Python Toolkit for Efficient RAG Research (https://github.com/ruc-nlpir/flashrag)
- ColPali: Efficient Document Retrieval with Vision Language Models (https://github.com/ManuelFay/colpali) (https://huggingface.co/vidore)
- Byaldi - wrapper around the ColPali (https://github.com/AnswerDotAI/byaldi/)
- Multimodal RAG using ColPali (with Byaldi) and Qwen2-VL (https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb)
- Chat with your PDFs using byaldi + Claude (https://github.com/AnswerDotAI/byaldi/blob/main/examples/chat_with_your_pdf.ipynb)
- Multimodal RAG with ColPali and Gemini : Financial Report Analysis Application (https://learnopencv.com/multimodal-rag-with-colpali/)
- NVIDIA Introduces RankRAG: A Novel RAG Framework that Instruction-Tunes a Single LLM for the Dual Purposes of Top-k Context Ranking and Answer Generation in RAG (https://www.marktechpost.com/2024/07/09/nvidia-introduces-rankrag-a-novel-rag-framework-that-instruction-tunes-a-single-llm-for-the-dual-purposes-of-top-k-context-ranking-and-answer-generation-in-rag/) (https://arxiv.org/pdf/2407.02485)
- Late Chunking in Long-Context Embedding Models (https://jina.ai/news/late-chunking-in-long-context-embedding-models/) (https://github.com/jina-ai/late-chunking/)
- Reader-LM: Small Language Models for Cleaning and Converting HTML to Markdown (https://jina.ai/news/reader-lm-small-language-models-for-cleaning-and-converting-html-to-markdown/) (https://colab.research.google.com/drive/1wXWyj5hOxEHY6WeHbOwEzYAC0WB1I5uA#scrollTo=ad-fjFOQxoFG)
- Not RAG, but RAG Fusion? Understanding Next-Gen Info Retrieval. (https://pub.towardsai.net/not-rag-but-rag-fusion-understanding-next-gen-info-retrieval-477788da02e2) $$
- Goodbye, Text2SQL: Why Table-Augmented Generation (TAG) is the Future of AI-Driven Data Queries! (https://ai.plainenglish.io/goodbye-text2sql-why-table-augmented-generation-tag-is-the-future-of-ai-driven-data-queries-892e24e06922) $$
- LOTUS: A Query Engine For Processing Data with LLMs (https://github.com/TAG-Research/lotus)
- MinerU - converts PDFs into machine-readable formats (e.g., markdown, JSON) (https://github.com/opendatalab/mineru)
- xRx - a framework for building multi-modal conversational AI systems (https://github.com/8090-inc/xrx-core)
- GenAI with Python: RAG with LLM (Complete Tutorial) - with Pdf2image, PyTesseract, Ollama, ChromaDB, Streamlit (https://towardsdatascience.com/genai-with-python-rag-with-llm-complete-tutorial-c276dda6707b) $$
- Lightning Fabric Documentation (https://lightning.ai/docs/fabric/stable/)
- [Codes] Build Your Own Trainer (https://github.com/Lightning-AI/lightning/tree/master/examples/fabric/build_your_own_trainer)
- ✍️ Ahead of AI (by Sebastian Raschka) (https://magazine.sebastianraschka.com/archive) (https://sebastianraschka.com/blog/)
- [Course] Deep Learning Fundamentals (https://lightning.ai/pages/courses/deep-learning-fundamentals/) (https://github.com/Lightning-AI/dl-fundamentals)
- Accelerate PyTorch Code with Fabric (https://lightning.ai/pages/blog/accelerate-pytorch-code-with-fabric/)
- How to Speed Up PyTorch Model Training (https://lightning.ai/pages/community/tutorial/how-to-speed-up-pytorch-model-training/)
- Finetuning LLMs on a Single GPU Using Gradient Accumulation (https://lightning.ai/pages/blog/gradient-accumulation/)
- Accelerating LLaMA with Fabric: A Comprehensive Guide to Training and Fine-Tuning LLaMA (https://lightning.ai/pages/community/tutorial/accelerating-llama-with-fabric-a-comprehensive-guide-to-training-and-fine-tuning-llama/)
- Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters (https://lightning.ai/pages/community/article/understanding-llama-adapters/)
- How To Finetune GPT Like Large Language Models on a Custom Dataset (https://lightning.ai/pages/blog/how-to-finetune-gpt-like-large-language-models-on-a-custom-dataset/)
- Code LoRA From Scratch (https://lightning.ai/lightning-ai/studios/code-lora-from-scratch)
- Parameter-Efficient LLM Finetuning With Low-Rank Adaptation (LoRA) (https://lightning.ai/pages/community/tutorial/lora-llm/)
- Efficient Initialization of Large Models (https://lightning.ai/pages/community/efficient-initialization-of-large-models/)
- Accelerating Large Language Models with Mixed-Precision Techniques (https://lightning.ai/pages/community/tutorial/accelerating-large-language-models-with-mixed-precision-techniques/)
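
  A minimal bf16 autocast sketch of the idea in the tutorial (fp16 would additionally need a `GradScaler`; assumes a CUDA device and a toy model):

  ```python
  import torch
  from torch import nn

  model = nn.Linear(512, 512).cuda()
  optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
  x = torch.randn(8, 512, device="cuda")

  optimizer.zero_grad()
  with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
      loss = model(x).pow(2).mean()  # forward ops run in bf16 where it is safe
  loss.backward()                    # parameters and gradients remain fp32
  optimizer.step()
  ```
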
- Faster PyTorch Training by Reducing Peak Memory (combining backward pass + optimizer step) (https://lightning.ai/pages/community/tutorial/faster-pytorch-training-by-reducing-peak-memory/)
- Falcon - A guide to finetune and inference (https://lightning.ai/pages/blog/falcon-a-guide-to-finetune-and-inference/)
- Finetuning Falcon LLMs More Efficiently With LoRA and Adapters (https://lightning.ai/pages/community/finetuning-falcon-efficiently/)
- The Falcon has landed in the Hugging Face ecosystem (https://huggingface.co/blog/falcon) (https://colab.research.google.com/drive/1BiQiw31DT7-cDp1-0ySXvvhzqomTdI-o?usp=sharing)
- Improve LLMs With Proxy-Tuning (https://lightning.ai/lightning-ai/studios/improve-llms-with-proxy-tuning)
- 🤗 Open LLM Leaderboard (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
- 🤗 Open LLM Leaderboard V2 (https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- 🤗 Massive Text Embedding Benchmark (MTEB) Leaderboard (https://huggingface.co/spaces/mteb/leaderboard)
- Hughes Hallucination Evaluation Model (HHEM) leaderboard (https://huggingface.co/spaces/vectara/leaderboard) (https://github.com/vectara/hallucination-leaderboard)
- FastChat - platform for training, serving, and evaluating large language model based chatbots (https://github.com/lm-sys/FastChat)
- LLM Evaluation Metrics: Everything You Need for LLM Evaluation (https://www.confident-ai.com/blog/llm-evaluation-metrics-everything-you-need-for-llm-evaluation)
- DeepEval - open-source LLM evaluation framework specialized for unit testing LLM outputs (https://github.com/confident-ai/deepeval)
- Language Model Evaluation Harness - based on tasks (https://github.com/EleutherAI/lm-evaluation-harness) (https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/interface.md)
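
  A minimal sketch of the harness's documented Python interface (see the interface.md link above); the model and task choices are illustrative:

  ```python
  import lm_eval

  # "hf" selects the Hugging Face backend; gpt2 / hellaswag are example choices
  results = lm_eval.simple_evaluate(
      model="hf",
      model_args="pretrained=gpt2",
      tasks=["hellaswag"],
      num_fewshot=0,
  )
  print(results["results"]["hellaswag"])
  ```
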
- LLM Task-Specific Evals that Do & Don't Work (https://eugeneyan.com/writing/evals/)
- RULER: What's the Real Context Size of Your Long-Context Language Models? Evaluate long-context language models with configurable sequence length and task complexity (https://github.com/NVIDIA/RULER)
- 7 Basic NLP Models to Empower Your ML Application (https://zilliz.com/learn/7-nlp-models)
- 7 models on HuggingFace you probably didn't know existed (https://towardsdatascience.com/7-models-on-huggingface-you-probably-didnt-knew-existed-f3d079a4fd7c)
- ChatGPT, GenerativeAI and LLMs Timeline (https://github.com/hollobit/GenAI_LLM_timeline)
- AI / ML / LLM / Transformer Models Timeline and List (https://ai.v-gar.de/ml/transformer/timeline/index.html)
- Comprehensive LLM model zoo (https://crfm.stanford.edu/ecosystem-graphs/index.html?mode=table)
- Totally Open Chatgpt (https://github.com/nichtdax/awesome-totally-open-chatgpt)
- Open LLMs (https://github.com/eugeneyan/open-llms)
- TII open-sourced Falcon LLM (https://huggingface.co/tiiuae)
- Question generator with context (https://huggingface.co/voidful/bart-eqg-question-generator)
- T5 One Line Summary (https://huggingface.co/snrspeaks/t5-one-line-summary)
- T5-base fine-tuned on SQuAD for Question Generation by just prepending the answer to the context (https://huggingface.co/mrm8488/t5-base-finetuned-question-generation-ap)
- Parrot_paraphraser_on_T5 (https://huggingface.co/prithivida/parrot_paraphraser_on_T5)
- T0pp (https://huggingface.co/bigscience/T0pp) (https://huggingface.co/GroNLP/T0pp-sharded)
- Persimmon-8B: The best fully permissively-licensed model in the 8B class (https://www.adept.ai/blog/persimmon-8b)
- Mistral Transformer (https://github.com/mistralai/mistral-src) (https://mistral.ai/news/announcing-mistral-7b/)
- MiniGPT-v2: Large Language Model as a Unified Interface for Vision-Language Multi-task Learning (https://minigpt-v2.github.io/)
- 8 Top Open-Source LLMs for 2024 and Their Uses (https://www.datacamp.com/blog/top-open-source-llms)
- 01-ai/Yi-34B (https://huggingface.co/01-ai/Yi-34B)
- OLMo: Open Language Model (https://github.com/allenai/OLMo/tree/main) (https://github.com/allenai/OLMo/blob/main/scripts/train.py)
- Genstruct-7B, an instruction-generation model (https://huggingface.co/NousResearch/Genstruct-7B) (https://huggingface.co/NousResearch/Genstruct-7B/blob/main/notebook.ipynb)
- Grok-1, a 314 billion parameter Mixture-of-Experts model (https://github.com/xai-org/grok-1)
- Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters (https://qwenlm.github.io/blog/qwen-moe/)
- Tiny but mighty: The Phi-3 small language models with big potential (https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/)
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English? (https://arxiv.org/pdf/2305.07759)
- OpenELM - An Efficient Language Model Family with Open-source Training and Inference Framework - using layer-wise scaling strategy (https://arxiv.org/abs/2404.14619) (https://huggingface.co/apple/OpenELM) (https://github.com/apple/corenet)
- Llama 3.1 - 405B, 70B & 8B with multilinguality and long context (https://huggingface.co/blog/llama31)
- HERMES 3 TECHNICAL REPORT - instruct and chat tuned models created by fine-tuning Llama 3.1 8B, 70B, and 405B (https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3-Technical-Report.pdf)
- OLMoE - Open Mixture-of-Experts Language Models - fully open source (https://huggingface.co/allenai/OLMoE-1B-7B-0924) (https://arxiv.org/abs/2409.02060)
- 7 Ways To Speed Up Inference of Your Hosted LLMs (https://betterprogramming.pub/speed-up-llm-inference-83653aa24c47)
- vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention (https://github.com/vllm-project/vllm) (https://vllm.ai/)
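
  Minimal offline batch inference with vLLM; the checkpoint is an example:

  ```python
  from vllm import LLM, SamplingParams

  llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # example checkpoint
  params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)
  outputs = llm.generate(["Explain PagedAttention in one sentence."], params)
  print(outputs[0].outputs[0].text)
  ```
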
- vLLM v0.6.0: 2.7x Throughput Improvement and 5x Latency Reduction (https://blog.vllm.ai/2024/09/05/perf-update.html)
- ๐ค TGI: Text Generation Inference - Fast optimized inference for LLMs (https://github.com/huggingface/text-generation-inference)
- LMDeploy: a toolkit for compressing, deploying, and serving LLM (https://github.com/InternLM/lmdeploy)
- OpenVINO: an open-source toolkit for optimizing and deploying AI inference (https://github.com/openvinotoolkit) (https://docs.openvino.ai/2023.0/home.html)
- How continuous batching enables 23x throughput in LLM inference while reducing p50 latency (https://www.anyscale.com/blog/continuous-batching-llm-inference)
- Squeeze more out of your GPU for LLM inference - a tutorial on Accelerate & DeepSpeed (https://preemo.medium.com/squeeze-more-out-of-your-gpu-for-llm-inference-a-tutorial-on-accelerate-deepspeed-610fce3025fd)
- Performance bottlenecks in deploying LLMs - a primer for ML researchers (https://preemo.medium.com/performance-bottlenecks-in-deploying-llms-a-primer-for-ml-researchers-c2b51c2084a8)
- Inference using the pre-trained Alpaca-LoRA (https://www.mlexpert.io/machine-learning/tutorials/alpaca-and-llama-inference) (https://colab.research.google.com/drive/15VstUxU48CT3mRudFrj3FIv6Z4cIXnon?usp=sharing)
- Optimizing your LLM in production (https://huggingface.co/blog/optimize-llm)
- StreamingLLM: Efficient Streaming Language Models with Attention Sinks (https://github.com/mit-han-lab/streaming-llm) (https://arxiv.org/pdf/2309.17453.pdf)
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters (https://github.com/s-lora/s-lora)
- Recipe for Serving Thousands of Concurrent LoRA Adapters (https://lmsys.org/blog/2023-11-15-slora/)
- DeepSparse by Neural Magic - Sparsity-aware deep learning inference runtime for CPUs (https://github.com/neuralmagic/deepsparse/tree/main)
- SparseML by Neural Magic - an open-source model optimization toolkit that enables you to create inference-optimized sparse models using pruning, quantization, and distillation algorithms (https://github.com/neuralmagic/sparseml)
- Marlin - Mixed Auto-Regressive Linear kernel, an extremely optimized FP16xINT4 matmul kernel aimed at LLM inference (https://github.com/IST-DASLab/marlin)
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning (https://github.com/datamllab/LongLM) (https://www.reddit.com/r/LocalLLaMA/comments/18x8g6c/llm_maybe_longlm_selfextend_llm_context_window/)
- Flash-Decoding for long-context inference (https://pytorch.org/blog/flash-decoding/)
- 10 Ways To Run LLMs Locally And Which One Works Best For You (https://matilabs.ai/2024/02/07/run-llms-locally/)
- Towards 100x Speedup: Full Stack Transformer Inference Optimization (https://yaofu.notion.site/Towards-100x-Speedup-Full-Stack-Transformer-Inference-Optimization-43124c3688e14cffaf2f1d6cbdf26c6c)
- Deploy Deep Learning Models at Scale using NVIDIA Triton Inference Server (https://github.com/decodingml/articles-code/tree/main/articles/computer_vision/deploy_deep_learning_at_scale_nvidia_triton_server)
- LLM Inference Series: 5. Dissecting model performance (https://medium.com/@plienhar/llm-inference-series-5-dissecting-model-performance-6144aa93168f)
- How to compute LLM embeddings 3X faster with model quantization - with ONNX model quantization / ONNX transformer optimization (https://medium.com/nixiesearch/how-to-compute-llm-embeddings-3x-faster-with-model-quantization-25523d9b4ce5)
- A Hitchhiker's Guide to Speculative Decoding (https://pytorch.org/blog/hitchhikers-guide-speculative-decoding/)
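
  Speculative (assisted) decoding is exposed in transformers via `assistant_model`: a small draft model proposes tokens that the large target model verifies. A minimal sketch using an example model pair that shares a tokenizer:

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")                  # shared tokenizer
  target = AutoModelForCausalLM.from_pretrained("gpt2-large")  # large target model
  draft = AutoModelForCausalLM.from_pretrained("gpt2")         # small draft model

  inputs = tok("Speculative decoding works by", return_tensors="pt")
  out = target.generate(**inputs, assistant_model=draft, max_new_tokens=40)
  print(tok.decode(out[0], skip_special_tokens=True))
  ```
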
- Achieving Faster Open-Source Llama3 Serving with SGLang Runtime (vs. TensorRT-LLM, vLLM) (https://lmsys.org/blog/2024-07-25-sglang-llama3/) (https://github.com/sgl-project/sglang)
- Awesome Production Machine Learning - open source libraries that will help you deploy, monitor, version, scale, and secure your production machine learning (https://github.com/EthicalML/awesome-production-machine-learning)
- NanoFlow - a throughput-oriented high-performance serving framework for LLMs (https://github.com/efeslab/Nanoflow) (https://arxiv.org/pdf/2408.12757)
- GuideLLM - evaluating and optimizing the deployment of large language models (LLMs) (https://github.com/neuralmagic/guidellm)
- LLM Compressor - create compressed models for faster inference with vLLM (https://github.com/vllm-project/llm-compressor) (https://neuralmagic.com/blog/llm-compressor-is-here-faster-inference-with-vllm/)
- bitnet.cpp - Official inference framework for 1-bit LLMs (https://github.com/microsoft/BitNet)
- STRING - a training-free method to improve effective context length of popular RoPE-based LLMs (https://github.com/HKUNLP/STRING)
- GPT4All - run open-source LLMs on your own computer (https://github.com/nomic-ai/gpt4all)
- LLaMA-Factory - Efficiently Fine-Tune 100+ LLMs in WebUI (https://github.com/hiyouga/LLaMA-Factory)
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs (https://github.com/facebookresearch/lingua)
- Curating Trillion-Token Datasets: Introducing NVIDIA NeMo Data Curator (https://developer.nvidia.com/blog/curating-trillion-token-datasets-introducing-nemo-data-curator/)
- ftfy: fixes text for you (https://github.com/rspeer/python-ftfy)
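
  Minimal usage, with the project's own mojibake example:

  ```python
  import ftfy

  print(ftfy.fix_text("âœ” No problems"))  # -> "✔ No problems"
  ```
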
- Cosmopedia: how to create large-scale synthetic data for pre-training (https://huggingface.co/blog/cosmopedia) (https://github.com/huggingface/cosmopedia)
- DataTrove - a library to process, filter and deduplicate text data at a very large scale (https://github.com/huggingface/datatrove/)
- Guidance - control how LLM output is structured (https://github.com/guidance-ai/guidance)
- Large-scale Near-deduplication Behind BigCode (https://huggingface.co/blog/dedup)
- Dolma Toolkit - curation of large datasets for (pre)-training ML models (https://github.com/allenai/dolma)
- Distilabel - framework for synthetic data and AI feedback (https://github.com/argilla-io/distilabel)
- LLM Decontaminator (https://github.com/lm-sys/llm-decontaminator)
- Tutorial to demonstrate how to reproduce Zyda2 dataset, curated by Zyphra in collaboration with Nvidia using NeMo Curator (https://github.com/NVIDIA/NeMo-Curator/tree/main/tutorials/zyda2-tutorial) (https://www.zyphra.com/post/building-zyda-2)
- DBPedia (https://www.dbpedia.org/resources/individual/) (http://downloads.dbpedia.org/wiki-archive/dbpedia-version-2016-04.html) (http://downloads.dbpedia.org/2016-04/core/)
- Common Crawl (https://commoncrawl.org/the-data/get-started/)
- c4 - A colossal, cleaned version of Common Crawl's web crawl corpus (https://tensorflow.org/datasets/catalog/c4)
- c4 processed version with five variants of the data: en, en.noclean, en.noblocklist, realnewslike, and multilingual (mC4). (https://huggingface.co/datasets/allenai/c4)
- RedPajama-V2: An open dataset with 30 trillion tokens for training large language models (https://www.together.ai/blog/redpajama-data-v2) (https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2)
- SlimPajama-627B - Extensively deduplicated, multi-corpora, open-source dataset for training LLM (https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and-deduplicated-version-of-redpajama) (https://huggingface.co/datasets/cerebras/SlimPajama-627B)
- Falcon RefinedWeb - An English large-scale dataset (5 trillion tokens) for the pretraining of LLMs, built through stringent filtering and extensive deduplication of CommonCrawl (https://huggingface.co/datasets/tiiuae/falcon-refinedweb) (https://arxiv.org/abs/2306.01116)
- Instruction tuning datasets to train (text and multi-modal) chat-based LLMs (GPT-4, ChatGPT, LLaMA, Alpaca) (https://github.com/yaodongC/awesome-instruction-dataset)
- Python-Code-23k-ShareGPT (https://huggingface.co/datasets/ajibawa-2023/Python-Code-23k-ShareGPT)
- UltraFeedback Binarized - A pre-processed version of the UltraFeedback dataset that was used to train Zephyr-7B-β (https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)
- [Blog post] FineWeb: decanting the web for the finest text data at scale (https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1)
- FineWeb: 15T tokens (44TB disk space) of cleaned and deduplicated english web data from CommonCrawl, for LLM pretraining (https://huggingface.co/datasets/HuggingFaceFW/fineweb)
- FineWeb-Edu: 1.3T tokens of educational web pages filtered from 🍷 FineWeb dataset (https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu)
- FineWeb-Edu-score-2: 5.4T tokens of educational web pages filtered from 🍷 FineWeb dataset (https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu-score-2)
- Dolma - Used to train OLMo on 3 trillion tokens from a diverse mix of web content, academic publications, code, books, and encyclopedic materials, with quality filtering, fuzzy deduplication (https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-corpus-9a0ff4b8da64) (https://huggingface.co/datasets/allenai/dolma)
- TinyStories - synthetically generated (by GPT-3.5 and GPT-4) short stories that only use a small vocabulary (https://huggingface.co/datasets/roneneldan/TinyStories)
- GenQA - over 10M cleaned and deduplicated instruction samples generated from a handful of carefully designed prompts (https://huggingface.co/datasets/tomg-group-umd/GenQA)
- Persona Hub - Scaling Synthetic Data Creation with 1,000,000,000 Personas (https://github.com/tencent-ailab/persona-hub) (https://huggingface.co/datasets/proj-persona/PersonaHub)
- FinePersonas - detailed personas for creating customized, realistic synthetic data (https://huggingface.co/datasets/argilla/FinePersonas-v0.1)
- APIGen Function-Calling Datasets (https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k)
- The Tome - Compiled from 9 publicly available datasets, curated and designed for training LLMs with a focus on instruction following (https://huggingface.co/datasets/arcee-ai/The-Tome)
- distilabel-intel-orca-dpo-pairs (for preference tuning) (https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs)
- MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens (https://blog.salesforceairesearch.com/mint-1t/) (https://huggingface.co/collections/mlfoundations/mint-1t-6690216ca4d0df7e518dde1c)
- Zyda-2 is a 5-trillion token dataset composed of filtered and cross-deduplicated DCLM, FineWeb-Edu, Zyda-1, and Dolma v1.7's Common Crawl portion (https://huggingface.co/datasets/Zyphra/Zyda-2) (https://www.zyphra.com/post/building-zyda-2)
- Text Summarization using T5: Fine-Tuning and Building Gradio App (https://learnopencv.com/text-summarization-using-t5/)
- Beyond Classification With Transformers and Hugging Face (https://towardsdatascience.com/beyond-classification-with-transformers-and-hugging-face-d38c75f574fb)
- Faster Text Classification with Naive Bayes and GPUs (https://developer.nvidia.com/blog/faster-text-classification-with-naive-bayes-and-gpus)
- Classifying Multimodal Data using Transformers (https://github.com/dsaidgovsg/multimodal-learning-hands-on-tutorial)
- Chris McCormick
- 📺 BERT Document Classification Tutorial with Code - including Semantic Similarity Search (https://www.youtube.com/watch?v=_eSGWNqKeeY)
- Combining Categorical and Numerical Features with Text in BERT (https://mccormickml.com/2021/06/29/combining-categorical-numerical-features-with-bert/)
- Smart Batching Tutorial - Speed Up BERT Training (https://mccormickml.com/2020/07/29/smart-batching-tutorial/)
- [Discussion] How to use Bert for long text classification? (https://stackoverflow.com/questions/58636587/how-to-use-bert-for-long-text-classification/63413589#63413589)
- Sentiment Analysis with BERT and Transformers by Hugging Face using PyTorch and Python (https://curiousily.com/posts/sentiment-analysis-with-bert-and-hugging-face-using-pytorch-and-python/)
- Custom Named Entity Recognition with BERT (https://towardsdatascience.com/custom-named-entity-recognition-with-bert-cf1fd4510804)
- The Guide to Multi-Tasking with the T5 Transformer (https://towardsdatascience.com/the-guide-to-multi-tasking-with-the-t5-transformer-90c70a08837b)
- A Full Guide to Finetuning T5 for Text2Text and Building a Demo with Streamlit (https://medium.com/nlplanet/a-full-guide-to-finetuning-t5-for-text2text-and-building-a-demo-with-streamlit-c72009631887) (https://colab.research.google.com/drive/1RFBIkTZEqbRt0jxpTHgRudYJBZTD3Szn?usp=sharing)
- Building a Knowledge Base from Texts: a Full Practical Example (https://medium.com/nlplanet/building-a-knowledge-base-from-texts-a-full-practical-example-8dbbffb912fa)
- Building a Personal Assistant from Scratch - intent classification, speech-to-text, and text-to-speech (https://medium.com/nlplanet/building-a-personal-assistant-from-scratch-db0814e62d34)
- How I Turned My Company's Docs into a Searchable Database with OpenAI (https://towardsdatascience.com/how-i-turned-my-companys-docs-into-a-searchable-database-with-openai-4f2d34bd8736)
- How I Turned ChatGPT into an SQL-Like Translator for Image and Video Datasets (https://towardsdatascience.com/how-i-turned-chatgpt-into-an-sql-like-translator-for-image-and-video-datasets-7b22b318400a)
- 10 Exciting Project Ideas Using Large Language Models (LLMs) for Your Portfolio (https://towardsdatascience.com/10-exciting-project-ideas-using-large-language-models-llms-for-your-portfolio-970b7ab4cf9e) $$
- Build a Telegram chatbot with any AI model under the hood (web scraping summarizer bot) (https://medium.com/@galperovich/build-a-telegram-chatbot-with-any-ai-model-under-the-hood-62f9a8675d81) (https://github.com/galinaalperovich/ai_summary_tg_bot)
- Put ChatGPT right into your messenger: build a Telegram bot with the new official OpenAI API (https://blog.gopenai.com/put-chatgpt-right-into-your-messenger-build-a-telegram-bot-with-the-new-official-openai-api-84f7c005de7f) (https://github.com/galinaalperovich/chatgpt-api-tg-bot)
- LangChain + Streamlit🔥+ Llama 🦙: Bringing Conversational AI to Your Local Machine 🤯 (https://ai.plainenglish.io/%EF%B8%8F-langchain-streamlit-llama-bringing-conversational-ai-to-your-local-machine-a1736252b172)
- Zero to One: A Guide to Building a First PDF Chatbot with LangChain & LlamaIndex - Part 1 (https://medium.com/how-ai-built-this/zero-to-one-a-guide-to-building-a-first-pdf-chatbot-with-langchain-llamaindex-part-1-7d0e9c0d62f)
- Choosing the Right Embedding Model: A Guide for LLM Applications (https://medium.com/@ryanntk/choosing-the-right-embedding-model-a-guide-for-llm-applications-7a60180d28e3)
- Financial Document Classification with LayoutLMv3 - using OCR on document images (https://www.mlexpert.io/machine-learning/tutorials/document-classification-with-layoutlmv3) (https://colab.research.google.com/drive/1I0Qyajp_DFzKvQfUwwRc9p6fs6NfI-Kx?usp=sharing)
- Making a web app generator with open ML models (https://huggingface.co/blog/text-to-webapp)
- Topic Modeling with Llama 2 (https://maartengrootendorst.substack.com/p/topic-modeling-with-llama-2) (https://towardsdatascience.com/topic-modeling-with-llama-2-85177d01e174) $$ (https://colab.research.google.com/drive/1QCERSMUjqGetGGujdrvv_6_EeoIcd_9M?usp=sharing)
- Mastering PDFs: Extracting Sections, Headings, Paragraphs, and Tables with Cutting-Edge Parser (https://blog.llamaindex.ai/mastering-pdfs-extracting-sections-headings-paragraphs-and-tables-with-cutting-edge-parser-faea18870125)
- LayoutPDFReader (https://github.com/nlmatics/llmsherpa#layoutpdfreader)
- Multimodal Retrieval with Text Embedding and CLIP Image Embedding for Backyard Birds (https://github.com/wenqiglantz/multi_modal_retrieval_backyard_birds)
- LlamaIndex Chat (https://github.com/run-llama/chat-llamaindex)
- LoRA for semantic similarity tasks (https://huggingface.co/docs/peft/task_guides/semantic-similarity-lora)
- Forging a Personal Chatbot with OpenAI API, Chroma DB, HuggingFace Spaces, and Gradio 🔥 (https://mlops.community/forging-a-personal-chatbot-with-openai-api-chroma-db-huggingface-spaces-and-gradio-%f0%9f%94%a5/)
- How to Convert Any Text Into a Graph of Concepts (https://towardsdatascience.com/how-to-convert-any-text-into-a-graph-of-concepts-110844f22a1a) (https://github.com/rahulnyk/knowledge_graph)
- Text to Knowledge Graph Made Easy with Graph Maker (https://towardsdatascience.com/text-to-knowledge-graph-made-easy-with-graph-maker-f3f890c0dbe8)
- Embed English Wikipedia under 5 dollars (https://lightning.ai/lightning-ai/studios/embed-english-wikipedia-under-5-dollars~01hg0zg8fyybp7p1sma6g9dkzm)
- Fine-Tuning Mistral 7b in Google Colab with QLoRA (complete guide) (https://medium.com/@codersama/fine-tuning-mistral-7b-in-google-colab-with-qlora-complete-guide-60e12d437cca)
- Hands-on LLMs Course - Learn to Train and Deploy a Real-Time Financial Advisor (https://github.com/iusztinpaul/hands-on-llms)
- Building DoorDash's Product Knowledge Graph with Large Language Models (https://doordash.engineering/2024/04/23/building-doordashs-product-knowledge-graph-with-large-language-models/)
- Musings on building a Generative AI product - at LinkedIn (https://www.linkedin.com/blog/engineering/generative-ai/musings-on-building-a-generative-ai-product)
- How We Finetuned a Large Language Model to Search Patents & Generate New Patents (https://www.activeloop.ai/resources/how-we-finetuned-a-large-language-model-to-search-patents-generate-new-patents/)
- Structured LLM Output and Function Calling with Guidance - and Tool Use (https://lightning.ai/lightning-ai/studios/structured-llm-output-and-function-calling-with-guidance)
- Function calling (https://platform.openai.com/docs/guides/function-calling)
- How to use functions with a knowledge base - to summarize arXiv articles (https://cookbook.openai.com/examples/how_to_call_functions_for_knowledge_retrieval)
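
  A minimal function-calling sketch against the Chat Completions API; the weather tool is a hypothetical example, and an `OPENAI_API_KEY` environment variable is assumed:

  ```python
  from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

  client = OpenAI()
  tools = [{
      "type": "function",
      "function": {
          "name": "get_weather",  # hypothetical function implemented on our side
          "description": "Get the current weather for a city",
          "parameters": {
              "type": "object",
              "properties": {"city": {"type": "string"}},
              "required": ["city"],
          },
      },
  }]

  resp = client.chat.completions.create(
      model="gpt-4o-mini",  # example model
      messages=[{"role": "user", "content": "What's the weather in Paris?"}],
      tools=tools,
  )
  print(resp.choices[0].message.tool_calls)  # function name + JSON arguments to execute
  ```
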
- Fine-tuning examples (Style and tone, Structured output, Tool calling, Function calling) (https://platform.openai.com/docs/guides/fine-tuning/fine-tuning-examples)
- LLM Twin Course: Building Your Production-Ready AI Replica (https://github.com/decodingml/llm-twin-course) (https://medium.com/decodingml/an-end-to-end-framework-for-production-ready-llm-systems-by-building-your-llm-twin-2cc6bb01141f)
- Crawl data from Medium, Substack, Linkedin, GitHub
- Clean, normalize and load data into MongoDB
- Send database changes to RabbitMQ, and consume through Bytewax streaming pipeline
- Clean, chunk, embed and load into Qdrant vector DB (also refactor the cleaning, chunking, and embedding logic using Superlinked, and load and index the vectors to Redis vector search)
- Fine-tune LLM using QLoRA, use Comet ML experiment tracker
- Deploy as REST API on Qwak, query with advanced RAG
- From Posts to Reports: Leveraging LLMs for Social Media Data Mining - How to instruct LLMs to filter restaurant posts and extract giveaways, events, deals and discounts (https://medium.com/decodingml/from-posts-to-reports-leveraging-llms-for-social-media-data-mining-6ebe0e2cdeb1) (https://github.com/decodingml/articles-code/tree/main/articles/generative_ai/data_extraction_from_social_media_posts_using_llms)
- I Fine-Tuned the Tiny Llama 3.2 1B to Replace GPT-4o - for classification task (https://towardsdatascience.com/i-fine-tuned-the-tiny-llama-3-2-1b-to-replace-gpt-4o-7ce1e5619f3d) $$ (https://colab.research.google.com/drive/1T5-zKWM_5OD21QHwXHiV9ixTRR7k3iB9?usp=sharing)
- The Full Stack 7-Steps MLOps Framework (https://github.com/iusztinpaul/energy-forecasting)
- MLOps-Basics (https://github.com/graviraja/MLOps-Basics)
- Danswer: OpenSource Enterprise Question-Answering tool (https://github.com/danswer-ai/danswer) (https://docs.danswer.dev/introduction)
- DocsGPT: GPT-powered chat for documentation (https://docsgpt.arc53.com/) (https://github.com/arc53/DocsGPT)
- Host a Llama 2 API on GPU for Free (https://medium.com/@yuhongsun96/host-a-llama-2-api-on-gpu-for-free-a5311463c183)
- How to Augment LLMs with Private Data (https://medium.com/@yuhongsun96/how-to-augment-llms-with-private-data-29349bd8ae9f)
- How to build an AI that can answer questions about your website (https://platform.openai.com/docs/tutorials/web-qa-embeddings)
- openai-cookbook (https://github.com/openai/openai-cookbook)
- Fine-Tune Transformer Models For Question Answering On Custom Data (https://towardsdatascience.com/fine-tune-transformer-models-for-question-answering-on-custom-data-513eaac37a80)
- Revolutionise Your Q&A Bot with GPT-J: The Open-Source Game Changer as a Replacement for GPT-3 (https://medium.com/@maliahrajan/revolutionise-your-q-a-bot-with-gpt-j-the-open-source-game-changer-as-a-replacement-for-gpt-3-216bc4362b53)
- How to build a Q&A web application using Python and Anvil (https://www.section.io/engineering-education/building-a-qa-web-application/)
- Build a Deep Q&A Web App with Transformers and Anvil | Python Deep Learning App (https://www.youtube.com/watch?v=G1uGSkANZjQ) (https://github.com/nicknochnack/Q-A-Anvil-App/blob/main/Anvil-Tutorial.ipynb)
- Open Source Generative AI in Question-Answering (NLP) using Python (https://www.youtube.com/watch?v=L8U-pm-vZ4c) (https://docs.pinecone.io/docs/abstractive-question-answering)
- Build a Question Answering Engine (Towhee & Gradio Chatbot) (https://github.com/towhee-io/examples/blob/main/nlp/question_answering/1_build_question_answering_engine.ipynb)
- Question Generation using 🤗 transformers (https://github.com/patil-suraj/question_generation) (https://colab.research.google.com/gist/nrjvarshney/39ed6c80e2fe293b9e7eca5bc3a45b7d/quiz.ipynb) (https://huggingface.co/mrm8488/t5-base-finetuned-question-generation-ap)
- Running Llama 2 on CPU Inference Locally for Document Q&A (https://towardsdatascience.com/running-llama-2-on-cpu-inference-for-document-q-a-3d636037a3d8) $$ (https://github.com/kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference)
- Using LLaMA 2.0, FAISS and LangChain for Question-Answering on Your Own Data (https://medium.com/@murtuza753/using-llama-2-0-faiss-and-langchain-for-question-answering-on-your-own-data-682241488476)
- How to Fine-tune Llama 2 with LoRA for Question Answering: A Guide for Practitioners (https://deci.ai/blog/fine-tune-llama-2-with-lora-for-question-answering/)
- Generative AI Lifecycle Patterns (https://dr-arsanjani.medium.com/the-generative-ai-lifecycle-1b0c7d9463ec)
- Why does the falcon QLoRA tutorial code use eos_token as pad_token? - use TemplateProcessing (https://discuss.huggingface.co/t/why-does-the-falcon-qlora-tutorial-code-use-eos-token-as-pad-token/45954/14?u=brando)
- Pad and eos distinction. (https://chat.openai.com/share/ebb9a9a2-71d3-4c97-a727-b6042494b9a9)
- LLaMA FastTokenizer does not add eos_token_id at the end. #22794 (huggingface/transformers#22794)
- data_collator.py (https://github.com/huggingface/transformers/blob/main/src/transformers/data/data_collator.py#L747)
- Transformers docs: LLM tutorial - wrong padding side (https://huggingface.co/docs/transformers/main/llm_tutorial#wrong-padding-side)
- Transformers docs: Llama 2 resources (https://huggingface.co/docs/transformers/main/model_doc/llama2#resources)
- Challenges in Stop Generation within Llama 2 (https://towardsdatascience.com/challenges-in-stop-generation-within-llama-2-25f5fea8dea2)
- Padding Large Language Models - Examples with Llama 2 (https://towardsdatascience.com/padding-large-language-models-examples-with-llama-2-199fb10df8ff) $$
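
  The workaround discussed across the threads above, sketched minimally: map a missing pad token to eos before batching (example tokenizer; pad positions should be masked out of the loss):

  ```python
  from transformers import AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")  # example of a tokenizer that ships without a pad token
  if tok.pad_token is None:
      tok.pad_token = tok.eos_token            # reuse eos as pad; exclude pad positions from the loss

  batch = tok(["short", "a much longer input"], padding=True, return_tensors="pt")
  print(batch["input_ids"].shape)
  print(batch["attention_mask"])               # 0s mark the padded positions
  ```
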
- [SFTTrainer] Fix non packed dataset #444 - Example of formatting_func on alpaca dataset (huggingface/trl#444)
- Merging QLoRA weights with quantized model (https://gist.github.com/ChrisHayduk/1a53463331f52dca205e55982baf9930)
- LoRA Adapters: When a Naive Merge Leads to Poor Performance (https://kaitchup.substack.com/p/lora-adapters-when-a-naive-merge)
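
  A minimal sketch of merging a LoRA adapter into its base model with PEFT; per the caveat above, merge into the full-precision base rather than the quantized one (checkpoint and adapter path are placeholders):

  ```python
  from peft import PeftModel
  from transformers import AutoModelForCausalLM

  base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # full-precision base (example)
  model = PeftModel.from_pretrained(base, "path/to/lora-adapter")          # placeholder adapter path
  merged = model.merge_and_unload()  # folds the scaled B @ A update into the base weights
  merged.save_pretrained("merged-model")
  ```
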
- FuseChat: Knowledge Fusion of Chat Models (https://github.com/fanqiwan/FuseLLM/tree/main/FuseChat)
- Enprompt 360 - AI Prompts Generator (https://www.kickstarter.com/projects/enprompt360/enprompt-360)
- Awesome ChatGPT Prompts (https://prompts.chat/)
- Prompt Engineering Guide (https://www.promptingguide.ai/techniques)
- The Power of Prompt Engineering: Building Your Own Personal Assistant (https://ai.plainenglish.io/the-power-of-prompt-engineering-personalizing-your-ai-model-5a1b9671b8c5)
- What I Learned Pushing Prompt Engineering to the Limit (https://towardsdatascience.com/what-i-learned-pushing-prompt-engineering-to-the-limit-c40f0740641f)
- Fixing Hallucinations in LLMs (https://betterprogramming.pub/fixing-hallucinations-in-llms-9ff0fd438e33)
- New ChatGPT Prompt Engineering Technique: Program Simulation (https://towardsdatascience.com/new-chatgpt-prompt-engineering-technique-program-simulation-56f49746aa7b)
- AutoGPT Agent Custom Instruction (https://shard-tsunami-ffe.notion.site/AutoGPT-Agent-Custom-Instruction-9826d664c53e4f50a5f814378c19a89d)
- Practitioners guide to fine-tune LLMs for domain-specific use case (https://cismography.medium.com/practitioners-guide-to-fine-tune-llms-for-domain-specific-use-case-part-1-4561714d874f)
- Practical insights while fine-tuning LLMs for domain-specific use cases and best practices (https://cismography.medium.com/practical-insights-while-fine-tuning-llms-for-domain-specific-use-cases-and-best-practices-aa986c799777)
- A New Prompt Engineering Technique Has Been Introduced Called Step-Back Prompting (https://cobusgreyling.medium.com/a-new-prompt-engineering-technique-has-been-introduced-called-step-back-prompting-b00e8954cacb)
- The LangChain Implementation Of DeepMind's Step-Back Prompting (https://cobusgreyling.medium.com/the-langchain-implementation-of-deepminds-step-back-prompting-9d698cf3e0c2)
- Open AI - Prompt engineering guide (https://platform.openai.com/docs/guides/prompt-engineering)
- Meta Llama - How-to guides - Prompting (https://www.llama.com/docs/how-to-guides/prompting)
- Inside the Leaked System Prompts of GPT-4, Gemini 1.5, Claude 3, and More (https://medium.com/gitconnected/inside-the-leaked-system-prompts-of-gpt-4-gemini-1-5-claude-3-and-more-4ecb3d22b447?sk=7e053318c47b260ee482a5c8b319dd83)
- Leaked system prompt gists (https://gist.github.com/kennethleungty/74c8f1ad0c39ca006fddea5da449c390) (https://gist.github.com/kennethleungty/00b5a5d809fdda94eafe5d49ccff7729) (https://gist.github.com/kennethleungty/80ceeba091d7c777abe861ef46558363) (https://gist.github.com/kennethleungty/587693681583da71f90d2da28e733ec3)
- DSPy - a framework for algorithmically optimizing LM prompts and weights (https://github.com/stanfordnlp/dspy)
- Your Language Model Deserves Better Prompting (https://weaviate.io/blog/dspy-optimizers) (https://github.com/weaviate/recipes/blob/main/integrations/weights_and_biases/wandb_logging_RAG_dspy_cohere.ipynb)
- Prompting Fundamentals and How to Apply them Effectively (https://eugeneyan.com/writing/prompting/)
- Discovering Preference Optimization Algorithms with and for Large Language Models - prompt an LLM to propose and implement new preference optimization loss functions based on previously-evaluated performance metrics (https://arxiv.org/pdf/2406.08414)
- Large Language Models Are Human-Level Prompt Engineers - Automatic Prompt Engineer (APE) (https://sites.google.com/view/automatic-prompt-engineer)
- Large Language Models as Optimizers - Optimization by PROmpting (OPRO) (https://github.com/google-deepmind/opro/)
- How to Make ChatGPT Write Like a Human: (7-Step Prompt) to Make Your Content Come Alive! (https://medium.com/@afghanbitani/how-to-make-chatgpt-write-like-a-human-7-step-prompt-to-make-your-content-come-alive-98e0cd51894f) $$
- AutoGPT: the heart of the open-source agent ecosystem (https://github.com/Significant-Gravitas/AutoGPT)
- The Official AutoGPT Forge Tutorial Series (https://aiedge.medium.com/autogpt-forge-e3de53cc58ec)
- AutoGPT Tutorial: Creating an Agent Powered Research Assistant with Auto-GPT-Forge (https://lablab.ai/t/autogpt-tutorial-creating-a-research-assistant-with-auto-gpt-forge)
- Decoding Auto-GPT (https://maartengrootendorst.substack.com/p/decoding-auto-gpt)
- Empower Functions - a family of LLMs that offer GPT-4 level capabilities for real-world "tool using" use cases (https://github.com/empower-ai/empower-functions)
- Multi-Agent Software Development through Cross-Team Collaboration (https://arxiv.org/pdf/2406.08979v1) (https://github.com/OpenBMB/ChatDev)
- GenAI with Python: Build Agents from Scratch (Complete Tutorial) - with Ollama, LangChain, LangGraph (No GPU, No APIKEY) (https://towardsdatascience.com/genai-with-python-build-agents-from-scratch-complete-tutorial-4fc1e084e2ec) $$
- Choosing Between LLM Agent Frameworks (https://towardsdatascience.com/choosing-between-llm-agent-frameworks-69019493b259)
- Atomic Agents (https://github.com/BrainBlend-AI/atomic-agents/) (https://generativeai.pub/forget-langchain-crewai-and-autogen-try-this-framework-and-never-look-back-e34e0b6c8068)
- V : AI Personal Trainer (https://github.com/pannaf/valkyrie)
- How Do Language Models put Attention Weights over Long Context? (https://yaofu.notion.site/How-Do-Language-Models-put-Attention-Weights-over-Long-Context-10250219d5ce42e8b465087c383a034e) (https://github.com/FranxYao/Long-Context-Data-Engineering)
- AI Watermarking 101: Tools and Techniques (https://huggingface.co/blog/watermarking)
- GGUF, the long way around (https://vickiboykis.com/2024/02/28/gguf-the-long-way-around/)
- A Comprehensive Guide to Modeling Techniques in Mixed-Integer Linear Programming - Convert ideas into mathematical expressions to solve operations research problems (https://towardsdatascience.com/a-comprehensive-guide-to-modeling-techniques-in-mixed-integer-linear-programming-3e96cc1bc03d)
- Mastering ML Configurations by leveraging OmegaConf and Hydra (https://decodingml.substack.com/p/mastering-ml-configurations-by-leveraging)
- UltraChat - example training script with Accelerator (https://github.com/thunlp/UltraChat/blob/main/train/train_legacy/train.py)
- Using LESS Data to Tune Models (https://www.cs.princeton.edu/~smalladi/blog/2024/04/04/dataselection/)
- Techniques for training large neural networks - Data parallelism, Pipeline parallelism, Tensor parallelism, Mixture-of-Experts (MoE), and other memory saving designs (https://openai.com/research/techniques-for-training-large-neural-networks)
- Large Scale Transformer model training with Tensor Parallel (TP) (https://pytorch.org/tutorials/intermediate/TP_tutorial.html)
- YaFSDP - a Sharded Data Parallelism framework (https://github.com/yandex/YaFSDP) (https://habr.com/ru/companies/yandex/articles/817509/)
- Best Embedding Model - OpenAI / Cohere / Google / E5 / BGE - An In-depth Comparison of Multilingual Embedding Models (https://medium.com/@lars.chr.wiik/best-embedding-model-openai-cohere-google-e5-bge-931bfa1962dc) - topics dataset (https://github.com/LarsChrWiik/lars_datasets/tree/main/topics_dataset_50)
- Training and Finetuning Embedding Models with Sentence Transformers v3 (https://huggingface.co/blog/train-sentence-transformers)
- What can LLMs never do? (https://www.strangeloopcanon.com/p/what-can-llms-never-do)
- What We've Learned From A Year of Building with LLMs (https://applied-llms.org/)
- Uncensor any LLM with abliteration (https://huggingface.co/blog/mlabonne/abliteration) (https://colab.research.google.com/drive/1VYm3hOcvCpbGiqKZb141gJwjdmmCcVpR?usp=sharing)
- abliterator.py (https://github.com/FailSpy/abliterator)
- TransformerLens - A library for mechanistic interpretability of GPT-style language models (https://github.com/TransformerLensOrg/TransformerLens) (https://transformerlensorg.github.io/TransformerLens/)
- Training a 70B model from scratch: open-source tools, evaluation datasets, and learnings (https://imbue.com/research/70b-intro/)
- From bare metal to a 70B model: infrastructure set-up and scripts (https://imbue.com/research/70b-infrastructure/)
- Ensuring accurate model evaluations: open-sourced, cleaned datasets for models that reason and code (https://imbue.com/research/70b-evals/)
- Aleksa Gordić's Post: Amazing list of techniques for improving the stability of training large ML models (LLMs, diffusion, etc) (https://www.linkedin.com/feed/update/urn:li:activity:7215624025639645184/)
- The AdEMAMix Optimizer: Better, Faster, Older. A simple modification of the Adam optimizer with a mixture of two Exponential Moving Average (EMA) (https://github.com/nanowell/AdEMAMix-Optimizer-Pytorch/)
- Generating Human-level Text with Contrastive Search in Transformers (https://huggingface.co/blog/introducing-csearch)
- A Contrastive Framework for Neural Text Generation (https://github.com/yxuansu/SimCTG)
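
  Contrastive search is activated in transformers by combining `penalty_alpha` with a small `top_k`; a minimal sketch using the blog post's example settings:

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2-large")
  model = AutoModelForCausalLM.from_pretrained("gpt2-large")

  inputs = tok("DeepMind Company is", return_tensors="pt")
  out = model.generate(**inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=64)
  print(tok.decode(out[0], skip_special_tokens=True))
  ```
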
- Open Source LLM Tools (https://huyenchip.com/llama-police)
- What I learned from looking at 900 most popular open source AI tools (https://huyenchip.com/2024/03/14/ai-oss.html)
- Imagen - Pytorch (https://github.com/lucidrains/imagen-pytorch)
- MinImagen - A Minimal implementation of the Imagen text-to-image model (https://github.com/AssemblyAI-Community/MinImagen)
- How Imagen Actually Works (https://www.assemblyai.com/blog/how-imagen-actually-works/)
- MinImagen - Build Your Own Imagen Text-to-Image Model (https://www.assemblyai.com/blog/minimagen-build-your-own-imagen-text-to-image-model/)
- How to Beat Proprietary LLMs With Smaller Open Source Models (https://www.aidancooper.co.uk/how-to-beat-proprietary-llms/)
- A Guide to Structured Outputs Using Constrained Decoding (https://www.aidancooper.co.uk/constrained-decoding/)
- The 6 Best LLM Tools To Run Models Locally (https://medium.com/@amosgyamfi/the-6-best-llm-tools-to-run-models-locally-eedd0f7c2bbd)
- You can now train a 70b language model at home - Training LLMs with QLoRA + FSDP (https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html) (https://github.com/AnswerDotAI/fsdp_qlora/tree/main)
- Bugs in LLM Training - Gradient Accumulation Fix (https://unsloth.ai/blog/gradient)
- 🤗 Fixing Gradient Accumulation (https://huggingface.co/blog/gradient_accumulation)
- SynthID Text - Apply watermarks and identify AI-generated content (https://huggingface.co/blog/synthid-text) (https://deepmind.google/technologies/synthid/)
- PEFT Finetuning Quick Start Notebook (https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/finetuning/quickstart_peft_finetuning.ipynb)
- RAFT: Adapting Language Model to Domain Specific RAG (https://gorilla.cs.berkeley.edu/blogs/9_raft.html)
- Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL (https://huggingface.co/blog/unsloth-trl)
- Unsloth blog (https://unsloth.ai/blog)
- Unsloth repository (https://github.com/unslothai/unsloth)
- Kaggle: Mistral 7B with Unsloth notebook (https://www.kaggle.com/code/danielhanchen/kaggle-mistral-7b-unsloth-notebook)
- Retentive Networks (RetNet) Explained: The much-awaited Transformers-killer is here (https://medium.com/ai-fusion-labs/retentive-networks-retnet-explained-the-much-awaited-transformers-killer-is-here-6c17e3e8add8)
- Mamba Explained - The State Space Model taking on Transformers (https://www.kolaayonrinde.com/blog/2024/02/11/mamba.html)
- Introducing Jamba: AI21's Groundbreaking SSM-Transformer Model (https://www.ai21.com/blog/announcing-jamba)
- ModuleFormer: a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts (https://github.com/IBM/ModuleFormer)
- JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars (https://research.myshell.ai/jetmoe)
- SUPRA: Scalable UPtraining for Recurrent Attention - uptrain a transformer to a linear RNN (https://github.com/TRI-ML/linear_open_lm) (https://huggingface.co/TRI-ML/mistral-supra)
- 📺 Understanding Mamba and State Space Models (https://www.youtube.com/watch?v=iskuX3Ak9Uk)
- 185 real-world gen AI use cases from the world's leading organizations (https://cloud.google.com/transform/101-real-world-generative-ai-use-cases-from-industry-leaders)
- Customers are putting Gemini to work (https://blog.google/products/google-cloud/gemini-at-work-ai-agents/)