Welcome to the "Pretraining LLMs" course! 🧑‍🏫 It dives into the essential steps of pretraining large language models (LLMs).
In this course, you’ll explore pretraining, the foundational step in training LLMs, which involves teaching an LLM to predict the next token using vast text datasets.
🧠 You’ll learn the essential steps to pretrain an LLM, understand the associated costs, and discover cost-effective approaches that leverage smaller existing open-source models.
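To make the core objective concrete, here is a minimal sketch of computing the next-token prediction loss with the Hugging Face transformers library. The checkpoint name is an illustrative assumption, not necessarily a model used in the course.

```python
# Minimal sketch of the pretraining objective: next-token prediction.
# The checkpoint below is an illustrative assumption, not the course's model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "EleutherAI/pythia-70m"  # any small causal LM from the Hub works here
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

enc = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    # With labels=input_ids, the model returns the average cross-entropy of
    # predicting each next token: exactly the signal minimized in pretraining.
    out = model(**enc, labels=enc["input_ids"])
print(f"next-token prediction loss: {out.loss.item():.3f}")
```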
Detailed Learning Outcomes:
- 🧠 Pretraining Basics: Understand the scenarios where pretraining is the optimal choice for model performance. Compare text generation across different versions of the same model to see how base, fine-tuned, and specialized pretrained models differ (a generation-comparison sketch follows this list).
- 🗃️ Creating High-Quality Datasets: Learn how to create and clean a high-quality training dataset from web text and existing datasets, and how to package the data for use with the Hugging Face library (a dataset sketch follows this list).
- 🔧 Model Configuration: Explore ways to configure and initialize a model for training, including modifying Meta’s Llama models and initializing weights either randomly or from other models (a configuration sketch follows this list).
- 🚀 Executing Training Runs: Learn how to configure and execute a training run for your own model (a training-run sketch follows this list).
- 📊 Performance Assessment: Assess your trained model’s performance and explore common evaluation strategies for LLMs, including benchmark tasks used to compare the performance of different models (a perplexity sketch follows this list).
- 🧩 Pretraining Process: Gain in-depth knowledge of the steps to pretrain an LLM, from data preparation to model configuration and performance assessment.
- 🏗️ Model Architecture Configuration: Explore options for configuring your model’s architecture, including modifying Meta’s Llama models and innovative pretraining techniques like Depth Upscaling, which can reduce training costs by up to 70% (a layer-stacking sketch follows this list).
- 🛠️ Practical Implementation: Learn how to pretrain a model from scratch and how to continue pretraining an existing pretrained model on your own data.
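🧠 The generation-comparison sketch referenced above: load two versions of the same small model and compare their completions for one prompt. The checkpoint names are assumptions based on small open Upstage models, not a guaranteed course recipe.

```python
# Hedged sketch: compare text generation across versions of the same model.
# Checkpoint names are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "I am an engineer. I love"
for name in ["upstage/TinySolar-248m-4k",                 # base model
             "upstage/TinySolar-248m-4k-code-instruct"]:  # fine-tuned variant
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=40, do_sample=False)
    print(f"{name}:\n{tokenizer.decode(out[0], skip_special_tokens=True)}\n")
```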
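🗃️ The dataset sketch referenced above: a hedged example of cleaning a handful of documents and packaging them with the Hugging Face datasets library. The filtering heuristics and output path are assumptions, far simpler than a production pipeline.

```python
# Hedged sketch: clean raw web text and package it with the `datasets` library.
# The filtering heuristics here are illustrative, not the course's pipeline.
from datasets import Dataset

raw_texts = [
    "  First cleaned web document ...  ",
    "Second document with enough content to keep ...",
    "short",                                             # too short: dropped
    "Second document with enough content to keep ...",   # duplicate: dropped
]

seen, cleaned = set(), []
for text in raw_texts:
    text = text.strip()
    if len(text) >= 20 and text not in seen:  # drop short and duplicate docs
        seen.add(text)
        cleaned.append(text)

ds = Dataset.from_dict({"text": cleaned})
ds.save_to_disk("pretraining_corpus")  # or ds.push_to_hub(...) to share it
print(ds)
```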
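🔧 The configuration sketch referenced above: define a small Llama-architecture model with random weights, or warm-start from an existing checkpoint. All sizes and the checkpoint name are illustrative assumptions.

```python
# Hedged sketch: configure a Llama-architecture model and choose how to
# initialize its weights. Sizes and names are illustrative assumptions.
from transformers import AutoModelForCausalLM, LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=32000,
    hidden_size=1024,
    intermediate_size=4096,
    num_hidden_layers=12,
    num_attention_heads=16,
)
random_model = LlamaForCausalLM(config)  # option 1: random initialization

# Option 2: initialize from the weights of an existing open model
# (checkpoint name is an assumption for illustration).
warm_model = AutoModelForCausalLM.from_pretrained("upstage/TinySolar-248m-4k")
```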
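🚀 The training-run sketch referenced above, using the Hugging Face Trainer. The tiny inline dataset and every hyperparameter are assumptions chosen only to keep the example self-contained; a real run trains on a large packaged corpus.

```python
# Hedged sketch of a training run with the Hugging Face Trainer.
# Hyperparameters and the tiny dataset are illustrative only.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

name = "upstage/TinySolar-248m-4k"  # assumption: small model for illustration
tokenizer = AutoTokenizer.from_pretrained(name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding in the collator
model = AutoModelForCausalLM.from_pretrained(name)

raw = Dataset.from_dict({"text": ["A tiny stand-in for a pretraining corpus."] * 64})
train_ds = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM labels

args = TrainingArguments(
    output_dir="pretrain-run",
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    max_steps=20,        # a real run would be far longer
    logging_steps=5,
)
Trainer(model=model, args=args, train_dataset=train_ds,
        data_collator=collator).train()
```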
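📊 The perplexity sketch referenced above: a quick first check of a trained model is its perplexity on held-out text; standardized benchmark suites go much further. The checkpoint and evaluation text are illustrative assumptions.

```python
# Hedged sketch: perplexity on held-out text, a quick evaluation signal.
# The checkpoint and evaluation text are illustrative assumptions.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "EleutherAI/pythia-70m"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

enc = tokenizer("Some held-out text the model has not trained on.",
                return_tensors="pt")
with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss  # mean next-token loss
print(f"perplexity: {math.exp(loss.item()):.2f}")      # lower is better
```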
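🏗️ The layer-stacking sketch referenced above: the general Depth Upscaling recipe is to stack two overlapping copies of a smaller model's transformer layers to initialize a deeper model, which is then further pretrained. The checkpoint and trim fraction are assumptions; this follows the published depth up-scaling idea rather than the course's exact code.

```python
# Hedged sketch of Depth Upscaling: initialize a deeper model by stacking two
# overlapping copies of a smaller model's layers, then continue pretraining.
# The checkpoint and trim fraction are illustrative assumptions.
import copy
from transformers import AutoModelForCausalLM

src = AutoModelForCausalLM.from_pretrained("upstage/TinySolar-248m-4k")
n = src.config.num_hidden_layers  # layer count of the source model
drop = n // 4                     # layers trimmed at the seam of each copy

cfg = copy.deepcopy(src.config)
cfg.num_hidden_layers = 2 * (n - drop)       # deeper target architecture
dst = AutoModelForCausalLM.from_config(cfg)  # same width, random weights

# Reuse the source model's embeddings, final norm, and LM head.
dst.model.embed_tokens.load_state_dict(src.model.embed_tokens.state_dict())
dst.model.norm.load_state_dict(src.model.norm.state_dict())
dst.lm_head.load_state_dict(src.lm_head.state_dict())

# Bottom copy keeps the first (n - drop) layers; top copy keeps the last.
stacked = list(src.model.layers[: n - drop]) + list(src.model.layers[drop:])
for dst_layer, src_layer in zip(dst.model.layers, stacked):
    dst_layer.load_state_dict(src_layer.state_dict())
# `dst` is now ready for continued pretraining on new data.
```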
Course Instructors:
- 👨‍🏫 Sung Kim: CEO of Upstage, bringing extensive expertise in LLM pretraining and optimization.
- 👩‍🔬 Lucy Park: Chief Scientific Officer of Upstage, with a deep background in scientific research and LLM development.
🔗 To enroll in the course or for further information, visit 📚 deeplearning.ai.