This directory contains recipes for pre-training and fine-tuning large language models (LLMs) using NeMo.
A recipe in NeMo is a Python file that defines a complete configuration for training or fine-tuning an LLM. Each recipe typically includes:
- Model configuration: Defines the architecture and hyperparameters of the LLM.
- Training configuration: Specifies settings for the PyTorch Lightning Trainer, including distributed training strategies.
- Data configuration: Sets up the data pipeline, including batch sizes and sequence lengths.
- Optimization configuration: Defines the optimizer and learning rate schedule.
- Logging and checkpointing configuration: Specifies how to save model checkpoints and log training progress.
Recipes are designed to be modular and extensible, allowing users to easily customize settings for their specific use cases.
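To make this concrete, the snippet below is a minimal sketch (assuming the NeMo 2.0 Python API, with `llama3_8b` used purely as an example and the `name`/`dir` values being illustrative) of how these configuration groups appear on a recipe object:

```python
# A minimal sketch, assuming the NeMo 2.0 Python API; llama3_8b is used
# only as an example, and the name/dir values below are illustrative.
from nemo.collections import llm

recipe = llm.llama3_8b.pretrain_recipe(
    name="llama3_8b_example",   # experiment name (illustrative)
    dir="/tmp/nemo_runs",       # checkpoint/log directory (illustrative)
    num_nodes=1,
    num_gpus_per_node=8,
)

# Each attribute maps to one of the configuration groups listed above:
print(recipe.model)    # model configuration: architecture and hyperparameters
print(recipe.trainer)  # PyTorch Lightning Trainer settings, incl. parallelism
print(recipe.data)     # data pipeline: batch sizes, sequence length
print(recipe.optim)    # optimizer and learning-rate schedule
print(recipe.log)      # logging and checkpointing
```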
You can use these recipes via the NeMo CLI (provided by NeMo-Run):

```bash
nemo llm <task> --factory <recipe_name>
```
Where:
- `<task>` is either `pretrain` or `finetune`
- `<recipe_name>` is the name of the recipe (e.g. `llama3_8b`)
For example:

```bash
nemo llm pretrain --factory llama3_8b
```
Important: When launching the recipes with multiple processes (i.e. on multiple GPUs), add the `-y` option to the command to avoid user confirmation prompts. For example: `nemo llm pretrain --factory llama3_8b -y`.
You can override any parameter in the recipe:

```bash
nemo llm pretrain --factory llama3_8b trainer.max_steps=2000
```
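The same kind of override can also be applied programmatically. The snippet below is a sketch assuming the NeMo 2.0 Python API and a single-node local executor; adjust the executor and recipe arguments to your environment:

```python
# A sketch of the equivalent override in Python, assuming the NeMo 2.0
# Python API and a local single-node executor.
import nemo_run as run
from nemo.collections import llm

recipe = llm.llama3_8b.pretrain_recipe(num_nodes=1, num_gpus_per_node=8)
recipe.trainer.max_steps = 2000  # same effect as trainer.max_steps=2000 on the CLI

run.run(recipe, executor=run.LocalExecutor())
```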
For more details on running recipes, see the pre-train documentation.
See ADD-RECIPE.md for instructions on how to add a new recipe.