diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index ec1293b81..c8823a28e 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -50,7 +50,7 @@ We use ``black`` as our style guide. To fix your format run `pip install pre-com
 1. Methods should be atomic. A method shouldn't be longer than 75 lines, e.g. can be fit into the computer screen without scrolling.
 1. If a method has arguments that don't fit into one line, each argument should be in its own line for readability.
 1. Add ``__init__.py`` for every folder.
-1. F-strings are prefered to formatted strings.
+1. F-strings are preferred to formatted strings.
 1. Loggers are preferred to print. Use the logger from NeMo via ``from nemo.utils import logging``
 1. Private functions (functions start with ``_``) shouldn't be called outside its host file.
 1. If a comment lasts multiple lines, use ``'''`` instead of ``#``.
diff --git a/README.md b/README.md
index 66029e4da..acd0c0eec 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ NeMo-Aligner is a scalable toolkit for efficient model alignment. The toolkit ha
 The NeMo-Aligner toolkit is built using the NeMo Framework, which enables scalable training across thousands of GPUs using tensor, data, and pipeline parallelism for all alignment components. Additionally, our checkpoints are cross-compatible with the NeMo ecosystem, facilitating inference deployment and further customization (https://github.com/NVIDIA/NeMo-Aligner).
 
-The toolkit is currently in it's early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.
+The toolkit is currently in its early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.
 
 ## Key Features
diff --git a/docs/RLHFTraining.md b/docs/RLHFTraining.md
index 77113f702..68776831d 100644
--- a/docs/RLHFTraining.md
+++ b/docs/RLHFTraining.md
@@ -49,5 +49,5 @@ We have many optimizations available for performance that you can enable.
 * `model.ppo.length_params.max_length`: This sets the max tokens to generate in rollouts, it defaults to half the sequence length of the model. But if you know your model is not too verbose (e.g. after supervised fine tuning) then you can set it to a lower number.
 
 #### Critic performance optimization hyperparameters
-* `trainer.ppo.combine_rm_and_critic_server`: When enabled, inference requests to the critic server will also return the rewards. This saves the need of having to run a seperate reward model server..
+* `trainer.ppo.combine_rm_and_critic_server`: When enabled, inference requests to the critic server will also return the rewards. This removes the need to run a separate reward model server.
 * `model.offload_adam_states`: When enabled, offload the distributed adam optimizer states onto CPU during inference. This allows us to save memory during inference for a bigger `trainer.ppo.inference_micro_batch_size`. No effect if the optimizer is not distributed adam.
diff --git a/docs/user-guide/rlhf.rst b/docs/user-guide/rlhf.rst
index 30ac8f8bf..623c25dbe 100644
--- a/docs/user-guide/rlhf.rst
+++ b/docs/user-guide/rlhf.rst
@@ -407,7 +407,7 @@ We test the scaling of our TRT-LLM integration by running Llama3 70B Actor and L
 +------------------+-------------------+-----------------------------+----------------------+--------------------+
 
 .. note::
-   for 64x32 config we used a ``rollout_micro_batch_size`` of 16 instead of 8 due to the additional memory from the the distributed optimizer.
+   For the 64x32 config, we used a ``rollout_micro_batch_size`` of 16 instead of 8 due to the additional memory from the distributed optimizer.
 
 We also support running RLHF on Llama3.1 405B Actor and Reward Model.
 The following numbers are generated with ``num_rollout_samples=128``, ``global_batch_size=128``, reshard turned off, engine offloading set to False.
diff --git a/docs/user-guide/steerlm.rst b/docs/user-guide/steerlm.rst
index 2f98d3d99..0d6838037 100644
--- a/docs/user-guide/steerlm.rst
+++ b/docs/user-guide/steerlm.rst
@@ -395,7 +395,7 @@ Run Inference
       web_server=False \
       port=1427
 
-   Please wait for the server to be ready before proceeeding.
+   Please wait for the server to be ready before proceeding.
 
 #. Create Python helper functions:
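The CONTRIBUTING.md hunk above touches two style-guide rules: f-strings are preferred to `.format()`-style formatted strings, and the logger is preferred to `print`. The following sketch is not part of the patch; it simply illustrates those two conventions. It uses the stdlib `logging` module as a stand-in so it runs anywhere, whereas in actual NeMo code the quoted guideline calls for ``from nemo.utils import logging``; the function and names here are hypothetical.

```python
import logging

# Stand-in for the NeMo logger; real NeMo code would use
# `from nemo.utils import logging` per the CONTRIBUTING guideline.
logger = logging.getLogger("nemo_aligner_example")


def describe_run(model_name: str, num_gpus: int) -> str:
    # Preferred: f-string interpolation.
    message = f"Training {model_name} on {num_gpus} GPUs"
    # Discouraged by the guideline:
    # message = "Training {} on {} GPUs".format(model_name, num_gpus)

    # Preferred over print() for diagnostics.
    logger.info(message)
    return message


describe_run("Llama3 70B", 64)
```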