documentation for the checkpointing.
hariharan-devarajan committed Jan 8, 2024
1 parent 0c058ce commit 3f28662
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion docs/source/config.rst
@@ -300,7 +300,20 @@ checkpoint
      - performing one checkpointing per certain number of steps specified
    * - model_size
      - 10240
-     - the size of the model in bytes
+     - the size of the model parameters per GPU in bytes
+   * - optimization_groups
+     - []
+     - List of optimization group tensors. Use array notation in YAML.
+   * - num_layers
+     - 1
+     - Number of layers to checkpoint. Each layer is checkpointed separately.
+   * - layer_parameters
+     - []
+     - List of parameters per layer. This is used to perform I/O per layer.
+   * - type
+     - rank_zero
+     - Which ranks perform the checkpoint: all ranks (all_ranks) or rank 0 only (rank_zero).
+

 .. note::

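For illustration, the options documented in this change could be combined in a ``checkpoint`` section along the following lines. This is a minimal sketch and not part of the commit: the key names and the ``model_size`` default come from the table above, while every other value is a made-up placeholder. The units of the list entries are an assumption, since the table does not state them.

.. code-block:: yaml

   # Hypothetical example; all values except the model_size default are placeholders.
   checkpoint:
     model_size: 10240                       # model parameters per GPU, in bytes (documented default)
     optimization_groups: [1024, 512, 256]   # YAML array notation; entry units are assumed
     num_layers: 2                           # each layer is checkpointed separately
     layer_parameters: [4096, 2048]          # used for per-layer I/O; entry units are assumed
     type: all_ranks                         # or rank_zero (the documented default)

With ``type: rank_zero``, only rank 0 would write the checkpoint. With ``all_ranks``, every rank takes part.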
