add unified checkpoint training args doc
DesmonDay committed Jan 2, 2024
1 parent 1982091 commit 2fa74f2
Showing 1 changed file with 22 additions and 0 deletions.
22 changes: 22 additions & 0 deletions docs/trainer.md
@@ -1,3 +1,4 @@
trainer.md
# PaddleNLP Trainer API

PaddleNLP provides the Trainer training API, which wraps the common configuration of the training process, for example:
@@ -661,6 +662,27 @@ Trainer is a simple but fully featured Paddle training and evaluation module, and
The path to a folder with a valid checkpoint for your
model. (default: None)
--unified_checkpoint
Whether to unify the checkpoints of hybrid parallel training. (optional, default: False)
--unified_checkpoint_config
Optimization options for Unified Checkpoint, passed in as a str.
The following options are supported:
skip_save_model_weight: skip saving the model weights when master_weights exist.
master_weight_compatible: 1. load master_weights only when the optimizer needs them;
2. if the checkpoint contains no master_weights, load the model weights as master_weights.
async_save: save checkpoints to disk asynchronously, so that saving does not block training and training efficiency improves.
enable_all_options: enable all of the options above.
--skip_memory_metrics
Whether to skip adding memory profiler reports. (optional, default: True, i.e. skipped)
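
The way a `--unified_checkpoint_config` string maps to enabled options can be pictured with a small stand-alone sketch. It does not import PaddleNLP and is not its implementation: the helper name `parse_unified_checkpoint_config` is hypothetical, and the whitespace separator between option names is an assumption, since the text above only says the value is passed as a str.

```python
from typing import Set

# Option names taken from the --unified_checkpoint_config description above.
KNOWN_OPTIONS = {
    "skip_save_model_weight",
    "master_weight_compatible",
    "async_save",
}
ENABLE_ALL = "enable_all_options"


def parse_unified_checkpoint_config(config: str) -> Set[str]:
    """Hypothetical helper: expand a config string into the enabled options.

    Assumption: option names are separated by whitespace.
    """
    requested = set(config.split())
    unknown = requested - KNOWN_OPTIONS - {ENABLE_ALL}
    if unknown:
        raise ValueError(f"Unknown unified checkpoint options: {sorted(unknown)}")
    # enable_all_options switches on every individual option.
    if ENABLE_ALL in requested:
        return set(KNOWN_OPTIONS)
    return requested


if __name__ == "__main__":
    print(sorted(parse_unified_checkpoint_config("skip_save_model_weight async_save")))
    print(sorted(parse_unified_checkpoint_config("enable_all_options")))
```

Rejecting unknown names early is a design choice of this sketch; it surfaces a mistyped option before a long training run starts rather than silently ignoring it.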