support save load optimizer master_weights #60027
Conversation
… flatten_and_dedup_for_save_load
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
… flatten_and_dedup_for_save_load
… flatten_and_dedup_for_save_load
@@ -61,5 +61,47 @@ def compute_local_shape_and_global_offset(

def flatten_state_dict(state_dict):
Does it support multiple levels, e.g., {'model': {'m': {'w': xxx}}}?
Yes, multiple levels are supported.
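For illustration, here is a minimal sketch of a recursive flatten that handles arbitrarily nested dicts like the example above. The dotted key-joining scheme is an assumption for illustration, not necessarily the PR's exact implementation.

```python
# Minimal sketch (not the PR's exact code): recursively flatten a nested
# state_dict, joining nested keys with "." (the separator is an assumption).
def flatten_state_dict(state_dict, prefix=""):
    flat = {}
    for key, value in state_dict.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, so {'model': {'m': {'w': x}}} becomes {'model.m.w': x}.
            flat.update(flatten_state_dict(value, full_key))
        else:
            flat[full_key] = value
    return flat


nested = {"model": {"m": {"w": "tensor_w"}}, "opt": {"lr": 0.1}}
assert flatten_state_dict(nested) == {"model.m.w": "tensor_w", "opt.lr": 0.1}
```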
)

-tensor.set(load_para_np, framework._current_expected_place())
+var.set_value(state_dict[var_tmp.name])
why change here?
This is for code reuse: the behavior of the set_value API already covers the logic of the deleted code, and it also supports assigning a distributed tensor.
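A minimal sketch of the replacement described above, assuming a dygraph Tensor; the surrounding setup is illustrative only.

```python
import paddle

# Before: the loaded value was converted to numpy and copied onto the
# underlying tensor manually via tensor.set(np_value, place).
# After: set_value performs that copy itself (choosing the current place),
# and per the reply above it also accepts a distributed tensor as the source.
var = paddle.zeros([2, 3], dtype="float32")
state_dict = {"w": paddle.ones([2, 3], dtype="float32")}
var.set_value(state_dict["w"])  # replaces the manual numpy + tensor.set path
```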
@@ -74,6 +73,16 @@ def dedup_storage_metadata(global_storage_metadata):
    return out


def dedup_tensor(state_dict, local_storage_metadata, dedup_storage_metadata):
add some comments
OK. That said, this function can be treated as private to this file and is not exposed externally. The comment has been added.
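A rough sketch of what such a module-private dedup helper could look like. The metadata layout assumed here (storage key mapped to the rank responsible for saving it) is an assumption and may differ from the PR's actual structures.

```python
def _dedup_tensor(state_dict, local_storage_metadata, dedup_storage_metadata):
    # Sketch only; the real metadata layout may differ. Both arguments are
    # assumed to map a tensor/storage key to the rank that should write it:
    # local_storage_metadata is this rank's view, dedup_storage_metadata is
    # the globally deduplicated assignment.
    for key, local_rank in local_storage_metadata.items():
        assigned_rank = dedup_storage_metadata.get(key)
        if assigned_rank is not None and assigned_rank != local_rank:
            # After deduplication another rank owns this tensor, so drop it
            # from the local state_dict to avoid writing duplicate copies.
            state_dict.pop(key, None)
    return state_dict
```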
LGTM
LGTM
* exclude xpu
* dedup tensor in state_dict
* polish
* support flatten and unflatten state_dict
* test flatten
* rename test
* fix dedup tensor test
* fix test
* fix load state dict
* rename
* fix test
* support save load optimizer master weights
* add comment
PR types
Others
PR changes
Others
Description
card-78318
Support flattening state_dict, saving/loading optimizer master_weights, and deduplicating tensors when saving a state_dict.
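A hypothetical end-to-end usage sketch. The entry points dist.save_state_dict / dist.load_state_dict and their signatures are assumptions based on the checkpoint module this PR touches, and master_weights are only populated once the optimizer actually runs in mixed precision.

```python
import paddle
import paddle.distributed as dist

# Hypothetical setup; multi_precision=True makes AdamW keep FP32 master
# weights once it updates FP16 parameters during mixed-precision training.
model = paddle.nn.Linear(8, 8)
opt = paddle.optimizer.AdamW(parameters=model.parameters(), multi_precision=True)

# Combine model and optimizer states into one nested state_dict; per this PR
# it is flattened, master_weights are included, and duplicate tensors are
# deduplicated across ranks before writing.
state_dict = {"model": model.state_dict(), "optimizer": opt.state_dict()}
dist.save_state_dict(state_dict, "./checkpoint")

# Loading fills the provided state_dict in place, including master_weights.
dist.load_state_dict(state_dict, "./checkpoint")
```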