【Hackathon 5th No.40】Add ASGD API to Paddle RFC #747

Merged: 8 commits, Jan 24, 2024

WintersMontagne10335
Contributor

paddle-bot (bot) commented on Nov 11, 2023

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

WintersMontagne10335 changed the title from "【Hackathon 5th No.41】Add ASGD API to Paddle RFC" to "【Hackathon 5th No.40】Add ASGD API to Paddle RFC" on Nov 11, 2023
## 1. Background

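Stochastic average gradient descent (hereafter `ASGD`) is a variant of `SGD` that trades space for time; it is a trajectory-averaging stochastic optimization method. On top of `SGD`, `ASGD` keeps an average of the historical parameters, so that the variance of the noise in the descent direction shrinks over time and the algorithm eventually converges to the optimum at a linear rate.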
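
To make the trajectory-averaging idea concrete, here is a minimal sketch. It assumes the simplest form of averaging, a uniform running mean of the SGD iterates, on a hypothetical one-dimensional objective f(x) = (x - 3)^2 / 2 with Gaussian gradient noise. The function name `averaged_sgd`, the objective, and all constants are illustrative and are not taken from the RFC or the paper.

```python
import random


def averaged_sgd(steps=2000, lr=0.05, noise=1.0, seed=0):
    """Run noisy SGD on f(x) = (x - 3)^2 / 2 and average the iterates.

    Returns (last_iterate, running_average, all_iterates).
    """
    rng = random.Random(seed)
    x = 0.0      # current SGD iterate
    avg = 0.0    # running mean of the iterates seen so far
    iterates = []
    for t in range(1, steps + 1):
        grad = (x - 3.0) + rng.gauss(0.0, noise)  # noisy gradient of f
        x -= lr * grad                            # plain SGD step
        avg += (x - avg) / t                      # incremental mean update
        iterates.append(x)
    return x, avg, iterates


last, avg, xs = averaged_sgd()
print(f"last iterate: {last:.4f}, averaged iterate: {avg:.4f}")
```

The averaged iterate usually fluctuates less than the last iterate, because part of the gradient noise cancels in the mean.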

Contributor

Please link the paper here and summarize the core steps of ASGD.

Contributor Author

Done

mu.copy_(new_mu)
```

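This differs somewhat from the `ASGD` described in the original paper but is very close to `SGD`. The biggest difference from `SGD` is that `eta` (the effective learning rate) is updated according to `step`.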
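
The role of `step` in such an update can be sketched as follows. This assumes a schedule of the form eta = lr / (1 + lambd * lr * step)^alpha and an averaging weight mu = 1 / max(1, step - t0), which is one way an `eta`/`mu` update can be written. The helper name `asgd_schedule`, the hyperparameter names `lambd`, `alpha`, and `t0`, and their default values are assumptions for illustration; they do not appear in this excerpt.

```python
def asgd_schedule(step, lr=0.01, lambd=1e-4, alpha=0.75, t0=1e6):
    """Return (eta, mu) after `step` updates (hypothetical helper).

    eta: effective learning rate, decaying as step grows.
    mu:  weight of the newest iterate in the running average; it stays
         at 1 (no averaging yet) until step exceeds t0 + 1.
    """
    eta = lr / (1.0 + lambd * lr * step) ** alpha
    mu = 1.0 / max(1.0, step - t0)
    return eta, mu


# With a small t0, averaging starts early, which makes the effect visible.
for step in (0, 10, 12, 1000):
    eta, mu = asgd_schedule(step, t0=10)
    print(f"step={step:5d}  eta={eta:.8f}  mu={mu:.4f}")
```

With `lambd = 0` the schedule collapses to a constant `eta = lr`, which is why the update stays so close to plain `SGD`.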

Contributor

Please add here what the ASGD implementation should actually look like and what is currently missing.

Contributor Author

Done


Add the Python-level interface:

- `paddle.optimizer.ASGD`

Contributor

The important functions (for example `step`) should also be designed up front.

Contributor Author

OK, I'll add this once the basic implementation is done. It will be finished before next Monday (11.27).

Contributor Author

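The basic version is fairly simple and is already done.
One problem came up: compared with the optimized version, the basic version's parameter list is incomplete. I don't fully understand the optimized version yet, so I can't complete the RFC right now.
It will be completed by 12.15 at the latest.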

Contributor Author

done

Contributor Author

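  • The `step` function here is the one from the parent class `Optimizer`, so it does not need to be redesigned.
  • The optimized version in the paper is implemented for one special case and is not general. Only one of its methods can be used in this API. The workload is small and can be finished within three days.
  • Quite a few parts of the RFC are still incomplete. I'll finish the code first and fill them in afterwards.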
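
The first point, reusing the parent's `step` and overriding only the per-parameter rule, can be illustrated with a small stand-alone sketch. `OptimizerBase`, `ASGDSketch`, `_update`, and the dict-based parameter layout are all hypothetical names invented here; they are not Paddle's actual classes or hooks, and the update rule is a placeholder rather than the RFC's ASGD rule.

```python
class OptimizerBase:
    """Hypothetical stand-in for a parent optimizer class."""

    def __init__(self, learning_rate, parameters):
        self.learning_rate = learning_rate
        # Each parameter is a dict holding a float "value" and a float "grad".
        self.parameters = parameters

    def step(self):
        """Shared driver: visit every parameter and apply the subclass rule."""
        for p in self.parameters:
            if p["grad"] is not None:  # skip parameters without a gradient
                p["value"] = self._update(p["value"], p["grad"])

    def _update(self, value, grad):
        raise NotImplementedError


class ASGDSketch(OptimizerBase):
    """Overrides only the per-parameter rule; step() is inherited."""

    def _update(self, value, grad):
        # Placeholder rule (a plain gradient step), not the RFC's ASGD update.
        return value - self.learning_rate * grad


params = [{"value": 1.0, "grad": 0.5}, {"value": -2.0, "grad": None}]
opt = ASGDSketch(learning_rate=0.1, parameters=params)
opt.step()
print(params)
```

Keeping the driver loop in the parent means a new optimizer only has to supply its own update rule, which matches the reasoning in the comment.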

Contributor Author

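I can't finish today. The static-graph unit tests had me stuck for a long time, and I only just resolved that. About 80% of the unit tests are written, and I still need to check whether anything needs to be added.
I should be able to finish tomorrow.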


TODO

# 6. Testing and Acceptance Considerations

Contributor

Boundary scenarios also need to be constructed.

Contributor Author

Done

parameters=None,
weight_decay=None,
grad_clip=None,
name=None

Contributor

Shall we add a `multi_precision` parameter, to be consistent with the code in #58834?

Contributor Author

Done
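
For context on why a `multi_precision` switch can matter, here is a small sketch. It assumes the usual meaning of the flag, keeping a higher-precision master copy of a half-precision parameter during updates; the thread itself does not spell this out. The helpers `to_fp16` and `run`, the constants, and the use of a Python float as the master copy are illustrative assumptions.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def run(steps, lr, grad, multi_precision):
    """Apply `steps` gradient steps to a half-precision parameter.

    With multi_precision=True the update is accumulated in a
    higher-precision master copy and only then cast down to FP16.
    """
    param = to_fp16(1.0)
    master = 1.0  # Python float standing in for a higher-precision master weight
    for _ in range(steps):
        if multi_precision:
            master -= lr * grad
            param = to_fp16(master)
        else:
            param = to_fp16(param - lr * grad)  # tiny update may round away
    return param


print("fp16 only:    ", run(100, 1e-4, 1.0, multi_precision=False))
print("master weight:", run(100, 1e-4, 1.0, multi_precision=True))
```

In this toy setting each update is smaller than half the FP16 spacing just below 1.0, so the FP16-only parameter never moves, while the master-weight path accumulates the updates.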

luotao1 merged commit a40c787 into PaddlePaddle:master on Jan 24, 2024
1 check passed