【Hackathon 5th No.12】Add AdaptiveLogSoftmaxWithLoss API to Paddle #770

Merged

luotao1 merged 4 commits into PaddlePaddle:master from Patrick-Star125:softmax

Dec 11, 2023

Contributor

Patrick-Star125 commented Dec 2, 2023

Revise the AdaptiveLogSoftmaxWithLoss API design document


          add AdaptiveLogSoftmaxWithLoss to Paddle

6ce15e1

paddle-bot bot commented Dec 2, 2023

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.

paddle-bot bot added the contributor label

Patrick-Star125 mentioned this pull request

【Hackathon 5th No.12】Add AdaptiveLogSoftmaxWithLoss API to Paddle PaddlePaddle/Paddle#59623

Closed

2 tasks

luotao1 mentioned this pull request

【PaddlePaddle Hackathon 5th】Open Source Contribution Individual Challenge PaddlePaddle/Paddle#57262

Open

luotao1 assigned luotao1 and GGBond8488

GGBond8488 reviewed

rfcs/APIs/20231202_api_design_for_AdaptiveLogSoftmaxWithLoss.md Outdated


		adaptive_log_softmax_with_loss is computed in the following steps:

		$\text{head_output} = \text{linear}(\text{input}, \text{head_weight}, \text{head_bias})$

Contributor

GGBond8488 Dec 5, 2023

The formula formatting seems to be off.

Contributor Author

Patrick-Star125 Dec 5, 2023

Replaced it with an image.

rfcs/APIs/20231202_api_design_for_AdaptiveLogSoftmaxWithLoss.md Outdated

+              $\text{output} += \text{take_along_axis}(\text{head_logprob}, \text{gather_inds.unsqueeze(1)}, \text{axis}=1).\text{squeeze()}$
+              $\text{loss} = -\text{output.mean()}$
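               ## 3. Significance
               In natural language processing, when the vocabulary is very large, the embedding accounts for most of the model's parameters.
               For example, in a machine translation task with a vocabulary of about 2^17 entries and an embedding dimension of 1024, this yields on the order of 100 million parameters,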

Contributor

GGBond8488 Dec 5, 2023

Is this statement about sharing accurate?

Contributor Author

Patrick-Star125 Dec 5, 2023

Removed.

Patrick-Star125 added 2 commits

December 5, 2023 16:06


          update

6e29bdd


          update

ec7a0af

Patrick-Star125 force-pushed the softmax branch from e5483e0 to ec7a0af Compare

December 5, 2023 08:39

GGBond8488 reviewed

rfcs/APIs/20231202_api_design_for_AdaptiveLogSoftmaxWithLoss.md Outdated


		adaptive_log_softmax_with_loss is computed in the following steps:

		![image](https://github.com/PaddlePaddle/community/assets/69072522/3d43f3e9-deb0-4d52-96be-2cd85a104b90)

Contributor

GGBond8488 Dec 7, 2023

The image still seems to have a problem: that `=1` should be `axis=1`, right? Also, please explain what each layer is doing.
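For readers following the `axis=1` point, here is a minimal sketch of the gather step quoted earlier in the thread (`take_along_axis(head_logprob, gather_inds.unsqueeze(1), axis=1).squeeze()` followed by `loss = -output.mean()`). It uses plain Python lists rather than Paddle tensors; the helper `take_along_axis1` and the sample values are illustrative assumptions, not code from the RFC.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]

def take_along_axis1(matrix, indices):
    """Hypothetical pure-Python stand-in for gathering one column per row (axis=1)."""
    return [row[j] for row, j in zip(matrix, indices)]

# Illustrative values only: a batch of 2 samples with 3 head logits each.
head_output = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]]
gather_inds = [0, 2]  # one column index per row

head_logprob = [log_softmax(row) for row in head_output]
output = take_along_axis1(head_logprob, gather_inds)  # per-sample target log-prob
loss = -sum(output) / len(output)                     # loss = -output.mean()
```

Gathering along `axis=1` picks one entry per row, so `output` has one value per sample; gathering along axis 0 would instead index across the batch, which is not what the loss needs.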

rfcs/APIs/20231202_api_design_for_AdaptiveLogSoftmaxWithLoss.md


		Layer class API: `paddle.nn.AdaptiveLogSoftmaxWithLoss(in_features, n_classes, cutoffs, div_value=4.0, head_bias=False, name=None)`, with two main methods:
		- forward(self, input, label), used for training; returns `output` and `loss`
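As a rough illustration of how `cutoffs` and `div_value` could shape the layer, the sketch below computes the head size and the per-cluster tail sizes. It assumes the design follows the same layout as `torch.nn.AdaptiveLogSoftmaxWithLoss` (head covers the shortlist plus one slot per cluster; tail `i` projects to `in_features // div_value ** (i + 1)`); that assumption and the helper name `cluster_layout` are not taken from the RFC text shown here.

```python
def cluster_layout(in_features, n_classes, cutoffs, div_value=4.0):
    """Hypothetical helper: derive head/tail sizes from the constructor arguments."""
    cutoffs = list(cutoffs)
    # Cutoffs must be unique, increasing, positive, and below n_classes.
    if (not cutoffs or cutoffs != sorted(set(cutoffs))
            or cutoffs[0] <= 0 or cutoffs[-1] >= n_classes):
        raise ValueError("cutoffs must be unique, sorted, and in (0, n_classes)")

    bounds = cutoffs + [n_classes]       # class-index boundaries of all groups
    shortlist_size = bounds[0]           # frequent classes handled by the head
    n_clusters = len(bounds) - 1         # one tail cluster per remaining interval
    head_size = shortlist_size + n_clusters

    tails = []
    for i in range(n_clusters):
        hidden = int(in_features // (div_value ** (i + 1)))  # shrinking projection
        out = bounds[i + 1] - bounds[i]                      # classes in cluster i
        tails.append((hidden, out))
    return head_size, tails

# Illustrative values: 16 input features, 20 classes, cutoffs at 5 and 10.
head_size, tails = cluster_layout(16, 20, [5, 10])
```

Under these assumptions, rarer clusters get smaller hidden projections, which is where the parameter savings over a full softmax come from.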

Contributor

GGBond8488 Dec 7, 2023

The formatting here also seems to be off.

rfcs/APIs/20231202_api_design_for_AdaptiveLogSoftmaxWithLoss.md

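               # 6. Testing and acceptance considerations
               The test cases considered are:
-              - Numerical correctness
+              - Numerical correctness (CPU, GPU, dynamic graph, static graph)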

Contributor

GGBond8488 Dec 7, 2023

How do you plan to verify this correctness?

Contributor Author

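Patrick-Star125 Dec 8, 2023

As in torch, it will be verified against a computationally equivalent implementation. NumPy lacks some of the needed APIs, and this API has a lot of internal logic, so fully reproducing it in NumPy would be fairly tedious.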
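One way such a computational-equivalence check could look is sketched below in plain Python: a reference path that builds the log-probability of every class is compared with a fast path that evaluates only the cluster containing the label. This is an assumption-laden illustration, not the PR's actual test. The tail clusters act directly on the input (no low-rank projection), weights are random, and all function names are hypothetical.

```python
import math
import random

def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]

def matvec(weight, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(w_row, x)) for w_row in weight]

def full_log_prob(x, head_w, tail_ws, shortlist):
    """Reference path: log-probability of every class."""
    head_lp = log_softmax(matvec(head_w, x))
    out = head_lp[:shortlist]
    for i, tail_w in enumerate(tail_ws):
        cluster_lp = head_lp[shortlist + i]  # log P(cluster i)
        out += [cluster_lp + lp for lp in log_softmax(matvec(tail_w, x))]
    return out

def target_log_prob(x, label, head_w, tail_ws, bounds):
    """Fast path: evaluate only the cluster that contains `label`."""
    head_lp = log_softmax(matvec(head_w, x))
    shortlist = bounds[0]
    if label < shortlist:
        return head_lp[label]
    for i, tail_w in enumerate(tail_ws):
        if bounds[i] <= label < bounds[i + 1]:
            tail_lp = log_softmax(matvec(tail_w, x))
            return head_lp[shortlist + i] + tail_lp[label - bounds[i]]
    raise ValueError("label out of range")

random.seed(0)
in_features, bounds = 4, [3, 5, 8]  # 3 shortlist classes, clusters of 2 and 3
rand_matrix = lambda rows: [[random.gauss(0, 1) for _ in range(in_features)]
                            for _ in range(rows)]
head_w = rand_matrix(bounds[0] + 2)                  # shortlist + 2 cluster slots
tail_ws = [rand_matrix(bounds[1] - bounds[0]), rand_matrix(bounds[2] - bounds[1])]
x = [random.gauss(0, 1) for _ in range(in_features)]

full = full_log_prob(x, head_w, tail_ws, bounds[0])
max_diff = max(abs(full[c] - target_log_prob(x, c, head_w, tail_ws, bounds))
               for c in range(bounds[-1]))
```

If the two paths agree for every label and the reference probabilities sum to one, the fast path is consistent with the full distribution; a real test would run the same comparison against the Paddle layer's outputs.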


          resolve problems

3cdd12a

Contributor Author

Patrick-Star125 commented Dec 8, 2023

Done

GGBond8488 approved these changes

Contributor

GGBond8488 left a comment

LGTM

luotao1 approved these changes

luotao1 merged commit 8058019 into PaddlePaddle:master

1 check passed

luotao1 mentioned this pull request

【Hackathon 5th No.12】Add AdaptiveLogSoftmaxWithLoss API to Paddle #713

Closed
