
【Hackathon 6th No.8】Add the FeatureAlphaDropout API to Paddle #913

Merged (3 commits, Jun 13, 2024)

Conversation

@megemini (Contributor) commented on Jun 4, 2024:

PR Category

User Experience

PR Types

Docs

Description

NO.8 Add the FeatureAlphaDropout API to Paddle

RFC for the related API

Reference implementation: PaddlePaddle/Paddle#64881

paddle-bot (bot) commented on Jun 4, 2024:

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to the Template and Demo.

[truncated excerpt of PyTorch's `_dropout_impl` quoted from the RFC]

As you can see, since `feature alpha dropout` is a special case of `alpha dropout`, PyTorch implements both in the single function above. The only difference between them is that the `noise` for `feature alpha dropout` is generated by `make_feature_noise`, instead of being an empty tensor with the same shape as the input, as in `alpha dropout`.
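To make the shape difference concrete, here is a tiny Python illustration (hypothetical; this is not PyTorch's actual `make_feature_noise`, it only mirrors the shape logic):

```python
# Illustration only: feature-style noise keeps the batch and channel dims
# and broadcasts one Bernoulli sample over all remaining (spatial) dims.
def feature_noise_shape(input_shape):
    # e.g. (N, C, H, W) -> (N, C, 1, 1)
    return input_shape[:2] + (1,) * (len(input_shape) - 2)

assert feature_noise_shape((4, 3, 8, 8)) == (4, 3, 1, 1)
```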
@zxcd (Contributor) commented on Jun 5, 2024:

The path through `make_feature_noise` here corresponds to `feature_dropout`, not `feature alpha dropout`; this part needs to be re-investigated. Also, the claim that `feature alpha dropout` is a special case of `alpha dropout` is not accurate: `_dropout_impl` covers four kinds of dropout, and it is `alpha dropout` that is a special case of `dropout`. You need to take another look at the specific branch for `feature alpha dropout`.

@megemini (Contributor, Author) commented on Jun 5, 2024:

My earlier description was too loose; I wrote it under the premise that all we care about is implementing `feature_alpha_dropout`.

Sorry about that. Let me expand on it here; please help check whether the following is correct.

PyTorch's `feature_alpha_dropout` is one specific instantiation of `_dropout_impl` here:

aten/src/ATen/native/Dropout.cpp

#define ALIAS_SPECIALIZATION(ALIAS_NAME, IS_FEATURE, IS_ALPHA)                      \
template <bool inplace, typename... Args>                                           \
Ctype<inplace> ALIAS_NAME(Args&&... args) {                                         \
  return _dropout_impl<IS_FEATURE, IS_ALPHA, inplace>(std::forward<Args>(args)...); \
}

ALIAS_SPECIALIZATION(_dropout,               false, false)
ALIAS_SPECIALIZATION(_feature_dropout,       true,  false)
ALIAS_SPECIALIZATION(_alpha_dropout,         false, true )
ALIAS_SPECIALIZATION(_feature_alpha_dropout, true,  true )

In other words, `feature_alpha_dropout` can be seen as `dropout` + `alpha_dropout` + `feature_dropout`, or as `alpha_dropout` (which already subsumes `dropout`) + `feature_dropout`. Moreover, relative to `alpha_dropout`, `feature_alpha_dropout` merely applies the dropout over a more specific scope: `alpha_dropout` drops individual elements, while `feature_alpha_dropout` drops whole channels. That is why I said that `feature_alpha_dropout` is a special case of `alpha_dropout`.
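To spell out my reading of the four specializations, here is a minimal NumPy sketch (hypothetical; it mirrors only the `(IS_FEATURE, IS_ALPHA)` template flags, not the actual ATen code, and uses the published SELU constant for alpha'):

```python
import numpy as np

# Hypothetical mirror of _dropout_impl<IS_FEATURE, IS_ALPHA>; illustration only.
def dropout_impl(x, p, training=True, *, is_feature=False, is_alpha=False):
    if not training or p == 0.0:
        return x
    # Feature variants draw one Bernoulli sample per (sample, channel)
    # and broadcast it over the remaining spatial dims.
    shape = (x.shape[:2] + (1,) * (x.ndim - 2)) if is_feature else x.shape
    keep = (np.random.rand(*shape) > p).astype(x.dtype)
    if is_alpha:
        # Alpha variants set dropped units to alpha' = -scale * alpha of SELU,
        # then apply an affine map (a, b) that keeps mean and variance unchanged.
        alpha_p = -1.7580993408473766
        a = ((1 - p) * (1 + p * alpha_p ** 2)) ** -0.5
        b = -a * alpha_p * p
        return a * (x * keep + alpha_p * (1 - keep)) + b
    return x * keep / (1 - p)
```

With `is_feature=True, is_alpha=True` this is exactly the `_feature_alpha_dropout` row above, which is why I described it as alpha dropout applied per channel.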

When I said that

> the `noise` of `feature alpha dropout` is generated by `make_feature_noise`

I only cared about `feature alpha dropout` there, not `feature dropout`, so I treated the noise of `feature alpha dropout` as coming from `make_feature_noise`. That wording of mine was not rigorous.

Actually, what I care about more is whether the implementation plan in my document has any problems.

The plan in the document targets:

ALIAS_SPECIALIZATION(_dropout,               false, false)
ALIAS_SPECIALIZATION(_feature_dropout,       true,  false)
ALIAS_SPECIALIZATION(_alpha_dropout,         false, true )
ALIAS_SPECIALIZATION(_feature_alpha_dropout, true,  true )

Of these, it only involves the last two: it directly reuses Paddle's `alpha_dropout`, with only the `input_shape` differing.

I think that should work; is there any problem with it?

Thanks!
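For concreteness, a minimal sketch of the plan (hypothetical; the real code is in PaddlePaddle/Paddle#64881, and the function name and constants here are my assumptions):

```python
import paddle

# Hypothetical sketch: the alpha_dropout math with a per-channel mask.
def feature_alpha_dropout_sketch(x, p=0.5, training=True):
    if not training or p == 0.0:
        return x
    # Same mean/variance-preserving constants as alpha_dropout (SELU's alpha').
    alpha_p = -1.7580993408473766
    a = ((1 - p) * (1 + p * alpha_p ** 2)) ** -0.5
    b = -a * alpha_p * p
    # The only change versus alpha_dropout: one mask value per (N, C),
    # broadcast over the remaining dims.
    noise_shape = x.shape[:2] + [1] * (x.ndim - 2)
    keep = paddle.cast(paddle.rand(noise_shape) > p, x.dtype)
    return a * (x * keep + alpha_p * (1 - keep)) + b
```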


# VI. Considerations for Testing and Acceptance

- **Programming paradigm scenarios**
A reviewer (Contributor) commented:

Please add the storage path for the test files.

@megemini (Contributor, Author) replied:

Paddle currently puts `alpha_dropout` and the other dropout variants in `test/legacy_test/test_dropout_op.py`. I will handle it the same way here; is that OK?
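For illustration, a case there might look roughly like this (hypothetical; the test name and the final API path `paddle.nn.functional.feature_alpha_dropout` are assumptions):

```python
import unittest
import paddle

class TestFeatureAlphaDropout(unittest.TestCase):
    def test_eval_is_identity(self):
        # With training=False the op must return the input unchanged.
        x = paddle.rand([4, 3, 8, 8])
        y = paddle.nn.functional.feature_alpha_dropout(x, p=0.5, training=False)
        self.assertTrue(bool((x == y).all()))

if __name__ == "__main__":
    unittest.main()
```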


- **Input parameters**
  - Routinely cover default parameters, common parameters, and invalid parameters.
  - Common data types: float16, float32, or float64
A reviewer (Contributor) commented:

bfloat16 needs to be supported as well.

@megemini (Contributor, Author) replied:

I did a quick test; it should work:

In [14]: feature_alpha_dropout(i_paddle, 0.2, training=True)
Out[14]: 
Tensor(shape=[4, 3, 3], dtype=bfloat16, place=Place(gpu:0), stop_gradient=True,
       [[[ 0.80859375,  1.07812500,  0.83203125],
         [ 0.73437500,  0.33398438,  0.96484375],
         [ 0.72656250,  1.16406250,  0.54687500]],

        [[ 0.71093750,  0.50781250,  1.15625000],
         [-1.23437500, -1.23437500, -1.23437500],
         [-1.23437500, -1.23437500, -1.23437500]],

        [[ 0.34960938,  0.34375000,  0.98046875],
         [ 0.78125000,  0.83203125,  0.98828125],
         [ 0.64843750,  1.11718750,  0.37109375]],

        [[-1.23437500, -1.23437500, -1.23437500],
         [ 0.48437500,  0.31250000,  0.60156250],
         [ 0.81640625,  1.00781250,  1.09375000]]])
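Note how dropped channels collapse to a single constant (-1.23437500 here, i.e. a * alpha' + b for p=0.2) while kept channels are rescaled, which is the expected per-channel behavior. The thread does not show how the input was built; a hypothetical reconstruction:

```python
import paddle
from paddle.nn.functional import feature_alpha_dropout  # assumed import path

# Hypothetical setup; the actual i_paddle is not shown in the thread.
paddle.set_device("gpu")  # the output above shows Place(gpu:0)
i_paddle = paddle.rand([4, 3, 3]).astype(paddle.bfloat16)
```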

@megemini requested a review from @zxcd on June 5, 2024, 05:14.
@zxcd (Contributor) left a comment:

LGTM

@luotao1 merged commit d6d589c into PaddlePaddle:master on Jun 13, 2024 (1 check passed).