
Added fp32 / bf16 forward and backward elementwise_div_mkldnn operator #36158

Merged

Conversation

@ghost commented Sep 27, 2021

PR types

New features

PR changes

OPs

Describe

Adds elementwise_div fp32/bf16 FWD & BWD oneDNN op kernel.
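For reference, the math the kernel implements is straightforward. Below is a minimal numpy sketch of the forward and backward semantics (illustrative only; the actual kernel is a C++ oneDNN binary primitive, and these function names are hypothetical):

```python
import numpy as np

def elementwise_div_fwd(x, y):
    # Forward: out = x / y (elementwise)
    return x / y

def elementwise_div_bwd(x, y, dout):
    # Backward: d(x/y)/dx = 1/y and d(x/y)/dy = -x/y**2,
    # so dX = dOut / y and dY = -dOut * x / y**2.
    dx = dout / y
    dy = -dout * x / (y * y)
    return dx, dy
```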

@paddle-bot-old commented

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@ghost ghost changed the title [WIP] Add elementwise_div_mkldnn Added elementwise_div_mkldnn operator Sep 28, 2021
@ghost ghost marked this pull request as ready for review September 28, 2021 14:03
jczaja previously approved these changes Sep 28, 2021

@jczaja (Contributor) left a comment:

LGTM

@ghost (Author) commented Sep 30, 2021

@lanxianghit @phlrain Please review and approve.

@jakpiase (Contributor) left a comment:

Apart from some minor comments, really good job! You can also mention in the title/description that this kernel is registered for both forward and backward and supports both bf16 and fp32 dtypes.

Two review threads on paddle/fluid/platform/mkldnn_reuse.h (outdated, resolved)
self.out = np.divide(self.x, self.y)

def test_check_grad_normal(self):
pass
A contributor commented:

Why isn't the backward pass checked in this scenario?

@ghost (Author) commented Oct 1, 2021

The backward pass doesn't work correctly with broadcasting for the oneDNN elementwise_mul and elementwise_div operators. I consulted @jczaja about this, and he said these tests can be disabled for now. I've added a comment in the code noting that this should be looked into.
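For context on why broadcasting complicates the backward pass: when one input is broadcast, the upstream gradient has the broadcast shape and must be reduced back to the input's original shape. A minimal numpy sketch of that reduction step (illustrative, not the kernel code):

```python
import numpy as np

# y (shape (3,)) is broadcast against x (shape (2, 3)), so dOut has
# x's shape and dY must be summed over the broadcast axis to recover
# y's shape. A missing or wrong reduction here is exactly the kind of
# bug that makes broadcast gradient checks fail.
x = np.random.rand(2, 3).astype(np.float32) + 0.5
y = np.random.rand(3).astype(np.float32) + 0.5
dout = np.ones((2, 3), dtype=np.float32)

dx = dout / y                    # shape (2, 3), matches x
dy_full = -dout * x / (y * y)    # shape (2, 3), still broadcast
dy = dy_full.sum(axis=0)         # shape (3,), matches y
```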

@jakpiase jakpiase added the BF16 label Sep 30, 2021
self.out = np.divide(self.x, self.y)

def test_check_grad_normal(self):
self.check_grad(['X', 'Y'], 'Out', max_relative_error=0.05)
A contributor commented:

5% relative error is quite a lot; is oneDNN's precision really that bad?

@ghost (Author) commented:

The original elementwise_div tests also used 5%, so that's what I stuck with. In reality, the errors are just over 1%. I've changed the relative error to 2% to make this clearer.
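For reference, the tolerance is a relative-error bound along these lines (a simplified sketch of the idea, not OpTest's actual implementation):

```python
import numpy as np

def max_relative_error(numeric_grad, analytic_grad, eps=1e-6):
    # Largest |analytic - numeric| / max(|numeric|, eps) over all elements.
    abs_err = np.abs(analytic_grad - numeric_grad)
    return np.max(abs_err / np.maximum(np.abs(numeric_grad), eps))
```

Since bf16 stores only an 8-bit significand (unit roundoff around 0.4% per operation), gradient errors slightly above 1% are plausible, which makes 2% a reasonably tight bound where 5% would be loose.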

@ghost ghost changed the title Added elementwise_div_mkldnn operator Added fp32 / bf16 forward and backward elementwise_div_mkldnn operator Oct 1, 2021
@ghost (Author) commented Oct 1, 2021

@arlesniak Please review.

jakpiase previously approved these changes Oct 1, 2021

@jakpiase (Contributor) left a comment:

LGTM, really good job

jczaja previously approved these changes Oct 4, 2021
arlesniak previously approved these changes Oct 5, 2021

@arlesniak (Contributor) left a comment:

LGTM

@arlesniak (Contributor) commented:

@Xreki Hi, please take a look at this PR; it needs approval (PR-CI-APPROVAL).

@arlesniak (Contributor) commented:

@lidanqing-intel Could you be so kind as to help with approval for this PR, please?

lidanqing-intel previously approved these changes Oct 9, 2021

@lidanqing-intel (Contributor) left a comment:

LGTM

@paddle-bot-old commented

Sorry to inform you that dd1c0ab's CI passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.

@ghost dismissed stale reviews from lidanqing-intel, arlesniak, jczaja, and jakpiase via a910f52 ("Attempt to bypass CI Approval not accepting max_relative_error") on October 21, 2021
jczaja previously approved these changes Oct 21, 2021
@lidanqing-intel (Contributor) commented Oct 27, 2021

@piotrekobiIntel Hi, this CheckPRTemplate CI check failed; do you know why? Also, Jacek's approval was dismissed because something changed, so could you please ask for his approval again and ask him to merge? I have approved. Thanks.

@lidanqing-intel (Contributor) left a comment:

LGTM

jczaja merged commit e92e6b0 into PaddlePaddle:develop on Oct 27, 2021
ghost pushed a commit to piotrekobi/Paddle that referenced this pull request on Nov 3, 2021 (PaddlePaddle#36158). Commit message:

* Add WIP version of elementwise_div_mkldnn without working dy grad
* Add dy gradient calculation implementation, disable broadcast tests
* Readd removed tests from static_mode_white_list
* Add bfloat16 gradient tests, remove int8 and uint8 support
* Change the way dy grad is calculated to improve performance; refactor BinaryMKLDNNHandler to use a default parameter
* Change copyright year
* Refactor as suggested
* Attempt to bypass CI Approval not accepting max_relative_error
* Fix formatting issue