[XPU][Phi Kernel] xpu::nonzero support simulator XPUSIM_SKIP_RUN mode #60388
Conversation
Your PR was submitted successfully. Thank you for contributing to this open-source project!
0ad5720 to 31b4997
31b4997 to 938283f
<< "WARNING: In the simulator mode, the variable non_zero_cpu "
"stores an uninitialized value. To avoid allocating a memory of "
"random size, we assign numel to true_num_cpu";
non_zero_cpu = x.numel();
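A unit test for sigmoid_cross_entropy_with_logits is hard to write; I tried several approaches and none of them made it easy to invoke. This pattern has already been tested in other cases without problems.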
@@ -46,9 +46,8 @@ void NonZeroKernel(const Context& dev_ctx,
std::strcmp(std::getenv("XPUSIM_SKIP_RUN"), "1") == 0) {
VLOG(3) << "WARNING: In the simulator mode, the variable true_num_cpu "
"stores an uninitialized value. To avoid allocating a memory of "
"random size, we limit the value of true_num_cpu to the range 0 "
"<= true_num_cpu < numel";
true_num_cpu = std::min(std::max(true_num_cpu, 0), static_cast<int>(numel));
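In some cases the value would be set to 0, and downstream model code may then fail, so it is set to the maximum value instead. The unit test is in this PR: #60224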
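To make the trade-off in this comment concrete, here is a minimal sketch of the two fallbacks. The helper names ClampCount and WorstCaseCount are hypothetical and do not appear in the PR; only the clamp expression itself is taken from the quoted diff.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the PR.

// Earlier revision: clamp the uninitialized value into [0, numel].
// A negative or zero garbage value collapses to 0.
inline int ClampCount(int raw_count, int64_t numel) {
  return std::min(std::max(raw_count, 0), static_cast<int>(numel));
}

// Current revision: ignore the garbage value and use the upper bound.
inline int WorstCaseCount(int /*raw_count*/, int64_t numel) {
  return static_cast<int>(numel);
}
```

With a negative garbage value and numel equal to 8, ClampCount returns 0 while WorstCaseCount returns 8, so only the second variant guarantees a non-empty result for later kernels. The clamp also admits the value numel itself, although the log message quoted above states a strict upper bound.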
LGTM
VLOG(3) << "WARNING: In the simulator mode, the variable out_size_cpu "
"stores an uninitialized value. To avoid allocating a memory of "
"random size, we assign numel to out_size_cpu";
out_size_cpu = mask.numel();
The impact of hard-coding the maximum on downstream operators still needs to be evaluated when models are actually run.
This is equivalent to simulating the performance of the slowest scenario.
Performance is secondary; this may affect the sizes of the operators.
skip_run cannot obtain the real size, and the sizes of the nonzero_count-related operators may also differ from step to step.
Good point. When there is time, we can look at where this operator sits in the model and how far its return value propagates.
OK.
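The size concern raised in this thread can be illustrated with a small CPU-only sketch. It is not Paddle or XDNN code: CountNonZero and IndexShape are hypothetical helpers, and the output layout of one row of coordinates per non-zero element is an assumption made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical CPU reference, not Paddle code: count the non-zero entries.
inline int64_t CountNonZero(const std::vector<float>& x) {
  int64_t n = 0;
  for (float v : x) {
    if (v != 0.0f) ++n;
  }
  return n;
}

// Assumed output layout: one row of `rank` coordinates per non-zero element.
inline std::vector<int64_t> IndexShape(int64_t count, int64_t rank) {
  return {count, rank};
}
```

For a 1-D input of six elements with two non-zero entries, a real run would give an index shape of {2, 1}, whereas the skip-run fallback that substitutes numel gives {6, 1}. Every operator that consumes the indices then sees the larger shape, which is the effect the reviewers suggest evaluating on real models.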
LGTM
PR types
New features
PR changes
OPs
Description
xdnn::nonzero supports the simulator's XPUSIM_SKIP_RUN mode.
In this mode, true_num_cpu stores an uninitialized value. To avoid allocating memory of a random size, we assign numel to it.
Clamping with true_num_cpu = std::min(std::max(true_num_cpu, 0), static_cast<int>(numel)); may cause the following kernel to fail when true_num_cpu is set to 0.
Similar to PR: #60224.
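The diff hunks quoted in the conversation detect this mode by comparing std::getenv("XPUSIM_SKIP_RUN") with "1". The part of the condition that precedes the comparison is not visible in the excerpt, so the following is only a sketch of a null-safe form of that check. The helper name IsSimSkipRun is hypothetical and not taken from the PR.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper for illustration; not part of the PR.
// std::getenv returns nullptr when the variable is unset, and passing
// nullptr to std::strcmp is undefined behavior, so test the pointer first.
inline bool IsSimSkipRun() {
  const char* flag = std::getenv("XPUSIM_SKIP_RUN");
  return flag != nullptr && std::strcmp(flag, "1") == 0;
}
```

A kernel could call such a helper once before reading the device-side count and fall back to numel only when it returns true, leaving normal runs untouched.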