generate benchmark input in device #10109
Codecov Report
| Coverage Diff | branch-22.04 | #10109 | +/- |
|---|---|---|---|
| Coverage | 86.13% | 86.16% | +0.02% |
| Files | 139 | 139 | |
| Lines | 22438 | 22447 | +9 |
| Hits | 19328 | 19341 | +13 |
| Misses | 3110 | 3106 | -4 |
rerun tests
Aside from my request for documenting the normal/binomial approximation, I think this is good enough to merge on my end. Thanks again!
@karthikeyann pointed out that the binomial has been removed; I think I was being linked back to old versions of the code when verifying those parts. I think we're set.
Concern about the geometric distribution, plus some nitpicks.
[lower_bound, upper_bound, dist = make_normal_dist(diffType{0}, upper_bound - lower_bound)](
  thrust::minstd_rand& engine, size_t size) -> rmm::device_uvector<T> {
  rmm::device_uvector<T> result(size, rmm::cuda_stream_default);
  thrust::tabulate(thrust::device,
                   result.begin(),
                   result.end(),
                   abs_value_generator{lower_bound, upper_bound, engine, dist});
I don't think this emulates a geometric distribution. At least it didn't add up with some sample `lower_bound` and `upper_bound` values I tried on paper.
AFAICT we need a normal distribution with mean = 0, so we can use abs to make half of the bell. Then these values need to be moved/inverted so that the tip of the bell is at `lower_bound`, and probability falls towards `upper_bound`.
We can maybe leave this as TODO, but it might affect benchmarks in the meantime.
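For illustration, here is a minimal host-side sketch of the approach described above: draw from a zero-mean normal, fold it with `abs`, and shift it so the peak sits at `lower_bound`. This is not the code from the PR. The helper name `half_normal_samples`, the choice of standard deviation (range / 3), and the clamping at `upper_bound` are all assumptions made for the example, and the result is a continuous approximation rather than a true geometric distribution.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of cuDF): returns `size` values in
// [lower_bound, upper_bound] whose density peaks at lower_bound and
// falls off toward upper_bound.
inline std::vector<double> half_normal_samples(double lower_bound,
                                               double upper_bound,
                                               std::size_t size,
                                               unsigned seed = 42)
{
  assert(lower_bound < upper_bound);

  // Assumption: stddev = range / 3, so about 99.7% of folded draws fall
  // inside the range before clamping.
  double const stddev = (upper_bound - lower_bound) / 3.0;

  std::minstd_rand engine(seed);
  std::normal_distribution<double> dist(0.0, stddev);  // mean = 0

  std::vector<double> result(size);
  std::generate(result.begin(), result.end(), [&] {
    double const folded = std::abs(dist(engine));  // half of the bell, >= 0
    // Shift the tip of the bell to lower_bound; clamp the rare overshoot.
    return std::min(lower_bound + folded, upper_bound);
  });
  return result;
}
```

Calling `half_normal_samples(10.0, 20.0, 10000)` should give values that cluster near 10 and thin out toward 20. Clamping piles the small overshoot onto `upper_bound`; resampling would be an alternative if that matters for a benchmark.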
updated. added geometric distribution.
rerun tests
Thank you for addressing all review comments!
Looks 🔥 🔥
@gpucibot merge
Thank you @vuule, @vyasr and @davidwendt for reviewing this big PR! 💯
To speed up benchmark input generation, move all data generation to the device.
To address #5773 (comment)
This PR moves the random input generation to device.
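As a rough illustration of what index-based generation can look like, the sketch below mimics the shape of a callable passed to `thrust::tabulate`: each element is computed from its index alone, so elements can be produced independently and in parallel. It is a host-only stand-in written with the standard library, not the PR's device code. The names `indexed_uniform_generator` and `tabulate_host` are hypothetical, and the use of `discard` to give each index its own position in the random stream is an assumption about one common pattern, not a description of cuDF's generators.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical functor (not cuDF's actual generator): maps an element index
// to a random value in [lower_bound, upper_bound].
struct indexed_uniform_generator {
  std::minstd_rand engine;  // base engine; never mutated by operator()
  int lower_bound;
  int upper_bound;

  int operator()(std::size_t idx) const
  {
    assert(lower_bound <= upper_bound);
    std::minstd_rand local = engine;  // private copy for this element
    local.discard(idx);               // skip ahead to this element's draw
    std::uniform_int_distribution<int> dist(lower_bound, upper_bound);
    return dist(local);
  }
};

// Host stand-in for a tabulate-style fill: result[i] = gen(i).
inline std::vector<int> tabulate_host(std::size_t size,
                                      indexed_uniform_generator const& gen)
{
  std::vector<int> result(size);
  for (std::size_t i = 0; i < size; ++i) { result[i] = gen(i); }
  return result;
}
```

Because every element depends only on the base engine state and its own index, calling `tabulate_host` twice with the same generator yields identical output, which is convenient for reproducible benchmark inputs.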
The rest of the original work in this PR was split into multiple PRs and merged:
#10277
#10278
#10279
#10280
#10281
#10300
With all of these changes, a single iteration of all benchmarks runs in under 1000 seconds (down from 3067 s to 964 s, roughly 3.2x faster).
Running more iterations should show an even larger benefit, because the benchmark is restarted several times during a run and each restart calls the benchmark input generation code again.
closes #9857