Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generate benchmark input in device #10109

Merged
merged 119 commits into from
Mar 22, 2022

Conversation

karthikeyann
Copy link
Contributor

@karthikeyann karthikeyann commented Jan 24, 2022

To speedup generate benchmark input generation, move all data generation to device.
To address #5773 (comment)
This PR moves the random input generation to device.

Rest all of the original work in this PR was split to multiple PRs and merged.
#10277
#10278
#10279
#10280
#10281
#10300

With all of these changes, single iteration of all benchmark runs in <1000 seconds. (from 3067s to 964s).
Running more iterations would see higher benefit too because the benchmark is restarted several times during run which again calls benchmark input generation code.

closes #9857

@karthikeyann karthikeyann added feature request New feature or request 2 - In Progress Currently a work in progress tests Unit testing for project cuda libcudf Affects libcudf (C++/CUDA) code. Performance Performance related issue non-breaking Non-breaking change labels Jan 24, 2022
@karthikeyann karthikeyann self-assigned this Jan 24, 2022
@github-actions github-actions bot added the CMake CMake build issue label Jan 24, 2022
@codecov
Copy link

codecov bot commented Jan 24, 2022

Codecov Report

Merging #10109 (373b08d) into branch-22.04 (4596244) will increase coverage by 0.02%.
The diff coverage is 96.86%.

❗ Current head 373b08d differs from pull request most recent head 0810425. Consider uploading reports for the commit 0810425 to get more accurate results

@@               Coverage Diff                @@
##           branch-22.04   #10109      +/-   ##
================================================
+ Coverage         86.13%   86.16%   +0.02%     
================================================
  Files               139      139              
  Lines             22438    22447       +9     
================================================
+ Hits              19328    19341      +13     
+ Misses             3110     3106       -4     
Impacted Files Coverage Δ
python/dask_cudf/dask_cudf/tests/test_accessor.py 98.41% <ø> (ø)
python/cudf/cudf/core/_base_index.py 85.92% <50.00%> (-0.51%) ⬇️
python/cudf/cudf/core/column/decimal.py 91.30% <73.68%> (-1.01%) ⬇️
python/cudf/cudf/core/column/categorical.py 89.63% <84.61%> (-0.29%) ⬇️
python/cudf/cudf/core/column/string.py 88.91% <94.44%> (+0.64%) ⬆️
python/cudf/cudf/core/column/numerical.py 95.62% <95.83%> (+0.64%) ⬆️
python/cudf/cudf/core/frame.py 91.84% <96.42%> (+0.12%) ⬆️
python/cudf/cudf/_typing.py 94.11% <100.00%> (+0.78%) ⬆️
python/cudf/cudf/api/types.py 89.79% <100.00%> (ø)
python/cudf/cudf/core/column/column.py 89.27% <100.00%> (+0.10%) ⬆️
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4596244...0810425. Read the comment docs.

@karthikeyann karthikeyann requested review from vyasr and vuule March 16, 2022 00:31
@karthikeyann
Copy link
Contributor Author

rerun tests

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from my request for documenting the normal/binomial approximation, I think this is good enough to merge on my end. Thanks again!

@vyasr
Copy link
Contributor

vyasr commented Mar 16, 2022

@karthikeyann pointed out that the binomial has been removed, I think I was getting linked back to old versions of the code when verifying those parts of the code. I think we're set.

Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concern about the geometric distribution, plus some nitpicks.

cpp/benchmarks/common/generate_input.hpp Outdated Show resolved Hide resolved
cpp/benchmarks/common/generate_input.hpp Outdated Show resolved Hide resolved
cpp/benchmarks/common/random_distribution_factory.hpp Outdated Show resolved Hide resolved
Comment on lines 143 to 149
[lower_bound, upper_bound, dist = make_normal_dist(diffType{0}, upper_bound - lower_bound)](
thrust::minstd_rand& engine, size_t size) -> rmm::device_uvector<T> {
rmm::device_uvector<T> result(size, rmm::cuda_stream_default);
thrust::tabulate(thrust::device,
result.begin(),
result.end(),
abs_value_generator{lower_bound, upper_bound, engine, dist});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this emulates a geometric distribution. At least it didn't add up with some sample lower_bound
and upper_bound values I tried on paper.
AFAICT we need a normal distribution with mean = 0, so we can use abs to make half of the bell. Then these values need to be moved/inverted so that the tip of the bell is at lower_bound, and probability falls towards upper_bound.
We can maybe leave this as TODO, but it might affect benchmarks in the meantime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. added geometric distribution.

cpp/benchmarks/copying/contiguous_split.cu Outdated Show resolved Hide resolved
cpp/benchmarks/common/generate_input.cu Outdated Show resolved Hide resolved
cpp/benchmarks/common/generate_input.cu Outdated Show resolved Hide resolved
cpp/benchmarks/common/generate_input.cu Outdated Show resolved Hide resolved
cpp/benchmarks/common/generate_input.cu Outdated Show resolved Hide resolved
cpp/benchmarks/common/generate_input.cu Outdated Show resolved Hide resolved
@karthikeyann karthikeyann requested a review from vuule March 18, 2022 15:48
@karthikeyann
Copy link
Contributor Author

rerun tests

1 similar comment
@karthikeyann
Copy link
Contributor Author

rerun tests

Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing all review comments!
Looks 🔥 🔥

@karthikeyann karthikeyann requested a review from davidwendt March 22, 2022 12:19
@karthikeyann karthikeyann added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond labels Mar 22, 2022
@karthikeyann
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 76c772e into rapidsai:branch-22.04 Mar 22, 2022
@karthikeyann
Copy link
Contributor Author

Thank you @vuule, @vyasr and @davidwendt for reviewing this big PR! 💯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CMake CMake build issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Performance Performance related issue tests Unit testing for project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Improve performance of benchmark input generation
5 participants