Performance improvement for cudf::strings::like #13594

Merged
merged 3 commits into rapidsai:branch-23.08 from like-perf
Jun 23, 2023

Conversation

davidwendt
Contributor

Description

Minimizes character counting in the kernel logic for cudf::strings::like to improve overall performance, especially for longer strings.
The nvbench benchmark is updated to include measurements for various string sizes.
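
For readers unfamiliar with why character counting matters here: in UTF-8, finding the N-th character means scanning bytes from the start of the string, so a matcher that indexes by character position repeats that work. The sketch below is not the libcudf kernel from this PR. It is a minimal host-side illustration, assuming valid UTF-8 input and a LIKE dialect with only `%` and `_` (no escape character), of walking byte offsets and advancing by each character's byte width instead of counting characters. The names `utf8_width` and `like_match` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Bytes occupied by the UTF-8 character whose lead byte is `b`.
// Assumption: input is valid UTF-8; anything else is treated as 1 byte.
inline std::size_t utf8_width(unsigned char b)
{
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  if ((b & 0xF8) == 0xF0) return 4;
  return 1;
}

// Hypothetical helper: SQL LIKE with '%' (any run) and '_' (one character).
// Positions are byte offsets, so no character count is ever computed.
inline bool like_match(std::string_view target, std::string_view pattern)
{
  constexpr std::size_t npos = std::string_view::npos;
  std::size_t t = 0, p = 0;
  std::size_t retry_t = 0, retry_p = npos;  // resume point after the last '%'

  while (t < target.size()) {
    if (p < pattern.size() && pattern[p] == '%') {
      retry_p = ++p;  // continue the pattern just past '%'
      retry_t = t;
      continue;
    }
    if (p < pattern.size()) {
      // Width of the current target character, clamped to the string end.
      std::size_t tw = std::min(utf8_width(static_cast<unsigned char>(target[t])),
                                target.size() - t);
      if (pattern[p] == '_') {  // '_' consumes one whole character
        t += tw;
        ++p;
        continue;
      }
      std::size_t pw = utf8_width(static_cast<unsigned char>(pattern[p]));
      if (pw == tw && target.substr(t, tw) == pattern.substr(p, pw)) {
        t += tw;
        p += pw;
        continue;
      }
    }
    // Mismatch: let the last '%' absorb one more character, or fail.
    if (retry_p == npos) return false;
    retry_t = std::min(retry_t + utf8_width(static_cast<unsigned char>(target[retry_t])),
                       target.size());
    t = retry_t;
    p = retry_p;
  }
  // Only trailing '%' may remain in the pattern.
  while (p < pattern.size() && pattern[p] == '%') ++p;
  return p == pattern.size();
}
```

With this sketch, `like_match("h\xC3\xA9llo", "h_llo")` matches because `_` skips the two-byte `é` in one step, without first converting a character index into a byte offset.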

Reference #13048

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jun 20, 2023
@davidwendt davidwendt self-assigned this Jun 20, 2023
@davidwendt
Contributor Author

Benchmark results varying row count, string width, and hit rate

| width |   rows   | hit |   Ref Time |   New Time |          Diff |   %Diff |
|-------|----------|-----|------------|------------|---------------|---------|
|  32   |  32768   |  10 |  78.213 us |  75.738 us |     -2.476 us |  -3.17% |
|  64   |  32768   |  10 | 120.757 us | 120.155 us |     -0.602 us |  -0.50% |
|  128  |  32768   |  10 | 209.670 us | 199.459 us |    -10.210 us |  -4.87% |
|  256  |  32768   |  10 | 371.507 us | 342.417 us |    -29.090 us |  -7.83% |
|  512  |  32768   |  10 | 691.633 us | 617.029 us |    -74.605 us | -10.79% |
|  32   |  262144  |  10 | 165.541 us | 154.989 us |    -10.552 us |  -6.37% |
|  64   |  262144  |  10 | 295.358 us | 274.419 us |    -20.940 us |  -7.09% |
|  128  |  262144  |  10 |   1.137 ms |   1.134 ms |     -3.266 us |  -0.29% |
|  256  |  262144  |  10 |   4.339 ms |   3.917 ms |   -421.249 us |  -9.71% |
|  512  |  262144  |  10 |  11.860 ms |  10.295 ms |  -1564.471 us | -13.19% |
|  32   | 2097152  |  10 | 844.489 us | 779.649 us |    -64.840 us |  -7.68% |
|  64   | 2097152  |  10 |   1.661 ms |   1.537 ms |   -123.573 us |  -7.44% |
|  128  | 2097152  |  10 |   5.029 ms |   4.188 ms |   -840.908 us | -16.72% |
|  256  | 2097152  |  10 |  13.220 ms |  11.240 ms |  -1979.910 us | -14.98% |
|  512  | 2097152  |  10 |  39.562 ms |  36.477 ms |  -3085.220 us |  -7.80% |
|  32   | 16777216 |  10 |   6.572 ms |   6.020 ms |   -552.706 us |  -8.41% |
|  64   | 16777216 |  10 |  13.101 ms |  12.065 ms |  -1035.502 us |  -7.90% |
|  32   |  32768   |  25 |  78.827 us |  75.234 us |     -3.593 us |  -4.56% |
|  64   |  32768   |  25 | 122.831 us | 120.722 us |     -2.109 us |  -1.72% |
|  128  |  32768   |  25 | 215.727 us | 202.329 us |    -13.398 us |  -6.21% |
|  256  |  32768   |  25 | 383.812 us | 349.003 us |    -34.809 us |  -9.07% |
|  512  |  32768   |  25 | 715.769 us | 628.885 us |    -86.883 us | -12.14% |
|  32   |  262144  |  25 | 168.246 us | 156.646 us |    -11.600 us |  -6.89% |
|  64   |  262144  |  25 | 302.123 us | 277.466 us |    -24.657 us |  -8.16% |
|  128  |  262144  |  25 |   1.353 ms |   1.337 ms |    -15.528 us |  -1.15% |
|  256  |  262144  |  25 |   4.583 ms |   4.499 ms |    -83.384 us |  -1.82% |
|  512  |  262144  |  25 |  13.948 ms |  13.024 ms |   -924.393 us |  -6.63% |
|  32   | 2097152  |  25 | 867.169 us | 801.855 us |    -65.314 us |  -7.53% |
|  64   | 2097152  |  25 |   1.733 ms |   1.593 ms |   -140.093 us |  -8.08% |
|  128  | 2097152  |  25 |   5.695 ms |   4.918 ms |   -777.188 us | -13.65% |
|  256  | 2097152  |  25 |  15.694 ms |  13.211 ms |  -2483.461 us | -15.82% |
|  512  | 2097152  |  25 |  48.404 ms |  44.897 ms |  -3507.663 us |  -7.25% |
|  32   | 16777216 |  25 |   6.682 ms |   6.209 ms |   -473.098 us |  -7.08% |
|  64   | 16777216 |  25 |  13.434 ms |  12.489 ms |   -945.664 us |  -7.04% |
|  32   |  32768   |  70 |  76.165 us |  73.003 us |     -3.162 us |  -4.15% |
|  64   |  32768   |  70 | 115.136 us | 113.413 us |     -1.723 us |  -1.50% |
|  128  |  32768   |  70 | 214.049 us | 183.532 us |    -30.517 us | -14.26% |
|  256  |  32768   |  70 | 384.011 us | 311.740 us |    -72.271 us | -18.82% |
|  512  |  32768   |  70 | 722.814 us | 561.315 us |   -161.499 us | -22.34% |
|  32   |  262144  |  70 | 152.537 us | 139.061 us |    -13.476 us |  -8.83% |
|  64   |  262144  |  70 | 291.608 us | 243.557 us |    -48.052 us | -16.48% |
|  128  |  262144  |  70 |   2.519 ms |   2.290 ms |   -228.770 us |  -9.08% |
|  256  |  262144  |  70 |   8.667 ms |   7.930 ms |   -737.450 us |  -8.51% |
|  512  |  262144  |  70 |  19.293 ms |  16.645 ms |  -2647.950 us | -13.73% |
|  32   | 2097152  |  70 | 732.142 us | 669.819 us |    -62.322 us |  -8.51% |
|  64   | 2097152  |  70 |   1.682 ms |   1.343 ms |   -339.509 us | -20.18% |
|  128  | 2097152  |  70 |  16.071 ms |  13.681 ms |  -2390.010 us | -14.87% |
|  256  | 2097152  |  70 |  56.743 ms |  47.301 ms |  -9442.141 us | -16.64% |
|  512  | 2097152  |  70 | 123.329 ms | 106.227 ms | -17102.757 us | -13.87% |
|  32   | 16777216 |  70 |   5.562 ms |   5.095 ms |   -466.818 us |  -8.39% |
|  64   | 16777216 |  70 |  13.201 ms |  10.378 ms |  -2823.075 us | -21.38% |
|  32   |  32768   | 100 |  71.013 us |  68.261 us |     -2.752 us |  -3.88% |
|  64   |  32768   | 100 | 113.338 us | 102.985 us |    -10.353 us |  -9.13% |
|  128  |  32768   | 100 | 217.440 us | 174.685 us |    -42.755 us | -19.66% |
|  256  |  32768   | 100 | 387.813 us | 303.244 us |    -84.569 us | -21.81% |
|  512  |  32768   | 100 | 737.132 us | 559.970 us |   -177.162 us | -24.03% |
|  32   |  262144  | 100 | 145.043 us | 132.511 us |    -12.532 us |  -8.64% |
|  64   |  262144  | 100 | 299.682 us | 238.776 us |    -60.905 us | -20.32% |
|  128  |  262144  | 100 |   3.200 ms |   2.875 ms |   -324.994 us | -10.16% |
|  256  |  262144  | 100 |   9.942 ms |   8.653 ms |  -1288.585 us | -12.96% |
|  512  |  262144  | 100 |  23.550 ms |  20.450 ms |  -3100.421 us | -13.17% |
|  32   | 2097152  | 100 | 692.606 us | 638.519 us |    -54.087 us |  -7.81% |
|  64   | 2097152  | 100 |   1.726 ms |   1.312 ms |   -413.838 us | -23.97% |
|  128  | 2097152  | 100 |  22.715 ms |  19.537 ms |  -3177.731 us | -13.99% |
|  256  | 2097152  | 100 |  65.925 ms |  55.736 ms | -10188.431 us | -15.45% |
|  512  | 2097152  | 100 | 161.208 ms | 138.440 ms | -22767.270 us | -14.12% |
|  32   | 16777216 | 100 |   5.266 ms |   4.845 ms |   -421.045 us |  -8.00% |
|  64   | 16777216 | 100 |  13.607 ms |  10.156 ms |  -3450.858 us | -25.36% |
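
The %Diff column appears consistent with (New Time - Ref Time) / Ref Time × 100, so negative values mean the new code is faster. A small sketch for recomputing it from the table follows. The function name `percent_diff` is hypothetical, and both times must be in the same unit.

```cpp
#include <cmath>

// Relative change of new_time versus ref_time, in percent.
// Negative results indicate a speedup. Both arguments share one unit.
inline double percent_diff(double ref_time, double new_time)
{
  return (new_time - ref_time) / ref_time * 100.0;
}
```

For example, the first row (78.213 us to 75.738 us) gives roughly -3.2%, in line with the listed -3.17% once the rounding of the displayed times is taken into account.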

@davidwendt davidwendt changed the title Performance improvement for cudf::strings::like Performance improvement for cudf::strings::like Jun 21, 2023
@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Jun 21, 2023
@davidwendt davidwendt marked this pull request as ready for review June 21, 2023 16:20
@davidwendt davidwendt requested a review from a team as a code owner June 21, 2023 16:20
@davidwendt
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 6aad528 into rapidsai:branch-23.08 Jun 23, 2023
@davidwendt davidwendt deleted the like-perf branch June 23, 2023 12:12
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change