Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve boolean parsing performance #11308

Merged

Conversation

ketanv3
Copy link
Contributor

@ketanv3 ketanv3 commented Nov 22, 2023

Description

Bunch of modernization changes to the Booleans helper class.

  • Improved the performance of parseBoolean and isBoolean when operating on char arrays.
  • Avoid unnecessary String to char[] conversion in the BooleanFieldMapper.
  • Removed deprecated/unused methods and their tests.
  • Moved UTs to the correct module.
  • Updated docs.

Micro-benchmarks

Booleans::isBoolean(char[], int, int)

Each operation represents parsing the following values: ["11", "00", "sdfsdfsf", "F", "T", "on", "off", "yes", "no", "0", "1", "True", "False", "true", "false", ""]. The new method is 6.73x faster and has zero allocations.

Benchmark                                                     Mode  Cnt         Score   Error   Units
BooleansBenchmark.baseline                                   thrpt       12217820.930           ops/s
BooleansBenchmark.baseline:·gc.alloc.rate                    thrpt           3812.499          MB/sec
BooleansBenchmark.baseline:·gc.alloc.rate.norm               thrpt            360.036            B/op
BooleansBenchmark.baseline:·gc.churn.G1_Eden_Space           thrpt           3797.707          MB/sec
BooleansBenchmark.baseline:·gc.churn.G1_Eden_Space.norm      thrpt            358.639            B/op
BooleansBenchmark.baseline:·gc.churn.G1_Survivor_Space       thrpt              0.014          MB/sec
BooleansBenchmark.baseline:·gc.churn.G1_Survivor_Space.norm  thrpt              0.001            B/op
BooleansBenchmark.baseline:·gc.count                         thrpt             95.000          counts
BooleansBenchmark.baseline:·gc.time                          thrpt             63.000              ms
BooleansBenchmark.candidate                                  thrpt       82278826.155           ops/s
BooleansBenchmark.candidate:·gc.alloc.rate                   thrpt             ≈ 10⁻⁴          MB/sec
BooleansBenchmark.candidate:·gc.alloc.rate.norm              thrpt             ≈ 10⁻⁶            B/op
BooleansBenchmark.candidate:·gc.count                        thrpt                ≈ 0          counts

Booleans::parseBoolean(char[], int, int, bool)

Each operation represents parsing the following values: ["true", "false", ""]. The new method is 6.93x faster and has zero allocations for valid inputs. For invalid inputs, it will perform some allocations to throw an exception (same as before).

Benchmark                                                     Mode  Cnt          Score   Error   Units
BooleansBenchmark.baseline                                   thrpt        40693577.768           ops/s
BooleansBenchmark.baseline:·gc.alloc.rate                    thrpt            3386.168          MB/sec
BooleansBenchmark.baseline:·gc.alloc.rate.norm               thrpt              96.010            B/op
BooleansBenchmark.baseline:·gc.churn.G1_Eden_Space           thrpt            3396.792          MB/sec
BooleansBenchmark.baseline:·gc.churn.G1_Eden_Space.norm      thrpt              96.311            B/op
BooleansBenchmark.baseline:·gc.churn.G1_Survivor_Space       thrpt               0.018          MB/sec
BooleansBenchmark.baseline:·gc.churn.G1_Survivor_Space.norm  thrpt               0.001            B/op
BooleansBenchmark.baseline:·gc.count                         thrpt              85.000          counts
BooleansBenchmark.baseline:·gc.time                          thrpt              57.000              ms
BooleansBenchmark.candidate                                  thrpt       281913527.179           ops/s
BooleansBenchmark.candidate:·gc.alloc.rate                   thrpt              ≈ 10⁻⁴          MB/sec
BooleansBenchmark.candidate:·gc.alloc.rate.norm              thrpt              ≈ 10⁻⁷            B/op
BooleansBenchmark.candidate:·gc.count                        thrpt                 ≈ 0          counts

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

github-actions bot commented Nov 22, 2023

Compatibility status:

Checks if related components are compatible with change 4aec65c

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git]

Copy link
Contributor

❕ Gradle check result for 569d8bb: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Nov 22, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (72e63f2) 71.38% compared to head (4aec65c) 71.31%.
Report is 1 commits behind head on main.

Files Patch % Lines
.../src/main/java/org/opensearch/common/Booleans.java 76.19% 0 Missing and 5 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11308      +/-   ##
============================================
- Coverage     71.38%   71.31%   -0.08%     
+ Complexity    59007    58903     -104     
============================================
  Files          4890     4890              
  Lines        277432   277408      -24     
  Branches      40313    40298      -15     
============================================
- Hits         198056   197829     -227     
- Misses        62913    63120     +207     
+ Partials      16463    16459       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ketanv3 ketanv3 force-pushed the performance/boolean-parsing branch from 569d8bb to 714f320 Compare November 23, 2023 07:31
Copy link
Contributor

❕ Gradle check result for 714f320: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testRestartPrimary_NoReplicas

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Signed-off-by: Ketan Verma <ketan9495@gmail.com>
@ketanv3 ketanv3 force-pushed the performance/boolean-parsing branch from 714f320 to 4aec65c Compare November 23, 2023 08:43
Copy link
Contributor

❕ Gradle check result for 4aec65c: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Member

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing checks are inspected and point to the corresponding known issue(s)

Could you please investigate the repeated build failures and come back with updates on the issues that impacted this PR?

CHANGELOG.md Show resolved Hide resolved
@peternied peternied self-assigned this Nov 25, 2023
@ketanv3
Copy link
Contributor Author

ketanv3 commented Nov 26, 2023

@peternied

Could you please investigate the repeated build failures and come back with updates on the issues that impacted this PR?

@peternied peternied merged commit 26a1439 into opensearch-project:main Nov 27, 2023
116 checks passed
@peternied peternied added the backport 2.x Backport to 2.x branch label Nov 27, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 27, 2023
Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
(cherry picked from commit 26a1439)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross pushed a commit that referenced this pull request Nov 30, 2023
Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
(cherry picked from commit 26a1439)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
kotwanikunal pushed a commit that referenced this pull request Nov 30, 2023
(cherry picked from commit 26a1439)

Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
fahadshamiinsta pushed a commit to fahadshamiinsta/OpenSearch270 that referenced this pull request Dec 4, 2023
Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Dec 11, 2023
Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Signed-off-by: Ketan Verma <ketan9495@gmail.com>
Signed-off-by: Ketan Verma <9292653+ketanv3@users.noreply.github.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants