[BUG] Unable to apply search_backpressure cluster settings #16432

Closed
andrross opened this issue Oct 22, 2024 · 3 comments
Labels
bug Something isn't working Search Search query, autocomplete ...etc untriaged

Comments

@andrross
Member

Describe the bug

Attempting to apply search_backpressure.search_shard_task.cancellation_burst or search_backpressure.search_task.cancellation_burst settings results in the cluster manager getting stuck in a loop emitting an error that says "java.lang.IllegalArgumentException: rate must be greater than zero".

Related component

Search

To Reproduce

Download the OpenSearch 2.17 min distribution tarball, extract it, then launch the process with ./bin/opensearch. Send the following command:

PUT {{endpoint}}/_cluster/settings
{
    "persistent" :{
        "search_backpressure.search_shard_task.cancellation_burst": 5.0,
        "search_backpressure.search_task.cancellation_burst": 5.0
    },
    "transient" :{
        "search_backpressure.search_shard_task.cancellation_burst": 5.0,
        "search_backpressure.search_task.cancellation_burst": 5.0
    }
}
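For anyone who prefers to script the reproduction, here is a minimal sketch that sends the same request with the JDK's built-in HTTP client. The class name `ApplyBackpressureSettings` is hypothetical, and the default endpoint `http://localhost:9200` is an assumption about a local node started without the security plugin; substitute whatever `{{endpoint}}` stands for in your setup.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class ApplyBackpressureSettings {
    // Same JSON body as the request above.
    static final String BODY =
        "{"
        + "\"persistent\":{"
        + "\"search_backpressure.search_shard_task.cancellation_burst\":5.0,"
        + "\"search_backpressure.search_task.cancellation_burst\":5.0},"
        + "\"transient\":{"
        + "\"search_backpressure.search_shard_task.cancellation_burst\":5.0,"
        + "\"search_backpressure.search_task.cancellation_burst\":5.0}"
        + "}";

    // Builds the PUT request without sending it.
    static HttpRequest buildRequest(String endpoint) {
        return HttpRequest.newBuilder()
            .uri(URI.create(endpoint + "/_cluster/settings"))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed default endpoint; pass a different one as the first argument.
        String endpoint = args.length > 0 ? args[0] : "http://localhost:9200";
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(endpoint), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Running `main` against an affected node should be enough to trigger the error loop described next.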

The cluster manager gets stuck in a loop continually printing the following error:

[2024-10-22T17:04:48,028][WARN ][o.o.c.s.ClusterApplierService] [ip-172-31-30-74] failed to apply updated cluster state in [0s]:
version [12], uuid [Ijs3Brl0S-iEtMlifjtcmA], source [becoming candidate: clusterApplier#onNewClusterState]
org.opensearch.OpenSearchException: java.lang.IllegalArgumentException: rate must be greater than zero
        at org.opensearch.ExceptionsHelper.maybeThrowRuntimeAndSuppress(ExceptionsHelper.java:209) ~[opensearch-core-2.17.0.jar:2.17.0]
        at org.opensearch.search.backpressure.settings.SearchShardTaskSettings.notifyListeners(SearchShardTaskSettings.java:270) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.search.backpressure.settings.SearchShardTaskSettings.setCancellationBurst(SearchShardTaskSettings.java:252) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.settings.Setting$Updater.apply(Setting.java:1257) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.settings.AbstractScopedSettings$SettingUpdater.lambda$updater$0(AbstractScopedSettings.java:696) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.settings.AbstractScopedSettings.applySettings(AbstractScopedSettings.java:232) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:575) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:503) [opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:205) [opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:946) [opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:283) [opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:246) [opensearch-2.17.0.jar:2.17.0]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.lang.IllegalArgumentException: rate must be greater than zero
        at org.opensearch.common.util.TokenBucket.<init>(TokenBucket.java:52) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.common.util.TokenBucket.<init>(TokenBucket.java:47) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.search.backpressure.SearchBackpressureState.onRateChanged(SearchBackpressureState.java:95) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.search.backpressure.SearchBackpressureState.onBurstChanged(SearchBackpressureState.java:101) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.search.backpressure.settings.SearchShardTaskSettings.lambda$setCancellationBurst$2(SearchShardTaskSettings.java:252) ~[opensearch-2.17.0.jar:2.17.0]
        at org.opensearch.search.backpressure.settings.SearchShardTaskSettings.notifyListeners(SearchShardTaskSettings.java:264) ~[opensearch-2.17.0.jar:2.17.0]
        ... 13 more
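Reading the trace bottom-up: the burst-changed listener calls the rate-changed handler, which constructs a new TokenBucket, and the constructor rejects the rate it is given. The sketch below illustrates that general pattern only. `SketchTokenBucket` and `SketchState` are hypothetical stand-ins, not the OpenSearch classes, and the zero rate is contrived to trip the check; this issue does not establish what value the real code passed.

```java
// Hypothetical stand-in for a token bucket that validates its arguments.
final class SketchTokenBucket {
    final double rate;   // tokens added per unit of time
    final double burst;  // maximum tokens the bucket can hold

    SketchTokenBucket(double rate, double burst) {
        if (rate <= 0.0) {
            throw new IllegalArgumentException("rate must be greater than zero");
        }
        if (burst <= 0.0) {
            throw new IllegalArgumentException("burst must be greater than zero");
        }
        this.rate = rate;
        this.burst = burst;
    }
}

// Hypothetical state holder that rebuilds its bucket when the burst changes.
final class SketchState {
    private final double rate;
    private SketchTokenBucket bucket;

    SketchState(double rate) {
        this.rate = rate; // not validated here, only when a bucket is built
    }

    void onBurstChanged(double newBurst) {
        // Rebuilding re-runs the constructor checks, so a non-positive
        // rate surfaces here, in the middle of applying a settings update.
        bucket = new SketchTokenBucket(rate, newBurst);
    }

    SketchTokenBucket bucket() {
        return bucket;
    }

    public static void main(String[] args) {
        SketchState healthy = new SketchState(0.003);
        healthy.onBurstChanged(5.0); // succeeds

        SketchState broken = new SketchState(0.0); // contrived bad rate
        try {
            broken.onBurstChanged(5.0);
        } catch (IllegalArgumentException e) {
            System.out.println("update rejected: " + e.getMessage());
        }
    }
}
```

Because the exception in this pattern is thrown from inside a settings-update listener, it reaches whatever is applying the cluster state, which is consistent with the `ClusterApplierService` warning above.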

Expected behavior

Settings are applied without error.

Additional Details

No response

@andrross andrross added bug Something isn't working untriaged labels Oct 22, 2024
@github-actions github-actions bot added the Search Search query, autocomplete ...etc label Oct 22, 2024
@andrross
Member Author

When attempting to reproduce this in a dev environment using ./gradlew run, it succeeds on the main branch. However, it fails on the 2.x branch:

» [2024-10-22T17:46:55,088][INFO ][o.o.c.s.ClusterSettings  ] [runTask-0] updating [search_backpressure.search_shard_task.cancellation_burst] from [10.0] to [5.0]
» [2024-10-22T17:46:55.523192795Z] [BUILD] Stopping node

=== Standard error of node `node{::runTask-0}` ===
»   ↓ last 40 non error or warning messages from /home/ubuntu/OpenSearch-2.x/build/testclusters/runTask-0/logs/opensearch.stderr.log ↓
» WARNING: Using incubator modules: jdk.incubator.vector
»  WARNING: A terminally deprecated method in java.lang.System has been called
»  WARNING: System::setSecurityManager has been called by org.opensearch.bootstrap.OpenSearch (file:/home/ubuntu/OpenSearch-2.x/distribution/archives/linux-arm64-tar/build/install/opensearch-2.18.0-SNAPSHOT/lib/opensearch-2.18.0-SNAPSHOT.jar)
»  WARNING: Please consider reporting this to the maintainers of org.opensearch.bootstrap.OpenSearch
»  WARNING: System::setSecurityManager will be removed in a future release
»  Oct 22, 2024 5:46:43 PM sun.util.locale.provider.LocaleProviderAdapter <clinit>
»  WARNING: COMPAT locale provider will be removed in a future release
»  WARNING: A terminally deprecated method in java.lang.System has been called
»  WARNING: System::setSecurityManager has been called by org.opensearch.bootstrap.Security (file:/home/ubuntu/OpenSearch-2.x/distribution/archives/linux-arm64-tar/build/install/opensearch-2.18.0-SNAPSHOT/lib/opensearch-2.18.0-SNAPSHOT.jar)
»  WARNING: Please consider reporting this to the maintainers of org.opensearch.bootstrap.Security
»  WARNING: System::setSecurityManager will be removed in a future release
»  fatal error in thread [opensearch[runTask-0][clusterApplierService#updateTask][T#1]], exiting
»  java.lang.AssertionError
»       at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:542)
»       at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:205)
»       at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:946)
»       at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:283)
»       at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:246)
»       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
»       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
»       at java.base/java.lang.Thread.run(Thread.java:1583)

@andrross
Member Author

I think this issue was fixed by #15501.

That fix appears to have been backported, yet I was still able to reproduce the problem on the 2.x branch as of today.

@andrross
Member Author

I was testing against a stale commit on the 2.x branch. Confirmed that #15501 does in fact fix this issue. Resolving.

@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Search Project Board Oct 22, 2024