[CI] ManyShardsIT testConcurrentQueries failing #103128

Closed
kkrik-es opened this issue Dec 7, 2023 · 3 comments · Fixed by #103159
Labels
:Analytics/ES|QL (AKA ESQL); medium-risk (An open issue or test failure that is a medium risk to future releases); Team:QL (Deprecated) (Meta label for query languages team); >test-failure (Triaged test failures from CI)

Comments

kkrik-es (Contributor) commented Dec 7, 2023

Build scan:
https://gradle-enterprise.elastic.co/s/qmkz55xcjp3qw/tests/:x-pack:plugin:esql:internalClusterTest/org.elasticsearch.xpack.esql.action.ManyShardsIT/testConcurrentQueries

Reproduction line:

./gradlew ':x-pack:plugin:esql:internalClusterTest' --tests "org.elasticsearch.xpack.esql.action.ManyShardsIT.testConcurrentQueries" -Dtests.seed=E5341DA56A124E35 -Dtests.locale=und -Dtests.timezone=America/Godthab -Druntime.java=17

Applicable branches:
8.11

Reproduces locally?:
Didn't try

Failure history:
Failure dashboard for org.elasticsearch.xpack.esql.action.ManyShardsIT#testConcurrentQueries

Failure excerpt:

com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=87, name=Thread-24, state=RUNNABLE, group=TGRP-ManyShardsIT]

  at __randomizedtesting.SeedInfo.seed([E5341DA56A124E35:A7AFFE80EA753B14]:0)

  Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of TimedRunnable{original=processing of [1485][indices:data/read/esql/compute]: org.elasticsearch.compute.operator.DriverTaskRunner$DriverRequest/zV9Lgz8lTT6s6HTfJnyT2w:616, creationTimeNanos=1077295067581, startTimeNanos=0, finishTimeNanos=-1, failedOrRejected=false} on TaskExecutionTimeTrackingEsThreadPoolExecutor[name = node_s0/esql, queue capacity = 1000, task execution EWMA = 1.7ms, total task execution time = 1.4s, org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@16067837[Running, pool size = 1, active threads = 1, queued tasks = 1000, completed tasks = 139]]

    at __randomizedtesting.SeedInfo.seed([E5341DA56A124E35]:0)
    at org.elasticsearch.common.util.concurrent.EsRejectedExecutionHandler.newRejectedException(EsRejectedExecutionHandler.java:40)
    at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:34)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1365)
    at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:72)
    at org.elasticsearch.transport.TransportService.sendLocalRequest(TransportService.java:1024)
    at org.elasticsearch.transport.TransportService$3.sendRequest(TransportService.java:142)
    at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:951)
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:846)
    at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:910)
    at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:887)
    at org.elasticsearch.compute.operator.DriverTaskRunner$1.start(DriverTaskRunner.java:49)
    at org.elasticsearch.compute.operator.DriverRunner.runToCompletion(DriverRunner.java:90)
    at org.elasticsearch.compute.operator.DriverTaskRunner.executeDrivers(DriverTaskRunner.java:65)
    at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:292)
    at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestHandler.lambda$messageReceived$5(ComputeService.java:458)
    at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:177)
    at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:305)
    at org.elasticsearch.xpack.esql.plugin.ComputeService.lambda$acquireSearchContexts$15(ComputeService.java:319)
    at org.elasticsearch.index.shard.IndexShard.ensureShardSearchActive(IndexShard.java:3932)
    at org.elasticsearch.xpack.esql.plugin.ComputeService.acquireSearchContexts(ComputeService.java:317)
    at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestHandler.messageReceived(ComputeService.java:456)
    at org.elasticsearch.xpack.esql.plugin.ComputeService$DataNodeRequestHandler.messageReceived(ComputeService.java:448)
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
    at org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:288)
    at org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:301)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
    at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(Thread.java:833)
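
The executor state in the excerpt above (pool size = 1, queued tasks = 1000, queue capacity = 1000) suggests the rejection can be reproduced outside Elasticsearch with a plain `java.util.concurrent.ThreadPoolExecutor`. The sketch below is a standalone illustration, not Elasticsearch code: `QueueRejectionDemo` and `countRejections` are hypothetical names, and the JDK's `AbortPolicy` stands in for `EsAbortPolicy` as an assumption about equivalent behavior.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only mimics the pool shape reported in the failure.
final class QueueRejectionDemo {

    /** Submits {@code tasks} blocking tasks and returns how many the pool rejected. */
    static int countRejections(int poolSize, int queueCapacity, int tasks) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            new ThreadPoolExecutor.AbortPolicy()); // throws when the queue is full
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until released, so workers stay busy
                    // and later submissions pile up in the queue.
                    executor.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown();
            executor.shutdown();
            try {
                executor.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // One worker plus a 1000-slot queue holds 1001 tasks; the rest are rejected.
        System.out.println(countRejections(1, 1000, 1005));
    }
}
```

With one worker thread and a 1000-slot queue, the pool accepts 1001 blocking tasks (one running, 1000 queued), so every submission beyond that is rejected.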

kkrik-es added the :Analytics/Compute Engine, >test-failure, and Team:Analytics labels Dec 7, 2023
elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-analytics-geo (Team:Analytics)

kkrik-es added the Team:QL (Deprecated) and :Analytics/ES|QL labels and removed the Team:Analytics and :Analytics/Compute Engine labels Dec 7, 2023
elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-ql (Team:QL)

elasticsearchmachine (Collaborator) commented

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

dnhatn self-assigned this Dec 7, 2023
dnhatn added the medium-risk label and removed the blocker label Dec 7, 2023
dnhatn added a commit that referenced this issue Dec 12, 2023
This test failed when run with a single CPU. To prevent failures in
such cases, we should lower the concurrency level so that the test
doesn't spawn more than 1000 tasks.

Closes #103128
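
One way to read the fix described in this commit message is as a budget: cap the number of concurrent queries so that the tasks they spawn together fit within the queue capacity of 1000. The sketch below illustrates that arithmetic only and is not taken from the actual commit. `ConcurrencyCap`, `maxConcurrentQueries`, and the `tasksPerQuery` estimate (for example, one task per shard) are all assumptions introduced for illustration.

```java
// Hypothetical helper; names and the per-query task estimate are assumptions.
final class ConcurrencyCap {

    /**
     * Returns how many queries may run at once so that
     * queries * tasksPerQuery does not exceed queueCapacity.
     * Always allows at least one query.
     */
    static int maxConcurrentQueries(int requestedQueries, int tasksPerQuery, int queueCapacity) {
        if (requestedQueries <= 0 || tasksPerQuery <= 0 || queueCapacity <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Integer division rounds down, keeping the total at or below capacity.
        int budget = Math.max(1, queueCapacity / tasksPerQuery);
        return Math.min(requestedQueries, budget);
    }

    public static void main(String[] args) {
        // 10 requested queries at 200 tasks each against a 1000-slot queue.
        System.out.println(maxConcurrentQueries(10, 200, 1000));
    }
}
```

A test could pass the result as its thread count instead of a fixed or randomly chosen value, which keeps the load within the queue even when the pool has a single worker.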
dnhatn added two commits to dnhatn/elasticsearch that referenced this issue Dec 12, 2023, and elasticsearchmachine pushed two commits that referenced this issue Dec 12, 2023. All four repeat the commit message above; the fork commits end with "Closes elastic#103128", and the last pushed commit adds:

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>