ZookeeperCacheTest gets stuck indefinitely #1338

Closed
merlimat opened this issue Mar 5, 2018 · 2 comments · Fixed by #7844

Comments


merlimat commented Mar 5, 2018

Over the past few days, I have noticed that a few tests have started to get stuck, and the build times out after 100 minutes. This happens very frequently and was not happening before, so I suspect it is related to some recent changes.

It gets stuck mostly on ZookeeperCacheTest, though I have also seen it in other places.

https://builds.apache.org/job/pulsar-pull-request/1941/console

[INFO] Running org.apache.pulsar.zookeeper.ZookeeperCacheTest
Build timed out (after 100 minutes). Marking the build as aborted.
...

merlimat commented Mar 6, 2018

Another failure, now with the test timeout:

Error Message

Method org.apache.pulsar.zookeeper.ZookeeperCacheTest.testChildrenCache() didn't finish within the time-out 10000

Stacktrace

org.testng.internal.thread.ThreadTimeoutException: Method org.apache.pulsar.zookeeper.ZookeeperCacheTest.testChildrenCache() didn't finish within the time-out 10000
	at java.util.concurrent.ThreadPoolExecutor.tryTerminate(ThreadPoolExecutor.java:713)
	at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1014)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)


merlimat commented Mar 6, 2018

It appears this started happening after the change from OrderedExecutor to OrderedScheduler. The executor rejects new tasks once 1 task is already queued.

16:34:52.236 [main] ERROR org.apache.pulsar.zookeeper.ZooKeeperCache - Failed to updated zk-cache /test on zk-watch Queue at limit of 1 items
java.util.concurrent.RejectedExecutionException: Queue at limit of 1 items
	at org.apache.bookkeeper.common.util.BoundedScheduledExecutorService.checkQueue(BoundedScheduledExecutorService.java:138) ~[bookkeeper-server-shaded-4.7.0-SNAPSHOT.jar:4.7.0-SNAPSHOT]
	at org.apache.bookkeeper.common.util.BoundedScheduledExecutorService.submit(BoundedScheduledExecutorService.java:94) ~[bookkeeper-server-shaded-4.7.0-SNAPSHOT.jar:4.7.0-SNAPSHOT]
	at org.apache.bookkeeper.common.util.OrderedScheduler.submitOrdered(OrderedScheduler.java:327) ~[bookkeeper-server-shaded-4.7.0-SNAPSHOT.jar:4.7.0-SNAPSHOT]
	at org.apache.pulsar.zookeeper.ZooKeeperCache.process(ZooKeeperCache.java:135) ~[classes/:?]
	at org.apache.pulsar.zookeeper.LocalZooKeeperCache.process(LocalZooKeeperCache.java:63) ~[classes/:?]
	at org.apache.pulsar.zookeeper.ZooKeeperChildrenCache.process(ZooKeeperChildrenCache.java:98) ~[classes/:?]
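
The rejection above can be reproduced in isolation. The following is a minimal sketch that uses only the JDK's `ThreadPoolExecutor` with a bounded `ArrayBlockingQueue` as a stand-in; it is not BookKeeper's `BoundedScheduledExecutorService`, and the class and method names (`BoundedQueueDemo`, `countRejections`) are hypothetical. It assumes a single worker thread that is busy while further tasks arrive, which is roughly the situation the stack trace suggests.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {

    /**
     * Submits {@code tasks} blocking tasks to a single-threaded pool whose
     * queue holds at most {@code queueLimit} pending tasks, and returns how
     * many submissions were rejected.
     */
    static int countRejections(int tasks, int queueLimit) throws InterruptedException {
        // Keeps every task blocked until all submissions have been attempted.
        CountDownLatch release = new CountDownLatch(1);
        // One worker thread, bounded queue, default AbortPolicy (throws on overflow).
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(queueLimit));
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    executor.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // The worker is busy and the queue is full: the task is dropped.
                    rejected++;
                }
            }
        } finally {
            release.countDown();
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected with queue limit 1: " + countRejections(3, 1));
        System.out.println("rejected with queue limit 16: " + countRejections(3, 16));
    }
}
```

With a queue limit of 1, the first task goes straight to the worker, the second waits in the queue, and the third is rejected. A caller that only logs the exception never runs the rejected task, which would be consistent with a test waiting for a cache update that never arrives.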

merlimat added a commit to merlimat/pulsar that referenced this issue Mar 6, 2018
merlimat pushed a commit that referenced this issue Mar 6, 2018
…1332)

* Move pulsar functions dependency version to root pom and remove duplicated license headers

This addresses some comments in pulsar functions PR #1314

* shade worker

* Fix broken master

* Upgrade the bookkeeper storage client dependency to the official bookkeeper version

This removes the temp dependency in `pulsar-functions-instance`

* set `protobuf2.version` in pulsar-common

* provide a shaded worker

* include worker dependency at broker

* Embedded function worker at broker

* rename 'function worker' to 'functions worker'

* add "--no-functions-worker" for pulsar-client-cpp tests

* Integrate function cli into pulsar-admin cli

- rename `pulsar-client-tools-shaded` to `pulsar-client-admin-shaded-for-functions`, because this module is used by functions only to avoid protobuf conflicts
- move protobuf3 references to Utils, so it won't be referenced outside of pulsar-functions
- integrate function cli into pulsar-admin cli

* Merge pulsar-functions dist package into pulsar binary distribution

* Fix license header issues

* Fixed ZK cache test executor configuration.

Fixes #1338
wolfstudy pushed a commit that referenced this issue Oct 29, 2020
…n instance class path (#7844)

Fixes #1338


### Motivation


Currently, the function worker uses its own classpath to configure the classpath of the function instance (runner). So when the broker (function worker) uses an image that is different from the function instance (runner) image in the Kubernetes runtime, the classpath is wrong and the function instance cannot load the instance classes.


### Modifications

Add a function instance class path entry to the Kubernetes runtime config, and construct the function launch command accordingly.


### Verifying this change

- [X] Make sure that the change passes the CI checks.


This change is already covered by existing tests, such as KubernetesRuntimeTest.


### Does this pull request potentially affect one of the following parts:

  No

### Documentation

  - Does this pull request introduce a new feature? No


Co-authored-by: Yong Zhang <zhangyong1025.zy@gmail.com>
wolfstudy pushed a commit that referenced this issue Oct 30, 2020
…n instance class path (#7844)

(cherry picked from commit 7285380)
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this issue Nov 13, 2020
…n instance class path (apache#7844)

flowchartsman pushed a commit to flowchartsman/pulsar that referenced this issue Nov 17, 2020
…n instance class path (apache#7844)
