Timeout while producing message #173

Closed · sschepens opened this issue Jan 25, 2017 · 4 comments
Labels: deprecated/question

@sschepens (Contributor)

Last night we experienced a couple of timeouts (1s) while producing messages to the broker, and we found that at that exact time the broker triggered a ledger close. Could these two events be related?

Producer logs:

[log_time:08:49:16.219] [thread:pulsar-timer-7-1] [level:INFO ] [logger:ProducerImpl] - [persistent://fury/global/bf8dee5d854a4267a4b8ba7546d7d5d5-mediations-tasks/bf8dee5d854a4267a4b8ba7546d7d5d5-mediations-tasks-partition-5] [default-cluster1-4-3532] Message send timed out. Failing 1 messages

java.util.concurrent.CompletionException: com.yahoo.pulsar.client.api.PulsarClientException$TimeoutException: Could not send message to broker within given timeout
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at com.yahoo.pulsar.client.impl.ProducerImpl$1.sendComplete(ProducerImpl.java:152)
	at com.yahoo.pulsar.client.impl.ProducerImpl.lambda$failPendingMessages$6(ProducerImpl.java:932)
	at java.lang.Iterable.forEach(Iterable.java:75)
	at com.yahoo.pulsar.client.impl.ProducerImpl.failPendingMessages(ProducerImpl.java:927)
	at com.yahoo.pulsar.client.impl.ProducerImpl.lambda$failPendingMessages$7(ProducerImpl.java:950)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:312)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	at java.lang.Thread.run(Thread.java:745)
Caused by: com.yahoo.pulsar.client.api.PulsarClientException$TimeoutException: Could not send message to broker within given timeout
	at com.yahoo.pulsar.client.impl.ProducerImpl.run(ProducerImpl.java:904)
	at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:588)
	at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:662)
	at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:385)
	... 2 more

Broker logs:

2017-01-25 08:49:15,142 - INFO  - [bookkeeper-ml-workers-36-1:OpAddEntry@155] - [fury/global/bf8dee5d854a4267a4b8ba7546d7d5d5-mediations-tasks/persistent/bf8dee5d854a4267a4b8ba7546d7d5d5-mediations-tasks-partition-5] Closing ledger 6999 for being full

2017-01-25 08:50:19,939 - INFO  - [main-EventThread:ManagedLedgerImpl@937] - [fury/global/bf8dee5d854a4267a4b8ba7546d7d5d5-mediations-tasks/persistent/bf8dee5d854a4267a4b8ba7546d7d5d5-mediations-tasks-partition-5] Created new ledger 7740

What's also curious is that there's a one-minute gap between the ledger-close and new-ledger-open events.
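
In the meantime, one client-side mitigation would be to widen the producer send timeout so an in-flight message can ride out a slow rollover instead of failing. A minimal sketch against the com.yahoo.pulsar client API from the stack trace above; the service URL, topic, and the 30s value are placeholders, not a recommendation:

```java
import java.util.concurrent.TimeUnit;

import com.yahoo.pulsar.client.api.Producer;
import com.yahoo.pulsar.client.api.ProducerConfiguration;
import com.yahoo.pulsar.client.api.PulsarClient;

public class SendTimeoutSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.create("pulsar://broker.example.com:6650");

        // Our producers currently use a 1s send timeout, which is what the
        // HashedWheelTimer in the trace above is firing on. Widening it lets
        // pending messages survive the ledger close/open gap.
        ProducerConfiguration conf = new ProducerConfiguration();
        conf.setSendTimeout(30, TimeUnit.SECONDS);

        Producer producer = client.createProducer(
                "persistent://fury/global/sample-ns/sample-topic", conf);

        producer.send("payload".getBytes());

        producer.close();
        client.close();
    }
}
```

This only masks the symptom, of course; the question of why the rollover stalls for a minute still stands.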

@sschepens (Contributor · Author)

@merlimat any idea about this?
I realized brokers use a couple of thread pools to execute tasks such as ledger rolling and other housekeeping synchronously. Could this be related? With many topics and consumers, and many rolling tasks, could those thread pools be introducing some waiting?
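
For anyone digging into the rollover cadence itself, the relevant knobs live in broker.conf. The names and defaults below are taken from the Apache Pulsar broker.conf as I understand it, so treat this as a pointer rather than a verified tuning:

```properties
# A ledger is closed "for being full" once it reaches this many entries
managedLedgerMaxEntriesPerLedger=50000

# Lower and upper bounds on how often a rollover may be triggered
managedLedgerMinLedgerRolloverTimeMinutes=10
managedLedgerMaxLedgerRolloverTimeMinutes=240

# Sizes of the managed-ledger worker/scheduler pools speculated about above
managedLedgerNumWorkerThreads=8
managedLedgerNumSchedulerThreads=8
```

None of these explain a one-minute gap on their own, but they determine how often the close/open path is exercised.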

@ivankelly (Contributor)

@sschepens This could well be down to ZooKeeper being overloaded. Have you experienced this issue again?

ivankelly added the triage/week-35 and deprecated/question labels on Aug 30, 2018
@ivankelly (Contributor)

Closing since there's been no update in over 18 months.

@pushkarsawant

Hi,

We are experiencing the same issue with 2.6.0: intermittent timeouts while producing messages, logged as WARNs.

In the broker logs there are no errors, only INFO messages about closing the existing ledger and opening a new one. In our case the close and open messages are about 2 minutes apart.
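
A stopgap on the client side would be to widen the send timeout so messages stay pending through the close/open gap. A minimal sketch against the 2.6.0 Java client; the service URL, topic, and the 60s value are placeholders (0 would disable the timeout entirely):

```java
import java.util.concurrent.TimeUnit;

import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

public class WidenSendTimeoutSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650")
                .build();

        // The WARN we see is the client-side send timeout firing; the broker
        // itself recovers once the new ledger is created.
        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://public/default/sample-topic")
                .sendTimeout(60, TimeUnit.SECONDS)
                .create();

        producer.send("payload".getBytes());

        producer.close();
        client.close();
    }
}
```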
