I'm using the latest nightly build of Filebeat and Kafka 0.9.0.1, both on the same machine.
My setup is that Filebeat reads logs from /var/log/messages and publishes them to Kafka.
The whole thing works fine for about 30 minutes, and after that Filebeat suddenly stops being able to communicate with Kafka.
Restarting the filebeat service fixes the issue.
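For reference, the relevant part of the config has roughly this shape (the broker address matches the one in the logs below; the topic name is just a placeholder, and the exact key layout depends on the Filebeat version):

    filebeat:
      prospectors:
        - paths:
            - /var/log/messages
    output:
      kafka:
        hosts: ["172.24.33.7:9092"]
        topic: "syslog"        # placeholder topic name
        required_acks: 1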
Below are the logs I'm getting at the moment the problem occurs:
2016-04-18T13:20:14+03:00 DBG output worker: publish 2 events
2016-04-18T13:20:14+03:00 DBG guaranteed flag is set
2016-04-18T13:20:14+03:00 DBG publish events with attempts=-1
2016-04-18T13:20:14+03:00 DBG forwards msg with attempts=-1
2016-04-18T13:20:14+03:00 DBG message forwarded
2016-04-18T13:20:14+03:00 DBG events from worker worker queue
2016-04-18T13:20:14+03:00 DBG publish events
2016-04-18T13:20:14+03:00 WARN producer/broker/[1 859530387616] state change to [closing] because %!s(MISSING)
2016-04-18T13:20:14+03:00 WARN Closed connection to broker [172.24.33.7:9092]
2016-04-18T13:20:14+03:00 DBG Kafka publish failed with: EOF
2016-04-18T13:20:14+03:00 DBG Kafka publish failed with: EOF
2016-04-18T13:20:14+03:00 DBG finished kafka batch
2016-04-18T13:20:14+03:00 DBG handlePublishEventsResult
2016-04-18T13:20:14+03:00 DBG handle publish error: EOF
2016-04-18T13:20:14+03:00 INFO Error publishing events (retrying): EOF
After that, the same kind of log lines repeat endlessly.
Observed on CentOS 7.
Might not be so obvious to reproduce, as for me it only happens at "random" moments.
Previously discussed here: https://discuss.elastic.co/t/filebeat-loses-connection-to-kafka/47660
I've been digging through the sarama library (the third-party library used to connect to Kafka). The library is supposed to reconnect automatically on failure.
At this line:
2016-04-18T13:20:14+03:00 WARN producer/broker/[1 859530387616] state change to [closing] because %!s(MISSING)
the broker detected some error (unfortunately the error message is not really helpful), closed the broker connection, and removed the broker from the list of brokers used by the active client. The next time the client tries to push some events, it looks up the known/cached brokerProducer, finds none, and therefore looks up the leading broker. At this point the library tries to reconnect to Kafka.
With no additional error messages saying otherwise, it looks like the reconnect was successful. But I have no idea where the subsequent EOFs are coming from.
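To check whether the metadata/leader lookup itself succeeds against this broker once the producer has dropped its connection, here is a rough standalone sketch using sarama directly (outside Filebeat; the topic name and partition are assumptions):

    package main

    import (
        "log"

        "github.com/Shopify/sarama"
    )

    func main() {
        cfg := sarama.NewConfig()
        // Broker address taken from the log lines above.
        client, err := sarama.NewClient([]string{"172.24.33.7:9092"}, cfg)
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        // Refresh metadata and ask for the partition leader; this is essentially
        // what the producer does after it has closed a broker connection.
        if err := client.RefreshMetadata("syslog"); err != nil {
            log.Fatal(err)
        }
        leader, err := client.Leader("syslog", 0)
        if err != nil {
            log.Fatal(err)
        }
        log.Printf("leader for syslog/0: %s", leader.Addr())
    }

If this keeps working while Filebeat is stuck in the EOF loop, the problem is more likely in how the producer reuses the recreated connection than in the metadata lookup itself.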
How long do you wait before restarting Filebeat? Do you use TLS when connecting filebeat->kafka? Anything in the Kafka logs?