Add out_kafka output #94

Closed
salekseev opened this issue Aug 2, 2016 · 57 comments

@salekseev

salekseev commented Aug 2, 2016

Could be based on https://github.com/edenhill/librdkafka, payload should be JSON or MessagePack.

@edsiper edsiper self-assigned this Aug 2, 2016
@edsiper
Member

edsiper commented Aug 2, 2016

Added to the to-do list. thanks

@salekseev salekseev changed the title Please add out_kafka output Add out_kafka output Aug 3, 2016
@panda87

panda87 commented May 7, 2017

A Kafka output would be a very useful feature, since in our case every log goes through Kafka and is then rerouted to its stateful service, such as Elasticsearch, Hadoop, or another real-time streaming system.
Actually, this is the only feature I'm waiting for before using Fluent Bit, since I want to use Docker with the fluentd log driver and then forward the logs to Kafka.

@edsiper do you have an ETA for when you will be able to start working on it?

@edsiper
Member

edsiper commented May 8, 2017

@panda87 between Q2/Q3

@solsson

solsson commented Jun 14, 2017

Is there ongoing work with this? There is a https://github.com/samsung-cnct/fluent-bit-kafka-output-plugin but it needs more work, configurability in particular. If a native impl is on its way I could try to contribute there instead.

@edsiper
Member

edsiper commented Jun 14, 2017

@solsson not implemented yet, as said, between Q2/Q3.

@solsson

solsson commented Jun 15, 2017

Yes I read so and interpreted it as 2017-06-30T23:59:59 :) So I figured there's some source somewhere. But I will give the go impl a chance.

@edsiper
Member

edsiper commented Jul 24, 2017

FYI:

On GIT Master I've pushed a new Kafka REST output plugin.

The plugin is still under development but functional, any feedback is welcome. You can find a configuration example here

@panda87

panda87 commented Jul 24, 2017

Thanks @edsiper
What are the benefits of working against Kafka REST instead of the native TCP protocol?

@solsson

solsson commented Jul 24, 2017

> On GIT Master I've pushed a new Kafka REST output plugin.

Interesting. I'll give it a try once there's a new build of fluent/fluent-bit:0.12-dev.

@edsiper
Member

edsiper commented Jul 24, 2017

@panda87 : both plugins will be implemented: kafka-rest and kafka (native TCP), this is a snapshot of kafka-rest

@solsson fluent/fluent-bit:0.12-dev is being built right now (it should take ~20 min)

@panda87

panda87 commented Jul 24, 2017

Thanks @edsiper

solsson added a commit to Yolean/fluent-bit-kubernetes-logging that referenced this issue Jul 25, 2017
but unsuccessfuly. No trace of connections to kafka.

See fluent/fluent-bit#94
@solsson

solsson commented Jul 25, 2017

I quickly tested this as a branch from https://github.com/fluent/fluent-bit-kubernetes-daemonset/, with Yolean/kubernetes-kafka#45, but no luck. Logs only say:

[2017/07/25 06:10:33] [ info] [engine] started
[2017/07/25 06:10:33] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
[2017/07/25 06:10:33] [ info] [filter_kube] local POD info OK
[2017/07/25 06:10:33] [ info] [filter_kube] testing connectivity with API server...
[2017/07/25 06:10:33] [ info] [filter_kube] API server connectivity OK

No trace of connections to kafka rest. I don't have time to dig into this now. Maybe it has to do with the upgrade from 0.11 to 0.12.

Edit: The plugin works just fine. I had two errors that obscured each other, and because no logs were read I got no error message in the container's stdout. Diff documented as Yolean/fluent-bit-kubernetes-logging#1.

@solsson

solsson commented Jul 25, 2017

I get a lot of logs in Kafka from the fluent-bit container itself saying

[ warn] [out_kafka_rest] http_do=-1
[error] [http_client] broken connection to rest.kafka.svc.cluster.local:80

correctly logged (in the Kafka sense) as

{"@timestamp":"2017-07-25T11:47:39.53173578Z","_fluent-tag":"kube.var.log.containers.fluent-bit-3tk62_kube-system_fluent-bit-9524b0051905794bc305d80567fb157524ad1a888ee76c07f6d9135ae1a2bc1b.log","log":"[2017/07/25 11:47:39] [ warn] [out_kafka_rest] http_do=-1\\n","stream":"stderr","time":"2017-07-25T11:47:39.053173578Z","kubernetes":{"pod_name":"fluent-bit-3tk62","namespace_name":"kube-system","container_name":"fluent-bit","docker_id":"9524b0051905794bc305d80567fb157524ad1a888ee76c07f6d9135ae1a2bc1b","pod_id":"ad6edb72-712e-11e7-9508-080027faebd3","labels":{"controller-revision-hash":"2302797867","k8s-app":"fluent-bit-logging","kubernetes.io/cluster-service":"true","pod-template-generation":"1","version":"v1"},"annotations":{"kubernetes.io/created-by":"{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"DaemonSet\\\",\\\"namespace\\\":\\\"kube-system\\\",\\\"name\\\":\\\"fluent-bit\\\",\\\"uid\\\":\\\"ad6ab493-712e-11e7-9508-080027faebd3\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"782159\\\"}}\\n"}}}
{"@timestamp":"2017-07-25T11:47:39.54767537Z","_fluent-tag":"kube.var.log.containers.fluent-bit-3tk62_kube-system_fluent-bit-9524b0051905794bc305d80567fb157524ad1a888ee76c07f6d9135ae1a2bc1b.log","log":"[2017/07/25 11:47:39] [error] [http_client] broken connection to rest.kafka.svc.cluster.local:80 ?\\n","stream":"stderr","time":"2017-07-25T11:47:39.054767537Z","kubernetes":{"pod_name":"fluent-bit-3tk62","namespace_name":"kube-system","container_name":"fluent-bit","docker_id":"9524b0051905794bc305d80567fb157524ad1a888ee76c07f6d9135ae1a2bc1b","pod_id":"ad6edb72-712e-11e7-9508-080027faebd3","labels":{"controller-revision-hash":"2302797867","k8s-app":"fluent-bit-logging","kubernetes.io/cluster-service":"true","pod-template-generation":"1","version":"v1"},"annotations":{"kubernetes.io/created-by":"{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"DaemonSet\\\",\\\"namespace\\\":\\\"kube-system\\\",\\\"name\\\":\\\"fluent-bit\\\",\\\"uid\\\":\\\"ad6ab493-712e-11e7-9508-080027faebd3\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"782159\\\"}}\\n"}}}

@edsiper Do you think this is an issue with my configuration, or a non-error being logged by the output plugin?

@edsiper
Member

edsiper commented Jul 25, 2017

@solsson the error is that Fluent Bit is not able to connect to rest.kafka.svc.cluster.local TCP port 80
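
One way to confirm that diagnosis is to try opening the same TCP connection by hand. The following is only a minimal sketch, not part of Fluent Bit: `can_connect` is a hypothetical helper name, and the check is only meaningful when it runs from the same pod or network namespace as the Fluent Bit container.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call, using the host and port from the error message above:
# can_connect("rest.kafka.svc.cluster.local", 80)
```

A `False` result points at service discovery or the REST proxy itself rather than at the output plugin.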

@edsiper
Member

edsiper commented Jul 26, 2017

@solsson

solsson commented Oct 2, 2017

What's the status on out_kafka (native)? Alternatively, what's the status on fluent-bit-go wrt 0.12?

@solsson

solsson commented Oct 6, 2017

I saw fluent/fluent-bit-go#6 (comment) now.

@StevenACoffman
Contributor

For what it is worth, we would really like to produce directly to kafka instances without the REST proxy.

@solsson

solsson commented Nov 8, 2017

Yeah, me too, feels like the extra layer is both a performance and a reliability risk. In the meantime I've been experimenting with filebeat's kafka support, it's on the broker level, though memory footprint looks like 10x that of fluent-bit: Yolean/kubernetes-kafka#88

As a fun experiment I've also experimented with tail -f + kafkacat, which runs on a tenth of filebeat's memory but of course doesn't remember the position and can't add k8s metadata.

Didn't have the time to try more fluent-bit-go, and from the history of it I'd have to port such code for new fluent-bit minor releases.

@StevenACoffman
Contributor

Reliability is our main concern. If we have an init container that can bootstrap the Kafka cluster IP addresses, the normal Kafka producer will adjust as Kafka nodes come and go. It looks like fluent-bit-go is being maintained, and the uncertainty around this ticket is depressing maintenance of fluent-bit-kafka-output-plugin. Thanks for the pointer.

@StevenACoffman
Contributor

StevenACoffman commented Nov 8, 2017

@solsson Did you notice this fluent/fluentd-kubernetes-daemonset#34? I was thinking of building fluentd 0.14.22 now that it's stable and comparing its memory and CPU usage to filebeat and fluent-bit.

@solsson

solsson commented Nov 8, 2017

@StevenACoffman Actually I didn't try fluentd, as I was so impressed with the scope and footprint of fluent-bit, and there was a kafka output scheduled for "between Q2/Q3" ... and then ofc I also needed to work on how to process those logs once they're in Kafka :)

@edsiper
Member

edsiper commented Nov 8, 2017

Hello everyone,

Thanks for sharing your comments and interest. I wanted to let you know that the plugin is already in the development phase, so I will keep you posted about it :)

@rhysmccaig

Great news! I would love to see this feature.

@StevenACoffman
Contributor

@edsiper just checking in. For our planning, does it still seem likely that the retry logic will be added in the near term?

@edsiper
Member

edsiper commented Dec 13, 2017

@StevenACoffman actually I am working on that at the moment.

@edsiper
Member

edsiper commented Dec 13, 2017

New version has been pushed:

fluent/fluent-bit-kafka-dev:0.4

Please test and send me some feedback (retry logic in place)

solsson added a commit to Yolean/fluent-bit-kubernetes-logging that referenced this issue Dec 14, 2017
solsson added a commit to Yolean/fluent-bit-kubernetes-logging that referenced this issue Dec 14, 2017
@edsiper
Member

edsiper commented Dec 22, 2017

@StevenACoffman @solsson

any feedback on the last version provided ?

@StevenACoffman
Contributor

Sorry I haven't had a chance to set up a public repo to reliably and reproducibly test it. I will try to do that today

@StevenACoffman
Contributor

StevenACoffman commented Jan 15, 2018

@edsiper Is there any way to pass debug flags to librdkafka? With kafkacat it's very useful to add -d broker,topic if problems occur.

I was having some issues that were internal to kafka and it was difficult to sort out whether it was the retry logic or kafka's config that was problematic.

Overall impression is that it is pretty solid, but I haven't fully explored different retry limit scenarios.

@edsiper
Member

edsiper commented Jan 15, 2018

@solsson would you please re-submit the PR on 0.13-dev branch https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev ?

Note: I am pushing a new image and docs for Fluent Bit 0.13-dev on Kubernetes, the old kafka image will not be updated, please check for more details here:

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev

@KaneWu0

KaneWu0 commented Jan 16, 2018

@edsiper I would really like to know when v0.13 will be released.

@edsiper
Member

edsiper commented Jan 16, 2018

@wukq I would like to have it released on January 31 (I am focusing on that), but as you know everything will depend on general feedback and the improvements required.

For now I will keep updating the 0.13-dev image, if more people test it, we will be more confident about the current status

solsson added a commit to Yolean/fluent-bit-kubernetes-logging that referenced this issue Jan 16, 2018
@solsson

solsson commented Jan 16, 2018

> @solsson would you please re-submit the PR on 0.13-dev branch https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev ?

@edsiper done in fluent/fluent-bit-kubernetes-logging#16

Fantastic work with the new version. I've enabled the Prometheus exporter in our QA and fluentbit_input_bytes_total is immensely useful to spot unexpectedly high volumes. Also fluentbit_output_retries_total should help us track issues with the Retry_Limit tests from fluent/fluent-bit-kubernetes-logging#11 (comment). Haven't had any retries yet though :)
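
For anyone who wants to watch those two counters outside of a Prometheus server, here is a rough sketch that sums them from the text exposition format. The sample payload, its label values, and the `sum_counters` helper are made up for illustration; only the two metric names come from this thread. The regex is a simplification and does not handle label values that contain `}`.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

# Matches sample lines such as: metric_name{label="x"} 123 1516100000000
_SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)'
)


def sum_counters(text: str, wanted: Iterable[str]) -> Dict[str, float]:
    """Sum the values of the wanted metrics across all label sets."""
    wanted_names = set(wanted)
    totals: Dict[str, float] = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if match and match.group("name") in wanted_names:
            totals[match.group("name")] += float(match.group("value"))
    return dict(totals)


# Hypothetical exporter output, invented for this example.
SAMPLE = """\
# TYPE fluentbit_input_bytes_total counter
fluentbit_input_bytes_total{name="tail.0"} 1024 1516100000000
fluentbit_input_bytes_total{name="tail.1"} 512 1516100000000
# TYPE fluentbit_output_retries_total counter
fluentbit_output_retries_total{name="kafka.0"} 0 1516100000000
"""

totals = sum_counters(
    SAMPLE, ["fluentbit_input_bytes_total", "fluentbit_output_retries_total"]
)
```

Comparing `totals` between two scrapes gives the input volume and retry count for that interval.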

@edsiper
Member

edsiper commented Jan 30, 2018

FYI: the 0.13-dev image has been updated to 0.6:

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev (fixed)

note: this one fixes a crash found in the Prometheus exporter.

@edsiper
Member

edsiper commented Feb 6, 2018

FYI: 0.13-dev image has been updated to 0.7:

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev

Changes in this version: fixes in memory handling, a new configuration option for out_kafka to adjust rdkafka internals, and in_tail fixes.

@solsson

solsson commented Feb 6, 2018

IMO out_kafka is feature complete now.

@StevenACoffman
Contributor

I agree. The newest improvements (notably the new configuration option for out_kafka to adjust rdkafka internals) to be able to limit kafka buffer make it a complete feature now. It has been pretty solid in my testing over the last 24 hours, but I'm going to try putting a few million messages through it, and see how it handles.

@edsiper
Member

edsiper commented Feb 7, 2018

@solsson @StevenACoffman should I set these settings by default in the plugin ? what do you think ?:

  • rdkafka.log.connection.close false
  • rdkafka.queue.buffering.max.kbytes 10240
  • rdkafka.request.required.acks 1

note: of course they can be overridden by config any time
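
To make the override mechanism concrete, here is a small sketch that renders an `[OUTPUT]` section carrying the three settings above. It assumes the classic configuration layout with the `Name`, `Match`, `Brokers` and `Topics` keys; the broker address and topic name are placeholders, and `render_kafka_output` is a hypothetical helper, not something Fluent Bit ships.

```python
from typing import Mapping


def render_kafka_output(brokers: str, topics: str, rdkafka: Mapping[str, object]) -> str:
    """Render a classic-mode [OUTPUT] section for the kafka plugin."""
    lines = [
        "[OUTPUT]",
        "    Name    kafka",
        "    Match   *",
        f"    Brokers {brokers}",
        f"    Topics  {topics}",
    ]
    # Each librdkafka property is passed through with an "rdkafka." prefix.
    for key, value in rdkafka.items():
        lines.append(f"    rdkafka.{key} {value}")
    return "\n".join(lines) + "\n"


# The three overrides discussed above; broker and topic are placeholders.
SUGGESTED = {
    "log.connection.close": "false",
    "queue.buffering.max.kbytes": 10240,
    "request.required.acks": 1,
}

section = render_kafka_output("kafka.example.internal:9092", "logs", SUGGESTED)
print(section)
```

Any other property from librdkafka's CONFIGURATION.md could be added to the mapping in the same way.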

@StevenACoffman
Contributor

I can see people adjusting them, but those look pretty good as defaults to me.

@solsson

solsson commented Feb 8, 2018

I'm unsure. For users of the Kubernetes manifests it's quite transparent that defaults are from https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md and you see suggested overrides in plain text.

Without the sample manifests I'd say you definitely want rdkafka.log.connection.close to avoid support requests. The [error] level log entries for a non-issue fooled me for sure. Maybe the same goes for rdkafka.queue.buffering.max.kbytes, if we can confirm that it solves the memory spikes. But the value I set was just a guess. On the other hand, again for transparency this might be better placed in the output plugin's docs.

@edsiper
Member

edsiper commented Feb 9, 2018

thanks for the feedback. I will document that in the plugin docs.

@edsiper
Member

edsiper commented Feb 9, 2018

btw, are we ok to close this one ?

fujimotos pushed a commit to fujimotos/fluent-bit that referenced this issue Jul 22, 2019
Signed-off-by: Takahiro YAMASHITA <nokute78@gmail.com>
8 participants
@salekseev @solsson @edsiper @StevenACoffman @rhysmccaig @panda87 @KaneWu0 and others