JVM Settings #112
Comments
agreed :)
Do you mean in the README? I think it will always have to be tweaked per installation based on the load etc. In #44 (comment) I'm looking at how to minimize the footprint, for testing.
I'm actually pretty interested in this topic (memory and cpu requests/limits for java apps in kubernetes) and very interested in your opinions. Currently, I've been following advice from jsravn from his experience building a kubernetes stack from scratch. He says:
I would be very interested in other people's experience and opinions.
Great input. The first task I had to battle with was selecting a JVM vendor and version. Am I wrong to assume that this choice affects how the memory flags behave? The default Kafka image here uses Zulu because Confluent does. I'm obviously not a lawyer, but I think anyone who builds their own images can use Oracle's server-jre. Are there other options? Is openjdk:8-jre-slim a better option than Zulu? Or should we go straight to Java 9 before we start tweaking, given the improvement in Kafka SSL performance? I imagine the Java community has efforts underway to tune JVMs for use in Docker, and even with Kubernetes memory limits, but I haven't researched that topic yet. Now back to the JVM args: I battled with them around #49 (comment) and was quite unhappy with the outcome (I hadn't seen jsravn's advice at that time). After that I didn't dare to merge anything with memory limits on broker pods. But I'm willing to un-conclude that :)
With regard to Kafka, doing whatever Confluent does is generally the best advice. Sadly, it is probably best to avoid Oracle for legal reasons and pick Zulu OpenJDK 8. JDK 9 shows great promise and is fun for experiments, and maybe for individual apps in production, but I would not run critical infrastructure like Kafka in prod on it until Confluent does. I can only speak to tuning Java web apps in Kubernetes, and I worry that Kafka is different enough to invalidate that experience.
Haven't read it yet, but https://dzone.com/articles/why-my-java-application-is-oomkilled could be relevant.
Interesting input for #112, for use with broker and zk pods in addition to KAFKA_HEAP_OPTS.
of the memory limit as "Max. Heap Size (Estimated)" Reducing limits as experiment for #112.
The results of #128 will be very interesting. There, for the I'll keep a close watch on the Prometheus metric |
@solsson You have a Prometheus exporter for the image now? I had to extend your image to add one, which is pretty important in my setup. Thanks
I didn't add the exporter to the image. It's a separate container in the pod, added using kubectl patch, see #128. Did you find an exporter that plugs into Kafka without JMX? I ran a couple of experiments based on https://github.com/arnobroekhof/kafka-http-metrics-reporter, https://github.com/prepor/kafka-prometheus and https://github.com/prometheus/client_java/blob/master/simpleclient_dropwizard but I failed to get any useful results.
@solsson No, I currently use the JMX exporter. Is there a downside to using the JMX exporter? I'm not hugely experienced with the JVM. Thanks. On a side note, some of my consumers seem to lag behind when there is a lot of data (20k/sec), yet they aren't using much CPU or network. Obviously the consumer lib etc. could affect this, but on the actual Kafka side is there anything I should be monitoring to possibly see what's slowing it down? Thanks x 2
@qrpike Ok, I think we're all ending up with JMX as the only viable alternative. As a sidecar it's relatively resource heavy, and before #128 I had failed to set a reasonable memory limit that didn't affect stability of the kafka pod.
This is a great topic and quite core to Kafka ops. I'm no expert but I'd like to see the community discuss this. I suggest you open a new issue with the stats you have. Please clarify whether it is the CPU and network usage of the consuming pods that is low, and which metrics you have from the Kafka pods and nodes.
which probably makes the discussion in #112 outdated. Instead we could discuss -XX:MaxRAMPercentage and related flags from http://bugs.java.com/view_bug.do?bug_id=JDK-8186248
See f9162a6 and don't hesitate to open new discussions.
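To see what a flag such as -XX:MaxRAMPercentage actually does inside a pod, one option is to ask the JVM which maximum heap it settled on and compare that with the pod's memory limit. A minimal sketch, not taken from this thread: the class name `HeapReport` and the convention of passing the limit in bytes as the first argument are assumptions made for illustration.

```java
// Hypothetical helper (not part of this repository): prints the max heap the
// JVM chose and, if a limit in bytes is given, what share of it that is.
class HeapReport {
    // Share of limitBytes that maxHeapBytes represents, in percent.
    static double percentOfLimit(long maxHeapBytes, long limitBytes) {
        if (limitBytes <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return 100.0 * maxHeapBytes / limitBytes;
    }

    // Whole mebibytes, rounded down, for readable output.
    static String toMiB(long bytes) {
        return (bytes / (1024 * 1024)) + " MiB";
    }

    public static void main(String[] args) {
        // Max heap the JVM will attempt to use, as decided at startup.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + toMiB(maxHeap));
        if (args.length > 0) {
            // Assumption: first argument is the pod memory limit in bytes.
            long limit = Long.parseLong(args[0]);
            System.out.printf("Share of limit: %.1f%%%n", percentOfLimit(maxHeap, limit));
        }
    }
}
```

Running it once with and once without the flag, for example `java -XX:MaxRAMPercentage=50.0 HeapReport 1073741824` in a pod with a 1 GiB limit, would show whether the flag takes effect. The flag is only recognized by JVM builds that include the change tracked in JDK-8186248.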
Since running a JVM in containers is already somewhat scary, setting limits on the pods is always a good idea. However, the docs don't mention this.
Adding a:
In the docs may be a good idea. JVMs tend to consume memory unless told otherwise.
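As a rough way to observe how much memory a JVM has claimed when choosing a pod limit, the standard `java.lang.management` API can report committed heap and non-heap memory. This is only an illustrative sketch: the class name `JvmFootprint` is hypothetical, and the sum is a lower bound, since the process also uses memory these beans do not track (thread stacks, for example).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports memory the JVM has committed, as seen by the
// platform memory MXBean. The real process footprint can be higher.
class JvmFootprint {
    // Committed heap plus committed non-heap bytes.
    static long committedBytes(MemoryUsage heap, MemoryUsage nonHeap) {
        return heap.getCommitted() + nonHeap.getCommitted();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        System.out.println("Heap committed:     " + heap.getCommitted() + " bytes");
        System.out.println("Non-heap committed: " + nonHeap.getCommitted() + " bytes");
        System.out.println("Total (lower bound): " + committedBytes(heap, nonHeap) + " bytes");
    }
}
```

Comparing this total with the pod's memory limit gives a first indication of how much headroom is left outside the heap.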