Bug Report
What did you do?
We deployed an Elasticsearch cluster on Kubernetes, managed by the ECK operator v1.6.
What did you expect to see?
Successful memory/license aggregation
What did you see instead? Under which circumstances?
We saw errors in the operator pod's log, and the ConfigMap containing the licensing report was not generated.
After troubleshooting this quite a bit, I could identify the following line as the source of the error:

cloud-on-k8s/pkg/license/aggregator.go
Line 205 in cd6ce28
Because other environments worked fine, we had trouble identifying the root cause of the problem. I have since learned that the memory aggregation reads the used memory from different sources, depending on what information is available: in the environments that worked fine, we had resources.limits set. If I understand it correctly, JAVA_OPTS seems to be the last resort.
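To make that concrete, here is a minimal, self-contained Go sketch of that fallback as I understand it. The function names and the byte conversion are my own illustration, not the operator's actual code; the regex is, as far as I can tell, the pattern at the line referenced above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Pattern used by the aggregator to read -Xmx, as far as I can tell.
var maxHeapRegex = regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)(?:\s.+|$)`)

// maxHeapFromJavaOpts is a hypothetical stand-in for the operator's parsing
// step: it extracts the -Xmx value from a JVM options string and converts it
// to bytes.
func maxHeapFromJavaOpts(javaOpts string) (int64, error) {
	m := maxHeapRegex.FindStringSubmatch(javaOpts)
	if m == nil {
		return 0, fmt.Errorf("cannot extract max jvm heap size from %s", javaOpts)
	}
	n, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return 0, err
	}
	switch strings.ToLower(m[2]) {
	case "k":
		n <<= 10
	case "m":
		n <<= 20
	case "g":
		n <<= 30
	}
	return n, nil
}

// containerMemoryBytes sketches the source selection described above: prefer
// the container's resources.limits when set, and only fall back to the JVM
// options as a last resort.
func containerMemoryBytes(limitBytes int64, javaOpts string) (int64, error) {
	if limitBytes > 0 {
		return limitBytes, nil
	}
	return maxHeapFromJavaOpts(javaOpts)
}

func main() {
	// Without resources.limits, the JVM options are the only source.
	fmt.Println(containerMemoryBytes(0, "-Xms1500m -Xmx1500m"))  // 1572864000 <nil>
	fmt.Println(containerMemoryBytes(0, "-Xms1500m -Xmx1500m ")) // fails: trailing space
}
```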
(I only included the podTemplate part of our Elasticsearch resource, because I've been able to identify that as the issue.)

Environment

ECK version: 1.6.0
Kubernetes information:

Logs:
{"log.level":"error","@timestamp":"2021-06-09T14:37:34.425Z","log.logger":"resource","message":"Failed to report licensing information","service.version":"1.6.0+8326ca8a","service.type":"eck","ecs.version":"1.4.0","error":"failed to aggregate Elasticsearch memory: cannot extract max jvm heap size from -Xms1500m -Xmx1500m ","errorVerbose":"cannot extract max jvm heap size from -Xms1500m -Xmx1500m \ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.memFromJavaOpts\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:213\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.containerMemLimits\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:186\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.Aggregator.aggregateElasticsearchMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:66\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.Aggregator.AggregateMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:46\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Get\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:69\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Report\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:58\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Start\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:49\ngh.neting.cc/elastic/cloud-on-k8s/cmd/manager.asyncTasks.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/cmd/manager/main.go:612\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371\nfailed to aggregate Elasticsearch memory\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.Aggregator.aggregateElasticsearchMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:73\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.Aggregator.AggregateMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:46\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Get\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:69\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Report\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:58\ngh.neting.cc/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Start\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:49\ngh.neting.cc/elastic/cloud-on-k8s/cmd/manager.asyncTasks.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/cmd/manager/main.go:612\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371","error.stack_trace":"github.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Start\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:51\ngh.neting.cc/elastic/cloud-on-k8s/cmd/manager.asyncTasks.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/cmd/manager/main.go:612"}
First of all, I'm not quite sure whether this qualifies as a bug in your eyes. To me it's an ugly error that seems to have a fairly easy fix. I would propose switching the regex from `-Xmx([0-9]+)([gGmMkK]?)(?:\s.+|$)` to `-Xmx([0-9]+)([gGmMkK]?)`, as I can't really identify the need for the non-capturing group. (Anyone?) I would be happy to provide a PR.
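To illustrate the difference, here is a small standalone snippet applying both patterns to the exact JAVA_OPTS value from the log above (just the two regexes, not the operator's code):

```go
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// JAVA_OPTS value from the error log; note the trailing space.
	opts := "-Xms1500m -Xmx1500m "

	current := regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)(?:\s.+|$)`)
	proposed := regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)`)

	// The trailing space defeats the (?:\s.+|$) alternation: $ cannot match
	// because a space is still left, and \s.+ cannot match because nothing
	// follows that space.
	fmt.Println(current.FindStringSubmatch(opts))  // [] (no match -> the error above)
	fmt.Println(proposed.FindStringSubmatch(opts)) // [-Xmx1500m 1500 m]
}
```

If the group is meant to make sure the value is a complete token (so that something like -Xmx1500mb is rejected), a smaller guard such as `(?:\s|$)` would, as far as I can tell, still do that while tolerating trailing whitespace, but that's just a guess on my part.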
For anyone else running into this, a quick workaround is to remove the trailing whitespace at the end of your JAVA_OPTS definition, i.e. change "-Xms1500m -Xmx1500m " to "-Xms1500m -Xmx1500m".