License aggregation of java options fails because of trailing white space #4564

Closed
chillTschill opened this issue Jun 10, 2021 · 2 comments · Fixed by #4571
Labels
>bug Something isn't working v1.7.0

Comments

@chillTschill

Bug Report

What did you do?
We deployed an Elasticsearch cluster on k8s, managed by the operator v1.6.

What did you expect to see?
Successful memory/license aggregation

What did you see instead? Under which circumstances?
We saw errors in the log of the operator pod. The config map containing the report wasn't generated.

After quite a bit of troubleshooting, I identified 'func memFromJavaOpts' and the following regex as the issue:

var maxHeapSizeRe = regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)(?:\s.+|$)`)

Because other environments worked fine, we had trouble identifying the root cause of the problem. I have since learned that the memory aggregation draws on different sources depending on what information is available: in the environments that worked fine, we had resources.limits set. If I understand it correctly, JAVA_OPTS seems to be the last resort.
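To make the failure concrete, here is a minimal sketch that runs the regex quoted above against a few option strings. The helper name `matches` is made up for this illustration and is not part of the operator code; only the pattern itself is taken from the report.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the report above.
var maxHeapSizeRe = regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)(?:\s.+|$)`)

// matches is a hypothetical helper for this illustration only.
// It reports whether the pattern finds a max heap size in javaOpts.
func matches(javaOpts string) bool {
	return maxHeapSizeRe.MatchString(javaOpts)
}

func main() {
	for _, opts := range []string{
		"-Xms1500m -Xmx1500m",  // -Xmx at the very end: `$` branch applies
		"-Xmx1500m -Xms1500m",  // -Xmx followed by more options: `\s.+` branch applies
		"-Xms1500m -Xmx1500m ", // trailing space: `\s` matches, but `.+` has nothing left
	} {
		fmt.Printf("%q -> %v\n", opts, matches(opts))
	}
}
```

The third input is the value from the resource definition below. After the unit, the pattern requires either the end of the string or whitespace followed by at least one more character, so a single trailing space satisfies neither branch.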

Environment

  • ECK version: 1.6.0

  • Kubernetes information:

    • On premise: 1.20
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:58Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

  • Resource definition:
podTemplate:
      metadata: {}
      spec:
        containers:
        - env:
          - name: ES_JAVA_OPTS
            value: '-Xms1500m -Xmx1500m '
          name: elasticsearch
          resources:
            requests:
              cpu: 1m
              memory: 3Gi

(I only included the podTemplate part of our elasticsearch resource, because I've been able to identify that as the issue)

  • Logs:
{"log.level":"error","@timestamp":"2021-06-09T14:37:34.425Z","log.logger":"resource","message":"Failed to report licensing information","service.version":"1.6.0+8326ca8a","service.type":"eck","ecs.version":"1.4.0","error":"failed to aggregate Elasticsearch memory: cannot extract max jvm heap size from -Xms1500m -Xmx1500m ","errorVerbose":"cannot extract max jvm heap size from -Xms1500m -Xmx1500m \ngithub.com/elastic/cloud-on-k8s/pkg/license.memFromJavaOpts\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:213\ngithub.com/elastic/cloud-on-k8s/pkg/license.containerMemLimits\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:186\ngithub.com/elastic/cloud-on-k8s/pkg/license.Aggregator.aggregateElasticsearchMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:66\ngithub.com/elastic/cloud-on-k8s/pkg/license.Aggregator.AggregateMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:46\ngithub.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Get\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:69\ngithub.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Report\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:58\ngithub.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Start\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:49\ngithub.com/elastic/cloud-on-k8s/cmd/manager.asyncTasks.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/cmd/manager/main.go:612\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371\nfailed to aggregate Elasticsearch memory\ngithub.com/elastic/cloud-on-k8s/pkg/license.Aggregator.aggregateElasticsearchMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:73\ngithub.com/elastic/cloud-on-k8s/pkg/license.Aggregator.AggregateMemory\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/aggregator.go:46\ngithub.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Get\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:69\ngithub.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Report\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:58\ngithub.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Start\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:49\ngithub.com/elastic/cloud-on-k8s/cmd/manager.asyncTasks.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/cmd/manager/main.go:612\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371","error.stack_trace":"github.com/elastic/cloud-on-k8s/pkg/license.ResourceReporter.Start\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/license/reporter.go:51\ngithub.com/elastic/cloud-on-k8s/cmd/manager.asyncTasks.func1\n\t/go/src/github.com/elastic/cloud-on-k8s/cmd/manager/main.go:612"}
@botelastic botelastic bot added the triage label Jun 10, 2021
@chillTschill
Author

chillTschill commented Jun 10, 2021

Hi guys

First of all, I'm not quite sure whether that qualifies as a bug in your eyes. In my eyes it's an ugly error with what seems to be a fairly easy fix. I would propose switching the regex from -Xmx([0-9]+)([gGmMkK]?)(?:\s.+|$) to -Xmx([0-9]+)([gGmMkK]?), as I can't really see the need for the non-capturing group. (Anyone?) I would be happy to provide a PR.

For anyone else running into this issue: as a quick fix, just remove the trailing whitespace at the end of your JAVA_OPTS definition:

- name: ES_JAVA_OPTS
   value: '-Xms1500m -Xmx1500m '

@barkbay
Contributor

barkbay commented Jun 15, 2021

Thanks for the report. The original discussion is here: #2277 (comment)

I think we still want to fail if there are non-whitespace characters immediately following the unit.

Should be fixed by #4571
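
The difference between the two positions in this thread can be sketched as follows. `looseRe` is the simplified pattern proposed in the earlier comment. `strictRe` is a hypothetical variant written for this illustration (the original pattern with `.+` relaxed to `.*`) and is not taken from #4571. The helper `extract` is likewise made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// looseRe is the simplified pattern proposed earlier in this thread.
var looseRe = regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)`)

// strictRe is a hypothetical variant for illustration only: after the unit it
// accepts end of string or whitespace followed by anything (including nothing).
var strictRe = regexp.MustCompile(`-Xmx([0-9]+)([gGmMkK]?)(?:\s.*|$)`)

// extract is a hypothetical helper. It returns the digits and the unit
// captured by re, and ok=false when re does not match javaOpts.
func extract(re *regexp.Regexp, javaOpts string) (size, unit string, ok bool) {
	m := re.FindStringSubmatch(javaOpts)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	for _, opts := range []string{
		"-Xms1500m -Xmx1500m ", // trailing space from the report
		"-Xmx1500mb",           // non-whitespace right after the unit
	} {
		_, _, loose := extract(looseRe, opts)
		_, _, strict := extract(strictRe, opts)
		fmt.Printf("%q loose=%v strict=%v\n", opts, loose, strict)
	}
}
```

Both patterns accept the value with the trailing space. Only the stricter one rejects `-Xmx1500mb`, which the simplified pattern would read as 1500 with unit `m`.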

@barkbay barkbay added the >bug Something isn't working label Jun 15, 2021
@botelastic botelastic bot removed the triage label Jun 15, 2021
@barkbay barkbay added the v1.7.0 label Jun 15, 2021