Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ECS in operator logs #2002

Closed
pebrc opened this issue Oct 16, 2019 · 6 comments
Closed

Support ECS in operator logs #2002

pebrc opened this issue Oct 16, 2019 · 6 comments
Assignees
Labels
discuss We need to figure this out >enhancement Enhancement of existing functionality

Comments

@pebrc
Copy link
Collaborator

pebrc commented Oct 16, 2019

Currently our logging output does not play nicely with Elastic observability tooling in Kibana for example because we don't follow ECS in the JSON logs we produce.

Example: ECK log output not fully parsed in Kibana logs view:
Screenshot 2019-10-16 at 16 55 58

There are two options to solve this:

  1. transformation on ingest to conform to ECS
  2. adjust our logging format to produce ECS compliant JSON

\2. seems easier as we have full control over what we produce and it would simplify ingestion (no transformation needed at ingest)

@pebrc pebrc added >enhancement Enhancement of existing functionality discuss We need to figure this out labels Oct 16, 2019
@anyasabo
Copy link
Contributor

How do things like name and namespace map to ECS? Or does it make sense to map the things we can (looks like level, timestamp, logger, and message at least), and then leave the rest as is?

@pebrc pebrc mentioned this issue Oct 24, 2019
3 tasks
@pebrc
Copy link
Collaborator Author

pebrc commented Nov 12, 2019

How do things like name and namespace map to ECS?

As they don't map to any ECS fields they would just stay in their original form.

@barkbay barkbay self-assigned this Jan 18, 2020
@barkbay
Copy link
Contributor

barkbay commented Jan 20, 2020

Just to share an issue I'm facing:

In case of an error in ECK the following log is generated by the controller runtime:

{
	"log": "{\"log.level\":\"error\",\"@timestamp\":\"2020-01-20T13:53:19.874Z\",\"log.logger\":\"controller-runtime.controller\",\"message\":\"Reconciler error\",\"service.version\":\"1.0.0-bf9563a3\",\"service.type\":\"eck\",\"ecs.version\":\"1.4.0\",\"controller\":\"webhook-certificates-controller\",\"request\":\"/elastic-webhook.k8s.elastic.co\",\"error\":\"validatingwebhookconfigurations.admissionregistration.k8s.io \\\"elastic-webhook.k8s.elastic.co\\\" not found\",\"errorCauses\":[{\"error\":\"validatingwebhookconfigurations.admissionregistration.k8s.io \\\"elastic-webhook.k8s.elastic.co\\\" not found\"}],\"error.stack_trace\":\"github.com/go-logr/zapr.(*zapLogger).Error\\n\\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\\n\\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\\n\\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:152\\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\\n\\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:153\\nk8s.io/apimachinery/pkg/util/wait.Until\\n\\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:88\"}\n",
	"stream": "stderr",
	"time": "2020-01-20T13:53:19.874360282Z"
}

The problem is that there is an error field and IIUC it can't be indexed because error is expected to be a document with ECS:

{"type":"mapper_parsing_exception","reason":"object mapping for [error] tried to parse field [error] as object, but found a concrete value"}

I have tried to use the rename processor to rename it to error.message but with no luck for the moment, I'm not sure that the processors can be applied to an inner field.

I'm still trying to understand how to circumvent this situation...

@barkbay
Copy link
Contributor

barkbay commented Jan 21, 2020

I finally managed to rename to error field to error.message, huge thanks to @andrewkroh for its help !

image

@barkbay
Copy link
Contributor

barkbay commented Jan 21, 2020

I think I will create a first PR to add a minimal set of fields to be compliant with ECS.

But there are some things that could be investigated/improved:

  1. Should we consider a "reconciliation loop" as an event?
  • event.sequence looks like our iteration field (number of times a controller has run its Reconcile method)
  • event.duration looks like our current took
  • event.type could be [ change| creation | deletion ]
  • event.dataset could help to separate the logs from the different controller (elasticsearch-controller, kibana-controller...)
  1. The presence of some fields is inconsistent

It is worth to note that it is not easy to be consistent re. the key/values added to the log.
The most problematic case is when an error is managed by the controller itself (i.e. not by our code) :

if result, err := c.Do.Reconcile(req); err != nil {
   c.Queue.AddRateLimited(req)
   log.Error(err, "Reconciler error", "controller", c.Name, "request", req)
   [...]
   return false
}

This would produce:

{
	"message": "Reconciler error",
	"log.logger": "controller-runtime.controller",
	"log.level": "error",
	"controller": "kibana-association-controller",
	"error.message": "An error message about kibanas.kibana.k8s.elastic.co \"kb-apm-sample\": something bad happened",
	"error.stack_trace": "github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191028221656-72ed19daf4bb/pkg/util/wait/wait.go:88",
	"request": "namespace1/kb-apm-sample"
}

In the above example if a user is filtering on namespace: "namespace1" and kibana_name: "kb-apm-sample" then the error will not be displayed.
We could propose to the controller-runtime project to replace "request": "namespace1/kb-apm-sample" by something like:

log.Error(err, "Reconciler error", "controller", c.Name, "namespace", req.Namespace, "name", req.name)

(we would still have a problem with the "name" field but this could be revisited in our code)
Or add a processor or our own encoder to transform the output.

We also have some inconsistencies in our code, mostly in the "common" packages, for example when a certificate is generated:

log.Info("Creating HTTP internal certificate secret", "namespace", secret.Namespace, "secret_name", secret.Name)

I think it would be nice to add a reference to the owner in order to link this log to the parent resource.

@barkbay
Copy link
Contributor

barkbay commented Jan 27, 2020

Closed in favor of #2467

@barkbay barkbay closed this as completed Jan 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss We need to figure this out >enhancement Enhancement of existing functionality
Projects
None yet
Development

No branches or pull requests

3 participants