While troubleshooting various services that are part of Cortex, I've found that the logs are not easy to parse. Different parts of the services seem to log in different formats, which makes it hard to write a single, reasonable filter for whatever log ingestion method is in use.
As an example, I looked at the distributor and found at least 3 ways this service writes logs (sanitized):
ts=2018-05-31T18:19:27.515467232Z caller=log.go:112 level=error org_id=0 msg="push error" err="rpc error: code = Code(400) desc = sample with repeated timestamp but different value for series container_fs_reads_total{container_name=\"example\", example_cluster=\"Example\", a_cluster=\"Example\", id=\"/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod526740fc_3e0e_11e8_b916_02000a838eab.slice/docker-b83abfed1d197426a06778e7dcd6c0470ac60e77d5f76173793fd0025929d039.scope\", image=\"docker.io/example/example@sha256:261537e58647e604701c97765973e2c4d0627953f382883f83d2d64144803cdc\", instance=\"10.11.143.157:8443\", job=\"kubernetes-apiservers\", name=\"k8s_example_example-1-lqqtq_exampleproject_526740fc-3e0e-11e8-b916-02000a838eab_0\", namespace=\"example\", pod_name=\"example-1-lqqtq\"}; last value: 1089, incoming value: 1"
ts=2018-05-30T21:12:56.673829487Z caller=log.go:112 level=error msg="error getting path" key=collectors/ring err="Unexpected response code: 500"
time="2018-05-30T19:58:18Z" level=warning msg="Is websocket request: false\nPOST /api/prom/push HTTP/1.1\r\nHost: distributor\r\nConnection: close\r\nAccept-Encoding: gzip\r\nConnection: close\r\nContent-Encoding: snappy\r\nContent-Length: 36513\r\nContent-Type: application/x-protobuf\r\nUser-Agent: Go-http-client/1.1\r\nX-Prometheus-Remote-Write-Version: 0.1.0\r\nX-Scope-Orgid: 0\r\n\r\n"
In the first two examples, the various fields are simply added when needed (this is probably a noop). My main concern is the difference between the first two examples and the last one, which appears to come from an entirely different logging mechanism altogether.
This makes it difficult for log-processing systems to parse the logs reliably and to give a consistent experience when viewing them while troubleshooting.
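To illustrate the kind of workaround a log consumer currently needs, here is a minimal sketch of a parser that accepts both shapes shown above and maps them onto one set of keys. The function names `parse_logfmt` and `normalize` are hypothetical. The sketch assumes every line is a sequence of `key=value` pairs and that the only difference that matters is `ts=` versus `time=` for the timestamp. It is not taken from Cortex or from any particular ingestion tool.

```python
import re

# One key=value pair: the value is either a double-quoted string
# (backslash escapes allowed inside) or a bare token without spaces.
_PAIR = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S*))')


def parse_logfmt(line):
    """Split a key=value log line into a dict of strings."""
    fields = {}
    for match in _PAIR.finditer(line):
        key, quoted, bare = match.group(1), match.group(2), match.group(3)
        if quoted is not None:
            # Unescape \" and \\ only; other escapes (e.g. \n) stay literal.
            value = re.sub(r'\\(["\\])', r'\1', quoted)
        else:
            value = bare
        fields[key] = value
    return fields


def normalize(line):
    """Parse a line and expose the timestamp under a single key, "ts"."""
    fields = parse_logfmt(line)
    # Assumption: lines in the third style carry the timestamp as "time".
    if "time" in fields and "ts" not in fields:
        fields["ts"] = fields.pop("time")
    return fields


if __name__ == "__main__":
    sample = ('ts=2018-05-30T21:12:56.673829487Z caller=log.go:112 '
              'level=error msg="error getting path" key=collectors/ring '
              'err="Unexpected response code: 500"')
    print(normalize(sample))
```

Even with a shim like this, the third style has no `caller` field and spells the timestamp differently, so a single consistent logger in the service would still be the cleaner fix.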
Agreed @jtslear! It is a bit of a mess. I have some code that tidies up the worst offender (the request middleware using a different logger from the rest of the code). I'll clean it up now and make a PR.