Difficulty Parsing Logs Consistently #835

Closed

jtslear opened this issue May 31, 2018 · 2 comments
@jtslear commented May 31, 2018

While troubleshooting the various services that make up Cortex, one issue I've found is that the logs are hard to parse consistently. The services appear to log in several different formats, which makes it difficult to build a reasonable logging filter that can be parsed by whatever method is used to ingest the logs.

As an example, I targeted the distributor and found at least three different ways this single service writes logs (sanitized):

ts=2018-05-31T18:19:27.515467232Z caller=log.go:112 level=error org_id=0 msg="push error" err="rpc error: code = Code(400) desc = sample with repeated timestamp but different value for series container_fs_reads_total{container_name=\"example\", example_cluster=\"Example\", a_cluster=\"Example\", id=\"/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod526740fc_3e0e_11e8_b916_02000a838eab.slice/docker-b83abfed1d197426a06778e7dcd6c0470ac60e77d5f76173793fd0025929d039.scope\", image=\"docker.io/example/example@sha256:261537e58647e604701c97765973e2c4d0627953f382883f83d2d64144803cdc\", instance=\"10.11.143.157:8443\", job=\"kubernetes-apiservers\", name=\"k8s_example_example-1-lqqtq_exampleproject_526740fc-3e0e-11e8-b916-02000a838eab_0\", namespace=\"example\", pod_name=\"example-1-lqqtq\"}; last value: 1089, incoming value: 1"
ts=2018-05-30T21:12:56.673829487Z caller=log.go:112 level=error msg="error getting path" key=collectors/ring err="Unexpected response code: 500"
time="2018-05-30T19:58:18Z" level=warning msg="Is websocket request: false\nPOST /api/prom/push HTTP/1.1\r\nHost: distributor\r\nConnection: close\r\nAccept-Encoding: gzip\r\nConnection: close\r\nContent-Encoding: snappy\r\nContent-Length: 36513\r\nContent-Type: application/x-protobuf\r\nUser-Agent: Go-http-client/1.1\r\nX-Prometheus-Remote-Write-Version: 0.1.0\r\nX-Scope-Orgid: 0\r\n\r\n"

In the first two examples, the various fields are inserted as needed (this is probably a non-issue). My main concern is the difference between either of the first two examples and the last one: it appears an entirely different logging mechanism is in use altogether.

This makes it difficult for downstream systems to parse the logs reliably and to provide a consistent experience when viewing them while troubleshooting.
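
For reference, the first two lines above are logfmt, the format a go-kit-style logger produces, while the third looks like logrus's default text formatter. Here is a minimal sketch that reproduces the shape of both outputs, assuming `github.com/go-kit/kit/log` and `github.com/sirupsen/logrus` (the exact packages in use are an assumption on my part):

```go
package main

import (
	"os"

	"github.com/go-kit/kit/log"
	"github.com/go-kit/kit/log/level"
	"github.com/sirupsen/logrus"
)

func main() {
	// go-kit logfmt logger: produces lines shaped like the first two examples,
	// e.g. ts=... caller=... level=error msg="push error" err="..."
	gokit := log.With(
		log.NewLogfmtLogger(os.Stderr),
		"ts", log.DefaultTimestampUTC,
		"caller", log.DefaultCaller,
	)
	level.Error(gokit).Log("msg", "push error", "err", "rpc error: code = Code(400) desc = ...")

	// logrus with its default text formatter: produces lines shaped like the
	// third example, e.g. time="..." level=warning msg="..."
	logrus.SetOutput(os.Stderr)
	logrus.Warnf("Is websocket request: %v", false)
}
```

A log shipper configured for logfmt parses the first two cleanly but chokes on the third, which is exactly the inconsistency described above.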

@tomwilkie
Contributor

Agreed @jtslear! It is a bit of a mess. I have some code that tidies up the worst offender (the request middleware using a different logger than the rest of the code) - I'll clean it up now and make a PR.
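
For illustration only (not necessarily what the PR will do), the rough idea is to have the request-logging middleware write through the same go-kit logfmt logger as the rest of the code, so every line comes out in one format. The middleware below is a hypothetical stand-in, not the actual weaveworks/common middleware:

```go
package main

import (
	"net/http"
	"os"
	"time"

	"github.com/go-kit/kit/log"
	"github.com/go-kit/kit/log/level"
)

// loggingMiddleware is a hypothetical stand-in for the request-logging
// middleware: it logs through the shared go-kit logger rather than its own.
func loggingMiddleware(logger log.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		level.Info(logger).Log(
			"msg", "request handled",
			"method", r.Method,
			"path", r.URL.Path,
			"duration", time.Since(start),
		)
	})
}

func main() {
	// A single logfmt logger shared by the handlers and the middleware,
	// so every log line has the same ts=... caller=... level=... shape.
	logger := log.With(
		log.NewLogfmtLogger(os.Stderr),
		"ts", log.DefaultTimestampUTC,
		"caller", log.DefaultCaller,
	)

	pushHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		level.Error(logger).Log("msg", "push error", "err", "example error")
		w.WriteHeader(http.StatusBadRequest)
	})

	http.ListenAndServe(":8080", loggingMiddleware(logger, pushHandler))
}
```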

@tomwilkie
Contributor

Took a little longer, but here it is: weaveworks/common#96
