-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thanos compactor filling the disk with stack traces on Azure #2565
Comments
After some investigation it's possible to change the log level in thanos/pkg/objstore/azure/helpers.go Line 34 in c0f2dbf
but this will disable logs lower than fatal to every call we make.
After reading the code better the 404 code is being left out in the if but for some reason it's still being treated as an error. |
Are you talking about the code of Thanos or the Azure's library? I would vote in favor of making |
cc @vglafirov do you have any opinion on this? |
The first part of disabling logs was for thanos. The 404 being treated as an error is on the azure library side. I wouldn't mind disabling the logs as most of the time they don't give more information than thanos does. Maybe only activate the logs on the azure library in debug mode? |
Yep, I think that would be the best solution here with how the library works. We could do a similar thing to what s3 has: https://github.com/thanos-io/thanos/pull/937/files#diff-23ce6f3804b08b263e06e0c152bf4ee5R54-R55. IMHO adding another log level setting would be a bit too much since the log levels there do not match up with what Thanos has: https://github.com/thanos-io/thanos/blob/master/cmd/thanos/main.go#L50-L51 vs. https://github.com/Azure/azure-pipeline-go/blob/master/pipeline/core.go#L59-L80 and inevitably confusion would arise. And we'd have two options for the same thing, essentially :/ What are your thoughts on this? |
Yea, I also think that adding another flag would create confusion and would not be necessary. I would say that maybe only activate the logs on the library when thanos log level is in debug. Not sure what level should be applied to the framework tho. Maybe also debug? I think that is what makes more sense. I just think that, unless you really want to check whats happening in the library due to some strange behaviour, the logs lines are irrelevant as thanos already gives us that information. |
Hello 👋 Looks like there was no activity on this issue for last 30 days. |
Bumping the issue. Yes, this is still happening. We also discovered that even when changing the log level this happens because of this line https://github.com/Azure/azure-storage-blob-go/blob/master/azblob/zc_policy_request_log.go#L76 The requests by thanos seem to be marked with |
Hello 👋 Looks like there was no activity on this issue for last 30 days. |
Still happening |
Hello 👋 Looks like there was no activity on this issue for last 30 days. |
Any ideas if this is something to improve on Azure invocation on Thanos side or in Azure lib itself? (: Help wanted! |
I didn't investigate deep enough to check what the true issue was but it seemed like the Azure lib was not categorizing the error in a proper way (as per the links above 404 should not be logged). I tried to reduce the log level to None but still, the lib forces the logs to be written. With that said I think the solution is either to check why the error is not being categorized correctly or find a way to drop the logs completely... I'll try to investigate this further. Still didn't get the time to pick up on this 😞 |
Hey guys, I am also seeing this in the thanos compactor logs. Anything I can do to help? 😄 |
Hello 👋 Looks like there was no activity on this issue for the last two months. |
Ever since the introduction of the deletion marker, Thanos compact has been filling the logs with Azure blob storage exceptions. As discussed in #2565, the error logging should be handled by Thanos itself, so this patch turns off standard logging from the Azure library. However, [due to the way logging is set up in the library][1], it will always log errors to syslog, regardless of how `ShouldLog` is configured. As such, and until that can be a configurable behavior, care should be taken to handle that from the syslog side as well. [1]: https://github.com/Azure/azure-storage-blob-go/blob/48358e1de5110852097ebbc11c53581d64d47300/azblob/zc_policy_request_log.go#L100-L102 Signed-off-by: Pedro Araujo <phcrva@gmail.com>
@joaosilva15 Could you check if this has fixed by #3331? |
Please be aware that, due to |
I think I found the issue in the Azure blob lib, but also found a way to disable |
Thanos, Prometheus and Golang version used:
Thanos: 012.2
Go: 1.13.6
Object Storage Provider:
Azure
What happened:
After the upgrade to the newest version of thanos compactor the logging has been filled with stack traces from the azure sdk.
The error states that the deletion-mark.json cannot be found and that the api returned 404 (as expected as the blocks are not to be deleted)
What you expected to happen:
No stack traces on the logs as the error is expected.
How to reproduce it (as minimally and precisely as possible):
Running the compactor against the blob storage is enough in our case.
Full logs to relevant components:
Anything else we need to know:
I understand that this is not a thanos problem as the thanos is handling the error correctly and performing the correct operation. Just wondering if someone is having the same issue and if during the process of developing the support to azure there was the need to change the sdk log level and if so how to do it. I ran through the code and didn't find the option to do it but would be more than happy to help with a little guidance.
The text was updated successfully, but these errors were encountered: