datadog_logs sink incorrectly moves the host attribute into the hostname attribute #17147
Comments
Hello, indeed it looks like we overlooked the impact this change would have on users; we failed to mention this change in behavior in the v0.28.0 upgrade guide (where this change was first introduced). I will open a PR to add this note to the upgrade guide. However, as to correctness, the change was made in order to adhere to the Datadog Agent spec (https://docs.datadoghq.com/api/latest/logs/), making it in line with the requirements for this source.
Hello! I believe the linked API documentation is by no means prescriptive as to how events should be submitted to Datadog. The schema you linked seems incomplete; I see it as a valid example, but not as a rigid prescription. The six reserved attributes tell a different story. I did run a test submitting an event to the intake without the `hostname` attribute. The HTTP response was 202, and I found the event in the log explorer shortly thereafter.

The forced attribute remapping introduced in 0.28 makes for a real regression in our environment, because we lose EC2 metadata in addition to losing the information previously contained in the `hostname` attribute. Should you keep the stance that the way things are since 0.28 is desirable, would you agree on providing an opt-out from this forced attribute renaming?
Sorry for the delay here, the issue got dropped when we were rotating our GH responder. I spoke with @neuronull today and we're in agreement about reverting this change. I'm not sure yet whether we'll cut a patch release or backport this. Additionally, I'll open an issue to track collaborating with the logs intake team at Datadog to ensure we have a better contract and don't cause these sorts of regressions going forward.
Hi @mrzor! We had some discussion today about this issue and decided we feel that the new behavior is more consistent with other sinks (and internally consistent). I realize it causes suboptimal behavior in your case, though, where you don't want the existing `hostname` attribute overwritten.
Hi Jesse! Thanks for the update and the re-ignited activity on this topic. I'll do my best to highlight a number of reasons why I think that a rollback is the way to go, alongside some suggestions. The reference to the new log schemas in the documentation remains quite unclear to me, so I find it hard to articulate something useful in that context. I can, however, attempt to answer the direct questions you put forward.
This is half of the issue. The other half is that the original key is moved and not duplicated. So not only is something overwritten, but a key-value pair also gets mauled (in my case, the `host` pair).
If my understanding of the guideline is correct, having expected locations for anything in the first place seems in contradiction with "Vector does not want to be in the business of maintaining log schemas. An exercise that has proven precarious with the proliferation of log schema standards."

An exception can be made for the timestamp field, but this is also a can of worms: timezone issues and clock drift are the issues we run into most often. While most (all?) log systems require timestamps to be meaningful, should Vector be opinionated in that area? Why wouldn't it be left to configuration instead of default behavior? This is the position the guideline states, if I understand it correctly. Anecdotally, I believe it is valid to send non-timestamped logs to Datadog, in which case they'll be timestamped at intake time. This also happens when the timestamp is outright invalid. A few classes of pathological timestamps (way in the future or way in the past) result in rejection.
When a specific sink adds requirements on the log payload, such as a mandatory `host_key`, then perhaps the existing schema infrastructure can be leveraged to produce warnings or even configuration errors, forcing the user to produce a configuration that respects such requirements. Warnings feel better, because they empower the user to have the last word, which feels alright to me. For instance, let's say you have a UnicornLogs sink that requires the `rainbow` attribute to be set. Maybe I'm piping it to something that implements UnicornLogs without such a requirement, in which case normalizing it for me, or forcing me to add it, is actively adverse to the intent.

In the context of this specific issue, it feels like the unexpected behavior change stems from a good intention, namely to conform to Datadog's log intake specifications. However, the API docs have been taken as normative where they're just advisory; I'm confident you will find someone internally who will confirm this. If they are indeed advisory, then maybe the Vector Datadog sink should limit its actions to advice as well? That would keep it in tune with the service it represents.
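To make the "user has the last word" idea concrete, here is a minimal configuration sketch built around the comment's hypothetical UnicornLogs sink and its required `rainbow` attribute. All component names are placeholders, and a stock `console` sink stands in for the imaginary one; the point is only that the pipeline author states the mapping explicitly instead of the sink rewriting attributes behind the scenes.

```toml
# Illustrative sketch only: placeholder names, console sink standing in for
# the hypothetical UnicornLogs sink described above.
[sources.app_logs]
type = "demo_logs"
format = "json"

[transforms.set_rainbow]
type = "remap"
inputs = ["app_logs"]
source = '''
# The user decides how the sink's expectation is satisfied.
if !exists(.rainbow) {
  .rainbow = "double"
}
'''

[sinks.unicorns]
# In a real config this would be whichever sink carries the requirement;
# the sink itself does not rename or move anything.
type = "console"
inputs = ["set_rainbow"]
encoding.codec = "json"
```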
Just wanted to drop a quick update, without any actual decision from our end, @mrzor: according to the team that owns the API, the documented body (with […])… I'm writing up some thoughts and proposals to discuss with the team in the near future.
Dropping some additional notes as I do research here on the OpenTelemetry exporter. They're following a similar pattern of looking for the usual attributes and renaming them to what the API expects (they also rename to `hostname`).
Hello! We are biting the bullet and going along with the change; being stuck on 0.27 is painful. We like our software up to date :) Here's what we're rolling with: it keeps the AWS integration working, and it preserves the data we used to store in the affected attributes.
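The commenter's actual snippet is not quoted above. As an illustration only, a remap of roughly the following shape, assuming the event layout described in this issue (IP in `host`, EC2 instance ID in `hostname`), would keep both values while letting the 0.28 sink behavior place the instance ID into `hostname` at intake. All names are placeholders, and this is not necessarily the configuration the commenter used.

```toml
# Sketch only, not the commenter's actual config. Assumes events arrive with
# the IP in `host` and the EC2 instance id in `hostname`.
[transforms.swap_host_attrs]
type = "remap"
inputs = ["ec2_logs"]   # placeholder source name
source = '''
.host_ip = del(.host)   # park the IP under a new attribute
.host = del(.hostname)  # the sink renames this into `hostname`, so the
                        # instance id reaches the intake where Datadog's
                        # JSON preprocessing expects it
'''

[sinks.datadog]
type = "datadog_logs"
inputs = ["swap_host_attrs"]
default_api_key = "${DATADOG_API_KEY}"
```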
|
Thanks for the update @mrzor! We are still going back and forth on the expected behavior here 😓
Recapping some info for future readers. Today we do something similar to the OpenTelemetry exporter. The logs API itself will generally accept any object and render it best-effort for the frontend, and users can configure their JSON preprocessing to target specific fields for Datadog attributes. Longer term, we could force the handling of this translation via semantic meaning.
Problem
After upgrading to 0.28.2, we noticed our EC2 metadata was not reconciled after Datadog intake anymore: the `hostname` attribute contained the host IP (which should be contained in the `host` attribute) and the `host` attribute was missing. By adding a mitmproxy between Vector and the Datadog intake, we confirmed that the logs leaving Vector exhibited this defect, and so it was unrelated to the JSON attribute preprocessing that happens after intake.

Configuration
In our setup, events have both `host` and `hostname` attributes set before reaching Vector. On Datadog's side, the JSON attribute preprocessing for `host` is set to `hostname`. The host IP address is stored in the `host` attribute, and the instance ID is stored in the `hostname` attribute. Having the instance ID recognized as `host` by Datadog is central to having EC2 metadata reconciled past the Datadog intake, but given our setup, this requires it to be in the `hostname` attribute. A bit convoluted, but that's how things have been set up for a while.

Version
0.28.2
Possible solutions
We believe the change from `host` to `hostname` in faa96f8 had this unforeseen consequence. Whereas Vector previously "renamed" the `host` attribute to `host` (effectively a no-op), leaving the `hostname` attribute intact, the change causes our `hostname` attribute to be overwritten by the `host` attribute, and the `host` attribute to be dropped. We believe the best course of action is for this part of the change to be reverted.

We could alter our configuration in a number of ways. Notably, we could remap the `hostname` attribute to `hostname_` and use Datadog's JSON preprocessing facilities to achieve the EC2 metadata reconciliation. We would also need to set the `host_key` to something other than `host` to dodge the unwanted renaming.
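A sketch of what the configuration changes described above might look like. Component names are placeholders, and the `log_schema.host_key` override is an assumption about which setting drives the renaming; Datadog's JSON preprocessing would then need to target `hostname_`.

```toml
# Sketch only. Moves the instance id to `hostname_` so Datadog-side JSON
# preprocessing can pick it up, and points the global host key elsewhere so
# the datadog_logs sink no longer renames `host`.
[log_schema]
host_key = "vector_host"   # assumption: steers the sink's rename away from `host`

[transforms.rename_hostname]
type = "remap"
inputs = ["ec2_logs"]      # placeholder source name
source = '''
.hostname_ = del(.hostname)   # to be targeted by Datadog JSON preprocessing
'''

[sinks.datadog]
type = "datadog_logs"
inputs = ["rename_hostname"]
default_api_key = "${DATADOG_API_KEY}"
```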