-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standardize on DiagnosticSource/OpenTelemetry for database tracing #22336
Comments
To document a bit more about our discussion on Friday, This rationale includes
|
This seems counter to what I thought we'd previously discussed. In particular, DiagnosticSource only supports logging objects, and when we discussed the downsides of that from a performance perspective (e.g. allocations on each call when you have value types or multiple objects to log in a call), it was argued that these extra allocations were ok because DiagnosticSource usage would be very chunky, wouldn't be hot path, etc. But now it sounds like you're saying the default should be to use DiagnosticSource, and so rather than having strongly-typed logging methods that are allocation-free, we're pushing people by default to use these allocating, not-strongly-typed methods? I don't get it. Now, for this particular case, that probably doesn't matter, as IIRC, the tracing that's done just uses strings, which isn't particularly efficient to begin with. But in general this seems like the wrong direction to push people. And I thought one of the advantages to the DiagnosticSource design was that it did dump everything to an EventListener, so it wasn't supposed to matter whether the source was an EventSource or a DiagnosticSource, and you could pick the one best suited to the component, i.e. EventSource for most high-frequency and general logging, and DiagnosticSource for chunkier objects that you wanted to share out less frequently. |
The advice between whether to use DiangosticSource or EventSource is indeed not very clear cut some cases (when the messages are high volume and/or EventListeners/ETW are your main scenario. Indeed if you goal is to get messages from your instrumented code to ETW, using DiagnosticSource just ads overhead and complexity. I tried to indicate this in my previous comment:
So I think most of the guidance is consistent: we both agree EventSource should be used in the high volume case. We also agree that DiagnosticSource is needed if there is a true in-proc case where 'chunky' things need to be passed. The only issue is what is the 'default' (when the volume is not high and you may or may not need the in-proc capabilities). I must admit I am conflicted about this, and at this second-order details matter, but recommending DiagnosticSource is not unreasonable. Uniformity is valuable, especially of messages likely to be used together (e.g. from strongly related components). The using DiagnosticSource also ALLOWs you to be chunky (pass unserializable things) even if you don't need it initially. Reasonable people can disagree here, and I am frankly not happy with the fact that the guidance is not as clear cut. Still, for me, at least, was enough to tip the balance in this particular case. If you disagree, by all means do so (that is why I added you to this issue). |
You might want to do bar check for 2.0. We are way deep in Escrow. |
@vancem @stephentoub @saurabh500 I looked at the code in more detail and apparently I was completely wrong with my estimates. It seems that the number of calls to members of I think it might be too late to change this, unless we find a way to just change the implementation of @karelz thanks. I will follow up. |
For the “simple fix,” the number of call sites doesn’t really matter: the whole change could be done in the DataCommonEventSource type to keep the change localized and minimal, e.g. I'm still not convinced that's something we should do; I'm still not clear on what value such a switch would actually bring. And it adds cost, e.g. the allocations here stephentoub/corefx@22bdd08#diff-d2371fa205738bbb9e02dbcd7f75aac5R80, here stephentoub/corefx@22bdd08#diff-d2371fa205738bbb9e02dbcd7f75aac5R66, and here stephentoub/corefx@22bdd08#diff-d2371fa205738bbb9e02dbcd7f75aac5R20. That said, the current EventSource usage is a port from the previous tracing that existed in the desktop version of the component, and doing it well would involve further changes even for EventSource. In particular, right now it's using the same basic approach that the previous tracing was: all Trace calls result in a string being generated and that string being traced. Doing this more efficiently would instead involve a multitude of different WriteEvent overloads being used to avoid creating strings from the data being traced. I at least invested enough here when I ported this so that allocations and the associated string creation work only happens if logging is enabled, but we could take it further. |
I think we should do nothing. The current logging is string based, and frankly need to be reworked. The only question is whether this logging is useful enough as it is to leave in (and thus take a compatibility burden on). |
Thanks @vancem for looking at this. I am leaning towards disabling this code to avoid the compatibility burden and get a chance to do a design with more time. I assume we can just remove (or #ifdef) code in @saurabh500 @stephentoub thoughts? |
Based on my experience dealing with ADO.Net I know that this tracing is used when Microsoft engineers get involved with the CSS to collect traces to diagnose issues with Applications running in the field. The traces have been used by the engineers to diagnose issues in ADO.Net. All the tracing from SqlClient was stripped out and never re-instated and we have tough time figuring out what issues the customers are running into. |
IF the goal of this logging is to simply allow ad-hoc diagnosis, then just leaving it in IN THEORY is OK, since that use does not really have a compatibility problem. The problem is that simply because it is there, people COULD take a dependency on it and thus cause issue. Options include
Obviously (1) is the simplest solution, and the time window may not be large (months), and we are talking about .NET Core, which is easier to control the breakage (it does not get updated 'under' you by windows update). So just leaving it is reasonable |
Thanks @saurabh500 and @vancem Based on this argument, I am ok with (1), e.g. saying that we are leaving what we have in place and we can revisit soon after 2.0 when we build an all up plan for ADO.NET, SqlClient and other providers. When the time comes, we can then decide if it is necessary and ok to break. So, I am punting this. |
As per the last comment, this is something we would like to revisit when/if we build common diagnostics infrastructure for ADO.NET providers. We are not planning to do this for 2.1, so moving to backlog. |
Small note after looking at this again: a quick grep in dotnet/runtime's System.Data.* reveals that all current EventSources seem to be emitted from DataSet/DataTable/DataRow and related types; in other words, they are not emitted from any actual provider-related code. |
Take the Activity API into account as well when looking into this (e.g. open-telemetry/opentelemetry-dotnet#675). |
@roji should we mark this as Cut or remove this from the .NET 6 project? Trying to clean things up a bit. |
@mairaw thanks, yeah. I'm both removing this from the project and closing the issue as I no longer think it's very relevant. Since this issue was opened, OpenTelemetry has emerged as the tracing standard, and has an experimental spec for database client semantic conventions. ADO.NET providers should simply implement that. |
Currently we have around 27
EventSource
trace points inSystem.Data.Common
.On the other hand, SqlClient has been instrumented with
DiagnosticSource
.I talked to @vancem today and we came to the conclusion that there may be enough value in standardizing on
DiagnosticSource
before we release 2.0. This would make the story more consistent for consumers of ADO.NET events (we are planning to add moreDiagnosticSource
instrumentation and make it easier for ADO.NET providers to add it) and once we ship theEventSource
instrumentation it is unlikely that we will be able to remove it.cc @saurabh500
The text was updated successfully, but these errors were encountered: