Standardize on DiagnosticSource/OpenTelemetry for database tracing #22336

divega · 2017-06-17T00:46:20Z

Currently we have around 27 EventSource trace points in System.Data.Common.

On the other hand, SqlClient has been instrumented with DiagnosticSource.

I talked to @vancem today and we came to the conclusion that there may be enough value in standardizing on DiagnosticSource before we release 2.0. This would make the story more consistent for consumers of ADO.NET events (we are planning to add more DiagnosticSource instrumentation and make it easier for ADO.NET providers to add it) and once we ship the EventSource instrumentation it is unlikely that we will be able to remove it.

cc @saurabh500

The text was updated successfully, but these errors were encountered:

vancem · 2017-06-19T16:36:40Z

To document a bit more about our discussion on Friday, This rationale includes

It is good to be uniform at least for components that are likely to be used together in logging scenarios. Having SQL Client and System.Data.Common having different logging mechanisms is not ideal.
If we want to change, we should do it now. My understanding is that there is no dependency on the existing logging (yet), but we will not be able to remove the EventSource logging after 2.0 ships.
The change should be modest in terms of lines of code. (since the logging sites are already wrapped in helpers, it should be a 1 day kind of a change (which is also very safe).
IF there are no issues with dependency (that is you live in the new Nuget package world), we generally recommend people instrument with DiagnosticSource because we have a bridge from DiagnosticSource to EventSource. (thus you get both), There are some potential downsides (Going through the bridge is extra overhead, and getting at the logging from EventSource is not as natural to EventListeners), so it is not a 'slam dunk' if high efficiency, and if the EventSource scenario (out of process), is the most important, but we do generally advice people to only instrument with DiagnosticSource. Thus moving System.Data.Common to DiagnosticSource is reasonable (assuming no dependency problems).

@stephentoub

stephentoub · 2017-06-19T17:14:16Z

we generally recommend people instrument with DiagnosticSource because we have a bridge from DiagnosticSource to EventSource.

This seems counter to what I thought we'd previously discussed. In particular, DiagnosticSource only supports logging objects, and when we discussed the downsides of that from a performance perspective (e.g. allocations on each call when you have value types or multiple objects to log in a call), it was argued that these extra allocations were ok because DiagnosticSource usage would be very chunky, wouldn't be hot path, etc. But now it sounds like you're saying the default should be to use DiagnosticSource, and so rather than having strongly-typed logging methods that are allocation-free, we're pushing people by default to use these allocating, not-strongly-typed methods? I don't get it.

Now, for this particular case, that probably doesn't matter, as IIRC, the tracing that's done just uses strings, which isn't particularly efficient to begin with. But in general this seems like the wrong direction to push people. And I thought one of the advantages to the DiagnosticSource design was that it did dump everything to an EventListener, so it wasn't supposed to matter whether the source was an EventSource or a DiagnosticSource, and you could pick the one best suited to the component, i.e. EventSource for most high-frequency and general logging, and DiagnosticSource for chunkier objects that you wanted to share out less frequently.

vancem · 2017-06-19T19:10:50Z

The advice between whether to use DiangosticSource or EventSource is indeed not very clear cut some cases (when the messages are high volume and/or EventListeners/ETW are your main scenario. Indeed if you goal is to get messages from your instrumented code to ETW, using DiagnosticSource just ads overhead and complexity. I tried to indicate this in my previous comment:

There are some potential downsides (Going through the bridge is extra overhead, and getting at the logging from EventSource is not as natural to EventListeners), so it is not a 'slam dunk' if high efficiency, and if the EventSource scenario (out of process), is the most important,

So I think most of the guidance is consistent: we both agree EventSource should be used in the high volume case. We also agree that DiagnosticSource is needed if there is a true in-proc case where 'chunky' things need to be passed. The only issue is what is the 'default' (when the volume is not high and you may or may not need the in-proc capabilities).

I must admit I am conflicted about this, and at this second-order details matter, but recommending DiagnosticSource is not unreasonable. Uniformity is valuable, especially of messages likely to be used together (e.g. from strongly related components). The using DiagnosticSource also ALLOWs you to be chunky (pass unserializable things) even if you don't need it initially.

Reasonable people can disagree here, and I am frankly not happy with the fact that the guidance is not as clear cut. Still, for me, at least, was enough to tip the balance in this particular case.

If you disagree, by all means do so (that is why I added you to this issue).

karelz · 2017-06-20T00:12:30Z

You might want to do bar check for 2.0. We are way deep in Escrow.

divega · 2017-06-20T01:07:55Z

@vancem @stephentoub @saurabh500 I looked at the code in more detail and apparently I was completely wrong with my estimates. It seems that the number of calls to members of DataCommonEventSource in System.Data.Common is actually closer to 300 . Sorry for the misleading information I provided previously.

I think it might be too late to change this, unless we find a way to just change the implementation of DataCommonEventSource.

@karelz thanks. I will follow up.

stephentoub · 2017-06-20T14:54:59Z

I looked at the code in more detail and apparently I was completely wrong with my estimates. It seems that the number of calls to members of DataCommonEventSource in System.Data.Common is actually closer to 300

For the “simple fix,” the number of call sites doesn’t really matter: the whole change could be done in the DataCommonEventSource type to keep the change localized and minimal, e.g.
stephentoub/corefx@22bdd08
This isn't ideal, but it would at least move from using EventSource to DiagnosticSource. Of course, if the goal is to trace out more of the actual objects being used, then yeah, the change would be much more invasive across the hundreds of sites currently tracing, and maybe more that aren't yet.

I'm still not convinced that's something we should do; I'm still not clear on what value such a switch would actually bring. And it adds cost, e.g. the allocations here stephentoub/corefx@22bdd08#diff-d2371fa205738bbb9e02dbcd7f75aac5R80, here stephentoub/corefx@22bdd08#diff-d2371fa205738bbb9e02dbcd7f75aac5R66, and here stephentoub/corefx@22bdd08#diff-d2371fa205738bbb9e02dbcd7f75aac5R20.

That said, the current EventSource usage is a port from the previous tracing that existed in the desktop version of the component, and doing it well would involve further changes even for EventSource. In particular, right now it's using the same basic approach that the previous tracing was: all Trace calls result in a string being generated and that string being traced. Doing this more efficiently would instead involve a multitude of different WriteEvent overloads being used to avoid creating strings from the data being traced. I at least invested enough here when I ported this so that allocations and the associated string creation work only happens if logging is enabled, but we could take it further.

vancem · 2017-06-21T15:26:46Z

I think we should do nothing. The current logging is string based, and frankly need to be reworked. The only question is whether this logging is useful enough as it is to leave in (and thus take a compatibility burden on).

divega · 2017-06-21T17:54:29Z

Thanks @vancem for looking at this. I am leaning towards disabling this code to avoid the compatibility burden and get a chance to do a design with more time. I assume we can just remove (or #ifdef) code in DataCommonEventSource.

@saurabh500 @stephentoub thoughts?

saurabh500 · 2017-06-21T18:22:15Z

Based on my experience dealing with ADO.Net I know that this tracing is used when Microsoft engineers get involved with the CSS to collect traces to diagnose issues with Applications running in the field. The traces have been used by the engineers to diagnose issues in ADO.Net.
I haven't heard of any customers which consume these traces directly to bother figuring out what is wrong with the library.

All the tracing from SqlClient was stripped out and never re-instated and we have tough time figuring out what issues the customers are running into.
I am in favor of going with the current state of tracing and not changing anything at all so that we have a way to diagnose any issues with the DataSet and DataTable which is where this tracing would really be useful. It's better than having nothing.

vancem · 2017-06-21T23:01:14Z

IF the goal of this logging is to simply allow ad-hoc diagnosis, then just leaving it in IN THEORY is OK, since that use does not really have a compatibility problem. The problem is that simply because it is there, people COULD take a dependency on it and thus cause issue.

Options include

Simply breaking it when the time comes (assume no one has taken a dependency on it).
Log a disclaimer that this logging may be changed/deleted (helps ease our minds)
Put the logging behind a Environment variable like ADO_NET_LOGGING_EXPERIMENTAL that makes it unlikely that a programmatic dependency would be taken.

Obviously (1) is the simplest solution, and the time window may not be large (months), and we are talking about .NET Core, which is easier to control the breakage (it does not get updated 'under' you by windows update). So just leaving it is reasonable

divega · 2017-06-22T00:21:56Z

Thanks @saurabh500 and @vancem Based on this argument, I am ok with (1), e.g. saying that we are leaving what we have in place and we can revisit soon after 2.0 when we build an all up plan for ADO.NET, SqlClient and other providers. When the time comes, we can then decide if it is necessary and ok to break. So, I am punting this.

divega · 2018-01-30T17:57:44Z

As per the last comment, this is something we would like to revisit when/if we build common diagnostics infrastructure for ADO.NET providers. We are not planning to do this for 2.1, so moving to backlog.

roji · 2020-02-05T20:59:19Z

Small note after looking at this again: a quick grep in dotnet/runtime's System.Data.* reveals that all current EventSources seem to be emitted from DataSet/DataTable/DataRow and related types; in other words, they are not emitted from any actual provider-related code.

roji · 2021-02-17T07:15:56Z

Take the Activity API into account as well when looking into this (e.g. open-telemetry/opentelemetry-dotnet#675).

mairaw · 2023-05-26T20:19:35Z

@roji should we mark this as Cut or remove this from the .NET 6 project? Trying to clean things up a bit.

roji · 2023-05-30T18:00:56Z

@mairaw thanks, yeah. I'm both removing this from the project and closing the issue as I no longer think it's very relevant. Since this issue was opened, OpenTelemetry has emerged as the tracing standard, and has an experimental spec for database client semantic conventions. ADO.NET providers should simply implement that.

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020

ajcvickers removed the untriaged New issue has not been triaged by the area owner label Jun 23, 2020

ajcvickers modified the milestones: Future, 6.0.0 Dec 7, 2020

ajcvickers assigned roji Dec 7, 2020

roji added the User Story A single user-facing feature. Can be grouped under an epic. label Feb 10, 2021

roji changed the title ~~Standarize on DiagnosticSource for tracing in System.Data.* components~~ Standarize on DiagnosticSource/OpenTelemetry for tracing in System.Data.* components Feb 10, 2021

roji mentioned this issue Feb 10, 2021

.NET applications are observable and easily diagnosable dotnet/core#5929

Closed

18 tasks

roji changed the title ~~Standarize on DiagnosticSource/OpenTelemetry for tracing in System.Data.* components~~ Standarize on DiagnosticSource/OpenTelemetry for tracing in System.Data components Feb 10, 2021

roji changed the title ~~Standarize on DiagnosticSource/OpenTelemetry for tracing in System.Data components~~ Standardize on DiagnosticSource/OpenTelemetry for tracing in System.Data components Feb 10, 2021

roji changed the title ~~Standardize on DiagnosticSource/OpenTelemetry for tracing in System.Data components~~ Standardize on DiagnosticSource/OpenTelemetry for database tracing Feb 10, 2021

roji mentioned this issue Feb 12, 2021

Continue evolving System.Data and the database providers #48253

Closed

6 tasks

roji mentioned this issue Feb 17, 2021

Why isn't the Activity API used to track diagnostic events? dotnet/SqlClient#917

Closed

roji mentioned this issue Feb 17, 2021

Why isn't the Activity API used to track diagnostic events? dotnet/efcore#24171

Closed

ajcvickers modified the milestones: 6.0.0, 7.0.0 Jul 14, 2021

ajcvickers modified the milestones: 7.0.0, Future Aug 8, 2022

roji closed this as not planned Won't fix, can't repro, duplicate, stale May 30, 2023

roji removed this from the Future milestone May 30, 2023

ghost locked as resolved and limited conversation to collaborators Jun 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Standardize on DiagnosticSource/OpenTelemetry for database tracing #22336

Standardize on DiagnosticSource/OpenTelemetry for database tracing #22336

divega commented Jun 17, 2017

vancem commented Jun 19, 2017 •

edited by divega

Loading

stephentoub commented Jun 19, 2017 •

edited

Loading

vancem commented Jun 19, 2017

karelz commented Jun 20, 2017

divega commented Jun 20, 2017 •

edited

Loading

stephentoub commented Jun 20, 2017 •

edited

Loading

vancem commented Jun 21, 2017

divega commented Jun 21, 2017

saurabh500 commented Jun 21, 2017

vancem commented Jun 21, 2017

divega commented Jun 22, 2017

divega commented Jan 30, 2018

roji commented Feb 5, 2020

roji commented Feb 17, 2021

mairaw commented May 26, 2023

roji commented May 30, 2023

Standardize on DiagnosticSource/OpenTelemetry for database tracing #22336

Standardize on DiagnosticSource/OpenTelemetry for database tracing #22336

Comments

divega commented Jun 17, 2017

vancem commented Jun 19, 2017 • edited by divega Loading

stephentoub commented Jun 19, 2017 • edited Loading

vancem commented Jun 19, 2017

karelz commented Jun 20, 2017

divega commented Jun 20, 2017 • edited Loading

stephentoub commented Jun 20, 2017 • edited Loading

vancem commented Jun 21, 2017

divega commented Jun 21, 2017

saurabh500 commented Jun 21, 2017

vancem commented Jun 21, 2017

divega commented Jun 22, 2017

divega commented Jan 30, 2018

roji commented Feb 5, 2020

roji commented Feb 17, 2021

mairaw commented May 26, 2023

roji commented May 30, 2023

vancem commented Jun 19, 2017 •

edited by divega

Loading

stephentoub commented Jun 19, 2017 •

edited

Loading

divega commented Jun 20, 2017 •

edited

Loading

stephentoub commented Jun 20, 2017 •

edited

Loading