DD_DBM_PROPAGATION_MODE = full on Azure App Service not working #6044

Open
jeremytbrun opened this issue Sep 17, 2024 · 20 comments

Comments

@jeremytbrun

Describe the bug
I have updated the Azure App Service extension to the latest version, 3.3.0, in order to try full DD_DBM_PROPAGATION_MODE for SQL Server. Previously I was using service mode for DD_DBM_PROPAGATION_MODE and it was working fine. Now I'm getting no propagation data whatsoever. I noticed that upon startup the dotnet-tracer-loader-w3wp-.log is getting spammed with the same messages over and over again. See the attached log.
dotnet-tracer-loader-w3wp-6096.log

@jeremytbrun
Author

These dotnet-tracer-loader log files continue to pile up.
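For anyone trying to quantify the same symptom, here is a minimal sketch that counts the loader log files in a directory and lists the most repeated lines. The directory path is whatever your App Service exposes (not stated in this thread), and `summarize_loader_logs` and its `normalize` parameter are hypothetical names for illustration, not part of the Datadog tooling.

```python
from collections import Counter
from pathlib import Path


def summarize_loader_logs(log_dir, pattern="dotnet-tracer-loader-*.log",
                          top=5, normalize=str.strip):
    """Return (file_count, most_common_lines) for loader logs in log_dir.

    `normalize` is applied to every line before counting. The default only
    trims whitespace; pass your own callable to drop a timestamp prefix.
    """
    files = sorted(Path(log_dir).glob(pattern))
    counts = Counter()
    for path in files:
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                key = normalize(line)
                if key:  # skip blank lines
                    counts[key] += 1
    return len(files), counts.most_common(top)
```

If the log lines start with a timestamp, identical messages will not group together under the default `normalize`; supplying a function that strips that prefix is the intended escape hatch.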

@jeremytbrun
Author

I just did a fresh install of the extension and even with DD_DBM_PROPAGATION_MODE set to "service" with version 3.3.0 of the extension I get the same thing.

@bouwkast
Contributor

Hi @jeremytbrun

We are looking into this now. As a first step, could you check whether setting DD_DBM_PROPAGATION_MODE to disabled resolves the issue? (Obviously that would only be a temporary workaround for the log spam.)

Based on those logs, I'm not sure the Tracer is even initializing.

@jeremytbrun
Author

Yes, I'll try this now. Stand by.

@jeremytbrun
Author

jeremytbrun commented Sep 17, 2024

Just did a clean install of v3.3.0 with DD_DBM_PROPAGATION_MODE set to disabled and I'm getting the same behavior. Confirmed that since I first updated from v3.2.0 to v3.3.0 I am not getting ANY trace data within Datadog from this service.

Everything worked fine with v3.2.0 and DD_DBM_PROPAGATION_MODE set to service.

@bouwkast
Contributor

Okay thanks for checking and confirming.

I'm following up with other team members at the moment. I'm wondering if there is an issue with the 3.3.000 AAS extension, but I want to see what some others have to say.

@jeremytbrun
Author

I wondered the same but comparing the two most recent versions there really isn't much there other than the bump to the tracer library version.

DataDog/datadog-aas-extension@dotnet-v3.2.000...dotnet-v3.3.000

@bouwkast
Contributor

@jeremytbrun

Could you try enabling DD_TRACE_DEBUG (set it to 1 or true) and see if we get additional debug logs? I'm also wondering if we have any managed logs from the Tracer.

We have a test AAS application running on 3.3.000; it seems to be working normally from what we can tell, and we don't see those same issues.

I'm wondering if there is a way (maybe via the command line) to pin a specific version of the extension, 3.2.000, so we could at least get it back up and running for you.

Please also don't hesitate to open a ticket here: https://help.datadoghq.com/hc/requests/new?tf_1260824651490=pt_product_type:apm&tf_1900004146284=pt_apm_language:.net

But I will continue looking into it.

@jeremytbrun
Author

I do have DD_TRACE_DEBUG set to on. Is there a log file you'd like to see?

@bouwkast
Contributor

Would you be willing to share all of those logs with us?

@bouwkast
Contributor

To clarify, since I wasn't very clear, sorry: would you be willing to share all of those logs with us through that support ticket? Uploading them that way keeps them private. :)

@bouwkast
Contributor

Also, if you have profiling enabled, could you try without it, to rule out whether the problem is something within the Profiler part of the code?

Set DD_PROFILING_ENABLED to false for that.
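The troubleshooting toggles mentioned so far (DD_DBM_PROPAGATION_MODE, DD_TRACE_DEBUG, DD_PROFILING_ENABLED) can be collected in one place before applying them. The sketch below is only an illustration: the setting names and values come from this thread, while `build_app_settings` and `to_key_value_pairs` are hypothetical helper names, not a Datadog or Azure API.

```python
# Values for DD_DBM_PROPAGATION_MODE that appear in this thread.
VALID_DBM_MODES = {"disabled", "service", "full"}


def build_app_settings(dbm_mode="service", trace_debug=False, profiling=None):
    """Build a dict of app settings for one troubleshooting run.

    profiling=None leaves DD_PROFILING_ENABLED untouched; True or False
    sets it explicitly.
    """
    if dbm_mode not in VALID_DBM_MODES:
        raise ValueError(f"unexpected DD_DBM_PROPAGATION_MODE: {dbm_mode!r}")
    settings = {
        "DD_DBM_PROPAGATION_MODE": dbm_mode,
        "DD_TRACE_DEBUG": "true" if trace_debug else "false",
    }
    if profiling is not None:
        settings["DD_PROFILING_ENABLED"] = "true" if profiling else "false"
    return settings


def to_key_value_pairs(settings):
    """Render settings as sorted KEY=VALUE strings."""
    return [f"{key}={value}" for key, value in sorted(settings.items())]


if __name__ == "__main__":
    for pair in to_key_value_pairs(
        build_app_settings("full", trace_debug=True, profiling=False)
    ):
        print(pair)
```

The KEY=VALUE pairs can be entered in the App Service configuration, or passed to `az webapp config appsettings set --settings` if you prefer the Azure CLI. Rejecting an unknown mode early avoids chasing a typo in the setting while debugging the tracer itself.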

@bouwkast
Contributor

Hi @jeremytbrun

We think we've identified the cause of this issue and are working on getting a hotfix out as soon as we can. We will keep you posted.

@jeremytbrun
Author

That's wonderful news. I will be able to do some more troubleshooting on this tomorrow. Also, once this is working, I should be able to see database query explain plans directly from an APM trace, right? Like in this documentation.

@jeremytbrun
Author

I was able to tinker with this more. I tried deleting my Azure App Service fully and redeploying with v3.3.0 of the extension. This time things seem to be working, but I can't explain why. This isn't the first time I've seen flakiness with the installation/startup process of the extension. I know the Datadog documentation explicitly says to stop the app service before making configuration changes to the extension. I had done that multiple times already. Why does it need to be fully stopped before configuring/installing the extension?

@jeremytbrun
Author

Things seem to be functioning. I do see a new context_info element in the query samples that are being picked up. Is this a reference to an APM trace? I'm not seeing what I'd expect based on the documentation, where an APM trace should show a link back to the DBM query information. The documentation I'm talking about is here.
image

@bouwkast
Contributor

Hey @jeremytbrun

So we can track this better could you open a ticket for this DBM issue here (if you haven't already): https://help.datadoghq.com/hc/requests/new?tf_1260824651490=pt_product_type:apm&tf_1900004146284=pt_apm_language:.net

We've let the team know and they'll start looking at it, but the ticket will help gather more information and track the status of this better.

Thanks

@vandonr
Contributor

vandonr commented Sep 26, 2024

To answer your question: yes, the context info is a reference to an APM trace (and span); this is how full propagation works for SQL Server.
In your case, it appears that the link between the query and the trace is broken for some reason. Opening an "official" support ticket would help us investigate further, as Steven said.
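To make "a reference to an APM trace (and span)" concrete, here is a rough sketch of how a trace ID and span ID pair can be packed into one value and recovered later. The binary layout of the SQL Server context_info value is not described in this thread, so the sketch deliberately uses the W3C Trace Context `traceparent` text format as a stand-in; it is not the tracer's implementation, and `format_traceparent` and `parse_traceparent` are hypothetical helper names.

```python
def format_traceparent(trace_id, span_id, sampled=True):
    """Encode a 128-bit trace ID and 64-bit span ID as a W3C traceparent."""
    if not 0 < trace_id < 2**128:
        raise ValueError("trace_id must be a non-zero 128-bit integer")
    if not 0 < span_id < 2**64:
        raise ValueError("span_id must be a non-zero 64-bit integer")
    flags = 1 if sampled else 0
    # Layout: version "00", 32 hex chars, 16 hex chars, 2 hex chars of flags.
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(value):
    """Decode a version-00 traceparent into (trace_id, span_id, sampled)."""
    version, trace_hex, span_hex, flags_hex = value.split("-")
    if (version != "00" or len(trace_hex) != 32
            or len(span_hex) != 16 or len(flags_hex) != 2):
        raise ValueError(f"not a version-00 traceparent: {value!r}")
    sampled = bool(int(flags_hex, 16) & 1)
    return int(trace_hex, 16), int(span_hex, 16), sampled
```

Whatever the actual encoding, the correlation only works if the identifiers attached to the query match a trace and span that actually reached Datadog, which is why debug logs covering a real SQL request are useful for diagnosing a broken link.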

@vandonr
Contributor

vandonr commented Sep 26, 2024

To save some time, you can attach the tracer debug logs to your support ticket, making sure they cover a time frame in which at least one SQL request was made.

@jeremytbrun
Author

Thank you. I've submitted a support ticket with some relevant data.

Request #1861599
