Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need better logging for the native SDK resolver #15149

Open
KirillOsenkov opened this issue Jan 3, 2021 · 23 comments
Open

Need better logging for the native SDK resolver #15149

KirillOsenkov opened this issue Jan 3, 2021 · 23 comments
Assignees
Milestone

Comments

@KirillOsenkov
Copy link
Member

I don't see a way to debug/investigate the decision process of the SDK resolver native .dll.

I've gotten as far as

int errorCode = Interop.RunningOnWindows
? Interop.Windows.hostfxr_resolve_sdk2(dotnetExeDirectory, globalJsonStartDirectory, flags, result.Initialize)
: Interop.Unix.hostfxr_resolve_sdk2(dotnetExeDirectory, globalJsonStartDirectory, flags, result.Initialize);

but from then on it's a black box and I don't see anything that would explain the resulting SDK that comes out of it. I want to know what factors influenced the decision:

  • did it use an environment variable?
  • did it use PATH?
  • did it use global.json?
  • which global.json?
  • was IsRunningInVisualStudio specified?

The resolver should explain its choice and clearly communicate which factors made a difference and which didn't. Otherwise I have to resort to cargo culting it and so far unsuccessfully.

@KirillOsenkov
Copy link
Member Author

KirillOsenkov commented Jan 3, 2021

As an example, dotnet/msbuild#5999 is an issue that took me a whole day to figure out, and as a result of that investigation I have emerged with the knowledge that I should have set these three environment variables to influence the SDK resolution process:

PATH=C:\msbuild\.dotnet;%PATH%
DOTNET_INSTALL_DIR=C:\msbuild\.dotnet
DOTNET_MULTILEVEL_LOOKUP=0

Had there been a log entry like "choosing 3.1.100 because the DOTNET_INSTALL_DIR was set explicitly and pointed to a 3.1.100 SDK at C:\msbuild.dotnet" I could have binged an entire season of something wholesome today instead of debugging deep into the SDK resolution.

Currently the only logging I see is
image

@KirillOsenkov
Copy link
Member Author

Correction, I also had to set PATH=C:\msbuild\.dotnet;%PATH%

@rainersigwald
Copy link
Member

I don't know if the logging interface currently exposed by MSBuild makes this possible--it's fairly minimal. We might need to augment that first before improving the dotnet resolver.

@marcpopMSFT marcpopMSFT added the untriaged Request triage from a team member label Jan 12, 2021
@dsplaisted
Copy link
Member

@KirillOsenkov Are you saying you need more visibility into what the native hostfxr_resolve_sdk2 method is doing, or what the managed code calling it does with the result? (Or are you not sure?)

@KirillOsenkov
Copy link
Member Author

The former, I want to know what decisions the native hostfxr_resolve_sdk2 did and why. Because it's impossible to debug, all I get is a decision, but why it chose that SDK is completely unclear. It should say "used this environment variable" or something.

@KirillOsenkov
Copy link
Member Author

And then I guess it's a separate story to thread that output through the managed code so it ends up as an Evaluation message in the binlog

@dsplaisted dsplaisted added needs team triage Requires a full team discussion and removed untriaged Request triage from a team member labels Jan 27, 2021
@dsplaisted dsplaisted removed their assignment Jan 27, 2021
@marcpopMSFT marcpopMSFT added this to the 6.0.1xx milestone Jan 27, 2021
@marcpopMSFT marcpopMSFT removed the needs team triage Requires a full team discussion label Jan 27, 2021
@dsplaisted
Copy link
Member

I think we can try to add some better logging in the managed SDK resolvers first. Then if we can get that piped through, we can try to add additional logging to the native methods to include.

@KirillOsenkov
Copy link
Member Author

It’s a good plan

@KirillOsenkov
Copy link
Member Author

Folks, this issue is supremely painful. Here's the error that I get:

Error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found.

I'm literally left stranded and helpless. Every time I investigate this, I end up spending a few hours debugging deep into the SDK, but if I'm a regular user, I'm just absolutely helpless. We really need to invest into improving this experience. Please @marcpopMSFT @KathleenDollard @dsplaisted

@marcpopMSFT
Copy link
Member

FYI, it's not visible in GH but we've put this in our .net 6 backlog pipeline. That puts it behind our .net 6 committed work but ahead of the other 1600 issues we have. There is no commitment when we'll do this but we'll likely start looking at that list closer to the end of .net 6 timeframe and maybe slot in some of the items listed before .net 6 ships.

@KirillOsenkov
Copy link
Member Author

As long as it’s on the radar and being tracked, I’m happy :)

@marcpopMSFT marcpopMSFT modified the milestones: 6.0.1xx, 7.0.1xx Oct 20, 2021
@benvillalobos
Copy link
Member

This bit me while figuring out #24512. Logging the path it's trying to check would be a good next step.

@KirillOsenkov
Copy link
Member Author

Wasted a couple hours of my life on this again today.

@dsplaisted
Copy link
Member

#26904 should help with this. It doesn't directly surface messages from the native resolver, but you can see a lot of what is going into it and coming back out.

It's also possible to set environment variables to get logs from the native resolver: https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/default-probing#how-do-i-debug-the-probing-properties-construction

@KirillOsenkov
Copy link
Member Author

It's great to have some developments here.

Having to set environment variables to obtain logs is a huge UX hurdle. Nobody will even know about these. Any reason these can't be on by default?

Also the last time I was looking, SdkLogger was not used for anything:
dotnet/msbuild#7988 (comment)

@dsplaisted
Copy link
Member

I'd like to be able to include the messages from the native logger in the resolver failure logs. We'd need to update the native hostfxr code for that, and there might be a perf impact to always generating those messages.

For now we're hoping that the #26904 is a big step forward, and some time after that's gone in we can assess what further improvements we should make.

@marcpopMSFT marcpopMSFT removed this from the 7.0.1xx milestone Nov 23, 2022
@marcpopMSFT marcpopMSFT added this to the 7.0.2xx milestone Nov 23, 2022
@KirillOsenkov
Copy link
Member Author

KirillOsenkov commented Dec 14, 2022

Should I file a new issue to improve the error message for MSB4236?

Currently it says:
error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found.

It doesn't say:

  • which SDK was specified
  • where was it specified (e.g. global.json?)
  • which SDKs were found
  • where did it look
  • what to do next?

I can imagine a better text:
error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found. global.json specifies '7.0.100' with rollForward: disable. The nearest installed SDK was 6.0.306. You can install the SDK from https://dot.net.

Also if possible consider grouping these and only show once instead of per-project (if building a solution). Chances are you will have one per project, so a hundred of these same errors for a hundred project solution.

Long term ideally there should be dotnet sdk restore that will automatically download and install the missing SDK. The error message could prompt then, or even have a mode where it does it automatically.

@marcpopMSFT
Copy link
Member

@KirillOsenkov that error is for resolving the sdk that's at the top of a project file, not the .NET SDK.
<Project Sdk="Microsoft.NET.Sdk">

Reading through the history here, I'm not sure if you're original issue was about finding the .NET SDK or resolving an sdk as later comments were more about the latter but your original description was about the former. The PR that Daniel linked will help with the latter.

@KirillOsenkov
Copy link
Member Author

I didn’t realize they’re different! If I’m confused, imagine how our users will be confused. The error message needs to improve either way.

@marcpopMSFT marcpopMSFT modified the milestones: 7.0.2xx, Backlog Jan 18, 2023
@KirillOsenkov
Copy link
Member Author

@Forgind

@KirillOsenkov
Copy link
Member Author

long sad story

@KirillOsenkov
Copy link
Member Author

Got another difficult to diagnose issue today with this build failure:

myproject.csproj : error : Could not resolve SDK "Microsoft.NET.Sdk". Exactly one of the probing messages below indicates why we could not resolve the SDK. Investigate and resolve that message to correctly specify the SDK.
myproject.csproj : error :   The NuGetSdkResolver did not resolve this SDK because there was no version specified in the project or global.json.
myproject.csproj : error :   MSB4276: The default SDK resolver failed to resolve SDK "Microsoft.NET.Sdk" because directory "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Sdks\Microsoft.NET.Sdk\Sdk" did not exist.
myproject.csproj : error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found.

The error message incorrectly implies that the directory C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Sdks\Microsoft.NET.Sdk\Sdk should exist, whereas in reality the missing directory was likely C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\SdkResolvers\Microsoft.DotNet.MSBuildSdkResolver

@KirillOsenkov
Copy link
Member Author

These should help:

set COREHOST_TRACE=1
set COREHOST_TRACE_VERBOSITY=4
set DOTNET_MSBUILD_SDK_RESOLVER_ENABLE_LOG=true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants