Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RemoteExecutor Swallows Errors and Throws Invalid Exceptions #6371

Open
2 tasks done
JimBobSquarePants opened this issue Oct 8, 2020 · 43 comments
Open
2 tasks done
Labels
area-Infrastructure-libraries Area maintained by .NET libraries team: APICompat, AsmDiff, GenAPI, GenFacades, PkgProj, etc help wanted

Comments

@JimBobSquarePants
Copy link

JimBobSquarePants commented Oct 8, 2020

  • This issue is blocking
  • This issue is causing unreasonable pain

I could be holding it wrong but it does not seem to be working as expected.

Microsoft.DotNet.RemoteExecutor Version 6.0.0-beta.20508.3 (Are there non beta version available?)

Installed via https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-eng/nuget/v3/index.json

The following method runs without any assertation exceptions. I would expect it to fail.

            RemoteExecutor.Invoke(() =>
            {
                 Assert.True(false);
            }).Dispose();

The following code throws an exception. I would expect none.

            RemoteExecutor.Invoke(() =>
            {
                 return RemoteExecutor.SuccessExitCode;
            }).Dispose();

Message:
Exit code was -2147450749 but it should have been 42
Expected: True
Actual: False
Stack Trace:
RemoteInvokeHandle.Dispose(Boolean disposing) line 237
RemoteInvokeHandle.Dispose() line 58

@JimBobSquarePants
Copy link
Author

JimBobSquarePants commented Oct 8, 2020

ExecutorTest.zip

Ok... So I created a separate unit test project to confirm this.

[Fact]
public void Test1()
{
    Assert.True(RemoteExecutor.IsSupported);

    RemoteExecutor.Invoke(() =>
    {
        Assert.True(false);
    }).Dispose();

    RemoteExecutor.Invoke(() =>
    {
        return RemoteExecutor.SuccessExitCode;
    }, new RemoteInvokeOptions { CheckExitCode = true }).Dispose();
}

@JimBobSquarePants
Copy link
Author

Pulling the source and running the tests leads to 3 out of 5 tests failing with the 2 passing false positives due to the symptoms described above.

image

@JimBobSquarePants
Copy link
Author

JimBobSquarePants commented Oct 9, 2020

Confirmed to not be isolated to my machine.

Linux Pass
Windows Fail

https://github.com/JimBobSquarePants/RemoteExecutorTests/runs/1232480114

@danmoseley
Copy link
Member

-2147450749
https://github.com/dotnet/runtime/blob/6072e4d3a7a2a1493f514cdf4be75a3d56580e84/src/installer/corehost/error_codes.h#L18

• CoreHostLibMissingFailure (-2147450749 0x80008083) - One of the dependent libraries is missing. Typically when the hostfxr, hostpolicy or coreclr dynamic libraries are not present in the expected locations. Probably means corrupted or incomplete installation.

@danmoseley
Copy link
Member

https://github.com/dotnet/runtime/search?q=CoreHostLibMissingFailure

Enabling host tracing might help? COREHOST_TRACE=1 COREHOST_TRACE_VERBOSITY=4
I think this goes goes to stderr so should be captured?
https://github.com/dotnet/runtime/blob/d14b50ae2186af77ad9b68370978182584b43e65/docs/design/features/host-tracing.md

@JimBobSquarePants
Copy link
Author

Thanks @danmosemsft .

I thought it was something like that and have been attempting to determine whether there is a specific SDKs missing that should be there. It's very odd that the GitHub Ubuntu image works but Windows doesn't.

Is COREHOST_TRACE=1 something I pass to dotnet test?

@danmoseley
Copy link
Member

It's an environment variable read by the host - hostfxr.dll, I think (this isn't my area). So you should be able to set it anywhere eg before launching dotnet test, or in your test just before launching the remote executor.

@JimBobSquarePants
Copy link
Author

Thanks. I’ll have a crack and report back

@JimBobSquarePants
Copy link
Author

No further info I could uncover with that variable. Just lots of build noise.

@danmoseley
Copy link
Member

Does the logging not end with an error? Hmm. @vitek-karas I believe owns the host and can help advise from here.

@JimBobSquarePants
Copy link
Author

https://github.com/JimBobSquarePants/RemoteExecutorTests/runs/1233024008?check_suite_focus=true#step:5:25756

Starting test execution, please wait...

A total of 1 test files matched the specified pattern.
[xUnit.net 00:00:05.66]     ExecutorTest.UnitTest1.ShouldNotThrow [FAIL]
[xUnit.net 00:00:05.68]     ExecutorTest.UnitTest1.ShouldThrow [FAIL]
  X ExecutorTest.UnitTest1.ShouldNotThrow [295ms]
  Error Message:
   Exit code was -2147450749 but it should have been 42
Expected: True
Actual:   False
  Stack Trace:
     at Microsoft.DotNet.RemoteExecutor.RemoteInvokeHandle.Dispose(Boolean disposing) in /_/src/Microsoft.DotNet.RemoteExecutor/src/RemoteInvokeHandle.cs:line 237
   at ExecutorTest.UnitTest1.ShouldNotThrow() in D:\a\RemoteExecutorTests\RemoteExecutorTests\ExecutorTest\UnitTest1.cs:line 23
  X ExecutorTest.UnitTest1.ShouldThrow [28ms]
  Error Message:
   Assert.Throws() Failure
Expected: typeof(Microsoft.DotNet.RemoteExecutor.RemoteExecutionException)
Actual:   (No exception was thrown)
  Stack Trace:
     at ExecutorTest.UnitTest1.ShouldThrow() in D:\a\RemoteExecutorTests\RemoteExecutorTests\ExecutorTest\UnitTest1.cs:line 13

Test Run Failed.
Total tests: 2
     Failed: 2
 Total time: 14.8005 Seconds
Execute managed assembly exit code: 0x1
Waiting for breadcrumb thread to exit...
Done waiting for breadcrumb thread to exit...
C:\Program Files\dotnet\sdk\3.1.402\Microsoft.TestPlatform.targets(32,5): error MSB4181: The "Microsoft.TestPlatform.Build.Tasks.VSTestTask" task returned false but did not log an error. [D:\a\RemoteExecutorTests\RemoteExecutorTests\ExecutorTest\ExecutorTest.csproj]
Execute managed assembly exit code: 0x1
Waiting for breadcrumb thread to exit...
Done waiting for breadcrumb thread to exit...
Execute managed assembly exit code: 0x1
Waiting for breadcrumb thread to exit...
Done waiting for breadcrumb thread to exit...
Execute managed assembly exit code: 0x1
Waiting for breadcrumb thread to exit...
Done waiting for breadcrumb thread to exit...
Error: Process completed with exit code 1.
3s

@danmoseley
Copy link
Member

Ah, I don't see the host logging there. Either it's not enabled, or not captured. Does host-tracing.md above help?

@JimBobSquarePants
Copy link
Author

JimBobSquarePants commented Oct 9, 2020

There's definitely a lot of extra logging 100s of extra lines, just nothing relevant (that I could see). Here's a previous run.

https://github.com/JimBobSquarePants/RemoteExecutorTests/runs/1232480114?check_suite_focus=true#step:5:1

I'll try moving to Core 3.1 and adding the extra verbosity command

@vitek-karas
Copy link
Member

The host trace only got all of the "dotnet test" stuff (starts 4 processes internally) + the powershell (which is I guess what runs the whole thing).
But the remote execution is not visible there.
So either the stderr output of the remote executor is not forwarded to the output of the test, or the environment was not set (COREHOST_TRACE).
You can also set COREHOST_TRACE=1 and COREHOST_TRACEFILE=file.txt and then collect that file - if that's somehow possible via the CI.

I assume this doesn't have a local repro... right?

@vitek-karas
Copy link
Member

As far as I can tell the remove executor has no way to capture stdout or stderr of the child process it spawns - so normal CORHOST_TRACE=1 won't work.
I would try:

  • Just in the test before running the remote executor, set COREHOST_TRACE=1 and COREHOST_TRACEFILE=tempfilepath.
  • Run the remote executor
  • Once it's done, open the "tempfilepath" and write it all to the output of the test (I don't know if Console.WriteLine will work, maybe?)

That should give us a way to get the trace.

(And would disable the env. variables set for the entire run, as that just creates noise which is not useful).

@JimBobSquarePants
Copy link
Author

JimBobSquarePants commented Oct 9, 2020

Thanks @vitek-karas

Here's the result of running the test locally with the following code (I'm assuming that's what you meant?). I haven't checked the output via CI yet. (I've had two people independently check and confirm on their Windows machines in addition to my local device and CI.)

[Fact]
public void ShouldThrow()
{
    var tracePath = Path.Combine(Path.GetTempPath(), $"remoteexecutor{DateTime.Now:yyyyMMdd}.txt");
    Assert.True(RemoteExecutor.IsSupported);

    var psi = new ProcessStartInfo();
    psi.Environment[$"COREHOST_TRACE"] = "1";
    psi.Environment[$"COREHOST_TRACEFILE"] = tracePath;

    Assert.Throws<RemoteExecutionException>(() =>
        RemoteExecutor.Invoke(() => Assert.True(false),
            new RemoteInvokeOptions
            {
                StartInfo = psi
            }).Dispose()
    );
}

Output is the same in Debug and Release modes. (Differs only in bin subpath)

Tracing enabled @ Fri Oct  9 23:23:31 2020 GMT
--- Invoked hostfxr_main_startupinfo [commit hash: 38017c3935de95d0335bac04f4901ddfc2718656]
Checking if CoreCLR path exists=[C:\Program Files\dotnet\coreclr.dll]
--- Executing in a native executable mode...
Using dotnet root path [C:\Program Files\dotnet]
App runtimeconfig.json from [C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.dll]
Runtime config is cfg=C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.json dev=C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.dev.json
Attempting to read runtime config: C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.json
Attempting to read dev runtime config: C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.dev.json
Runtime config [C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.json] is valid=[1]
Executing as a self-contained app as per config file [C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.json]
--- Resolving hostpolicy.dll version from deps json [C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.deps.json]
Cannot use file stream for [C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.deps.json]: No such file or directory
The expected hostpolicy.dll directory is [C:\Program Files\dotnet]
The hostpolicy.dll was not found in [C:\Program Files\dotnet]
A fatal error was encountered. The library 'hostpolicy.dll' required to execute the application was not found in 'C:\Program Files\dotnet'.
Failed to run as a self-contained app.
  - The application was run as a self-contained app because 'C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.json' was not found.
  - If this should be a framework-dependent app, add the 'C:\development\github\JimBobSquarePants\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.runtimeconfig.json' file and specify the appropriate framework.

@vitek-karas
Copy link
Member

Thanks - I was able to repro it locally.

The command line it executes doesn't seem to make sense to me:

F:\AppModel\repro\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\testhost.exe 
exec 
--runtimeconfig "F:\AppModel\repro\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\ExecutorTest.runtimeconfig.json" 
--depsfile "F:\AppModel\repro\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\ExecutorTest.deps.json" 
"F:\AppModel\repro\RemoteExecutorTests\ExecutorTest\bin\Debug\netcoreapp3.1\Microsoft.DotNet.RemoteExecutor.dll"
<args to the remote executor - not interesting>

This would work if it were running "dotnet.exe exec ...", but I doubt testhost recognizes "exec" (I could be wrong though).

In any case the command line won't work because the special host parameters must be first on the command line: --runtimeconfig and so on. Since testhost.exe is an apphost it doesn't inherently recognize exec on the host level, and so it runs itself as if there were no special parameters on the command line. And thus it tries to read the file testhost.runtimeconfig.json which doesn't exist (the valid=[1] is misleading, internally the host treats missing runtime config as valid) - so it ends up with empty runtime config which means no framework references and thus tries to run as self-contained - which it's not and so on.

I tried running the same command line via dotnet.exe exec and that works (well at least it runs the code).

I don't know how this passes tests (if they run) in Arcade repo and/or how it's used in a case where it would actually work. But as is, this seems completely broken.

Also - I which we could update the testhost.exe to some "recent" version, it's 2.1 (I know we probably need it to be able to run 2.1 tests, unfortunately). In this case it means that the tracing from the testhost.exe is not visible in the trace file since it doesn't recognize COREHOST_TRACEFILE which was added in 3.0.

I think somebody from Arcade should comment on what is the intended command line. I can definitely help with figuring out what it should be in reality to get all the desired behaviors.

Couple of notes about what I think it SHOULD be doing:

  • Using testhost.exe won't work because that is the VS test host runner, it does not support running the executor dll as a command line app
  • It should be using dotnet.exe - but this will be tricky because it will need to pick the right one. The best way would probably be to use location of the runtime running the test (this will be tricky to get completely right with single-file, but it's possible).

The break was probably introduced by changes to the VSTest platform - some time ago it switched from running the test via "dotnet.exe" to using the "testhost.exe". But that's been quite a while now if I remember correctly.

@danmoseley
Copy link
Member

@ViktorHofer might have context

@JimBobSquarePants
Copy link
Author

Thanks for the help so far. Thought I was going mad!

@ViktorHofer
Copy link
Member

Thanks for reporting @JimBobSquarePants and @vitek-karas for the analysis. Vitek is right that RemoteExecutor doesn't set HostRunner (the path to the host to invoke) to the right value in cases where VSTest's apphost is used. This didn't show up for us in dotnet/runtime as we don't use VSTest's apphost feature as we supply our custom host.

As a simple fallback mechanism we could check if processFileName ends with (dotnet[.exe]) and if not, use the dotnet in the %PATH%.

For a more complete implementation, we should follow what Vitek proposes:

It should be using dotnet.exe - but this will be tricky because it will need to pick the right one. The best way would probably be to use location of the runtime running the test (this will be tricky to get completely right with single-file, but it's possible).

An implementation of that could look like this:

  1. Retrieve the location of the shared framework assemblies currently being used by the process.
  2. Climb up the directory tree until a dotnet host is found.
  3. If none is found, use the one in the PATH.

@vitek-karas do you think we should stop at some point climbing up the directory tree? For self-contained application we need to fallback to %PATH% as a self-contained application doesn't have a dotnet host. Right?

@vitek-karas
Copy link
Member

Regarding testing: I was wondering why the RemoteExecuter tests are passing: https://github.com/dotnet/arcade/blob/master/src/Microsoft.DotNet.RemoteExecutor/tests/RemoteExecutorTests.cs
I guess it's because they target netcoreapp2.1. Maybe we should multi-target these to get better coverage. The RemoteExecutor is supposed to run on pretty much anything (it will use the parent test's TFM), so it would be good to have somewhat wider test coverage.

As for the fix:

If the parent test was executed via dotnet.exe then use that - this is just like you suggested, check the process file name for dotnet.exe.
If the parent test was executed via apphost it is much more tricky:
We could use for example typeof(object).Assembly.Location and walk up to find the dotnet.exe - this can be deterministic, we know that the dotnet.exe must be 3 folders above (/shared/Microsoft.NETCore.App/5.0.0 for example). If we can find it there, use that.
It the above works, it would work for both tests using globally installed runtime/frameworks, as well as tests using private installs.

The problem is all the cases where the above will not work:

  • Self-contained apps - there's no dotnet.exe - we would have to bring along our own executable (apphost) which is basically what testhost.exe does for VSTest (although it's not trying to solve the self-contained problem, but it would solve this as well). That said we would have to solve the same problem - match the bitness on Windows (and low enough version to work everywhere). This could be also solved via custom native host using native hosting APIs - but it's probably easier to use the testhost solution. This would also potentially work for 3.1 single-file and 5.0 backward compat single-file tests.
  • Single-file apps - there's really no good way to do this cleanly currently - maybe one day when native hosting APIs will support single-file apps hosting, then it might be possible. See below for a crazy idea...
  • Trimmed apps - this is "broken by design" unless we somehow cheat. Basically the problem is that linker would not see the code running in the external process (the RemoteExecutor parts, it would see the parts of the tests) and remove it and all of its dependencies. So we would have to somehow force linker to include those parts. It might be possible to "include" the necessary parts behind some condition which linker won't figure out. But this is lower priority - trimmed unit tests are not a good solution to testing.
  • Future AOT - this is probably the most problematic one. I don't see a good solution (but it would depend on the specific AOT).

Crazy idea: Use startup hook. This would in theory work everywhere where startup hook is correctly supported which is currently basically everywhere except for trimmed apps and AOT. https://github.com/dotnet/runtime/blob/master/docs/design/features/host-startup-hook.md
Basically:

  • We want to run the parent test code, but in a new process
  • We can rerun the test executable, but its Main will not work and we won't get to run the code we want
  • If we get to run our code before Main, it could work

Startup hook would allow the remote executor to run its own code before Main is executed. It would run the test there. Once done, it would force terminate the process - never getting to Main.
Not the cleanest solution (using native hosting APIs allows something very similar, but without the hacks - but it doesn't support single-file and would require native code), but it would work even for single-file - regardless of the type of single-file used.

/cc @agocke - who is looking into running tests as single-file

@JimBobSquarePants
Copy link
Author

Is there anything useful I can do to expediate a fix here?

@vitek-karas
Copy link
Member

@JimBobSquarePants We need to agree on the approach. If you want to spend some time on this, I think prototyping one of the proposed solutions above would be a good start - to see if it really works and how hard/complex of a solution it is.

Ideally we would pick a solution which we can extend going forward to as many scenarios as possible, but that can be long term. Short term we could go with a simple fix for whatever problem we're hitting currently. The simplest right now if probable fixing the code to find the right "dotnet.exe" (if available) and use that. That should fix the scenario described in this issue.

@JimBobSquarePants
Copy link
Author

I very much doubt I have the domain knowledge required to implement a fix tbh. Was thinking more test scenarios.

@RussKie
Copy link
Member

RussKie commented May 21, 2021

We're hitting the same issue in dotnet/winforms... And this is kind of blocking us too.

There is nothing in our configs (test or otherwise) that sets "self contained" flag, so "Failed to run as a self-contained app" is kind of puzzling.

Tracing enabled @ Fri May 21 02:39:23 2021 GMT
--- Invoked hostfxr_main_startupinfo [commit hash: d49bcbe0441f5c954cddcbe28a222eb34917bcaf]
Checking if CoreCLR path exists=[C:\Development\winforms\.dotnet\coreclr.dll]
--- Executing in a native executable mode...
Using dotnet root path [C:\Development\winforms\.dotnet]
App runtimeconfig.json from [C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.dll]
Runtime config is cfg=C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json dev=C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.dev.json
Attempting to read runtime config: C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json
Attempting to read dev runtime config: C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.dev.json
Runtime config [C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json] is valid=[1]
Executing as a self-contained app as per config file [C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json]
--- Resolving hostpolicy.dll version from deps json [C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.deps.json]
Cannot use file stream for [C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.deps.json]: No such file or directory
The expected hostpolicy.dll directory is [C:\Development\winforms\.dotnet]
The hostpolicy.dll was not found in [C:\Development\winforms\.dotnet]
A fatal error was encountered. The library 'hostpolicy.dll' required to execute the application was not found in 'C:\Development\winforms\.dotnet'.
Failed to run as a self-contained app.
  - The application was run as a self-contained app because 'C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json' was not found.
  - If this should be a framework-dependent app, add the 'C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json' file and specify the appropriate framework.

invokerHandle.Options.StartInfo.FileName

C:\Users\igveliko\.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.exe

invokerHandle.Options.StartInfo.Arguments

exec --runtimeconfig "C:\Development\winforms\artifacts\bin\System.Windows.Forms.Tests\Debug\net6.0\System.Windows.Forms.Tests.runtimeconfig.json" --depsfile "C:\Development\winforms\artifacts\bin\System.Windows.Forms.Tests\Debug\net6.0\System.Windows.Forms.Tests.deps.json" "C:\Development\winforms\artifacts\bin\System.Windows.Forms.Tests\Debug\net6.0\Microsoft.DotNet.RemoteExecutor.dll" "System.Windows.Forms.Tests, Version=42.42.42.42, Culture=neutral, PublicKeyToken=b77a5c561934e089" System.Windows.Forms.Tests.ListView_NativeImageListTests+<>c <NativeImageList_finalizer_releases_native_handle>b__1_0 C:\Users\igveliko\AppData\Local\Temp\yrsixegt.j14 

@vitek-karas
Copy link
Member

  • The application was run as a self-contained app because 'C:\Users\igveliko.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json' was not found.
  • If this should be a framework-dependent app, add the 'C:\Users\igveliko.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json' file and specify the appropriate framework.

I think this nicely describes what happened and why it was trying to run the app as self-contained.

As for the solutions mentioned above - I actually like the startup hook idea even more now. It would get us the exact same environment as the original test, and I don't feel like this is too hacky anymore.

@antonfirsov
Copy link
Member

antonfirsov commented May 21, 2021

I just started to work on a minimalistic fix before @RussKie commented. There are more and more PR-s in ImageSharp with out-of-process tests, and our coverage became very unreliable. We need a fix ASAP. I don't have the time and the knowledge for a full-scale solution using startup hooks.

I believe the most urgent thing is to get this fixed for the common Windows cases as simply and as quickly as possible. (Climb up to find dotnet.exe, keep original HostRunner if it's not present.) This would unblock ImageSharp, and hopefully also @RussKie, without side effects on more exotic cases, and does not conflict with a future reimplementation with startup hooks. @vitek-karas will you accept such a PR?

Alternative idea: Remove readonly modifier from HostRunner, so users can configure it as needed.

@vitek-karas
Copy link
Member

I don't have any problem with a partial fix.

I'm curious: In your comment you mention that it makes your coverage unreliable. Does it mean that it fails only sometimes, or that you have to avoid running certain configurations entirely because of this?

@antonfirsov
Copy link
Member

antonfirsov commented May 21, 2021

Does it mean that it fails only sometimes, or that you have to avoid running certain configurations entirely because of this?

We keep running all configurations (hoping in an eventual fix), it's just that the Windows ones are always passing, regardless of the actual outcome of the body function. IMO it's matter of time until it leads to false positives because of a platform-specific issue, and I'm afraid we will be rolling out buggy packages to NuGet because of this, thus the use of the term "unreliable coverage". (@JimBobSquarePants can correct me if there's something wrong in my thinking.)

@JimBobSquarePants
Copy link
Author

it's just that the Windows ones are always passing, regardless of the actual outcome of the body function

Yep. That's what is happening. We were passing buggy code before we noticed. It's a growing elephant as we introduce more and more performance features.

@RussKie
Copy link
Member

RussKie commented May 23, 2021

  • The application was run as a self-contained app because 'C:\Users\igveliko.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json' was not found.
  • If this should be a framework-dependent app, add the 'C:\Users\igveliko.nuget\packages\microsoft.testplatform.testhost\16.5.0\build\netcoreapp2.1\x64\testhost.runtimeconfig.json' file and specify the appropriate framework.

I think this nicely describes what happened and why it was trying to run the app as self-contained.

Haha, if only I read the test log :)
But to be fair, this is a developer-centric error message, which is to a user (in this instance me or @JimBobSquarePants) carries little value, because there's nothing the user can do to address the issue besides coming here and raise an issue.

As for the solutions mentioned above - I actually like the startup hook idea even more now. It would get us the exact same environment as the original test, and I don't feel like this is too hacky anymore.

This approach shifts the onus of bootstrapping to the consumer of the RE. Whilst this may fair to suggest this direction for custom scenarios you'll need to convince me why running a test with the RE from under VS is not a mainstream scenario.

@agocke
Copy link
Member

agocke commented May 24, 2021

My understanding is that this is Arcade-only functionality, so the experience is not polished like a shipping product.

At the same time, I don't really understand what this runner is supposed to do. Try to replicate the currently loaded assembly as best as possible in a new process, then call it? @vitek-karas's right that this won't work for Native AOT-like scenarios, but I think that's because this idea is built on some basic principles:

  • the test is running code in a particular assembly
  • that assembly exists on disk and can be re-hosted in another process
  • we can reasonably reconstruct the existing process state for the new process

These all seem like pretty shaky principles to me, when looked at from a variety of configurations. I'd want to look closer at all of them before I shipped a solution like this.

@RussKie
Copy link
Member

RussKie commented May 24, 2021

My understanding is that this is Arcade-only functionality,

I'm afraid it is hard to agree with this statement - https://github.com/dotnet/runtime/search?q=RemoteExecutor yields 187 results. The RE is also explained in the .NET runtime docs. In winforms we have started using it in late 2019, and at no point it was called out as "arcade specific, don't use".

...so the experience is not polished like a shipping product.

IMO the issue is not about this. The issue is that we found an issue that doesn't appear to work in VS (the main scenario), and about fixing it.

@danmoseley
Copy link
Member

This tool originated in dotnet/runtime before it was generalized here. It's used throughout the unit tests to isolate tests that affect process wide state, or need to have special process state, and also in some special cases eg a test for fail fast.

@jkotas
Copy link
Member

jkotas commented May 24, 2021

My understanding is that this is Arcade-only functionality, so the experience is not polished like a shipping product.

Right, the remote executor is not polished like a shipping product. It is built on shaky principles, but that is ok since it is only meant to be good enough for limited .NET team own needs, like everything else in the Arcade repo.

As of right now, the remote executor does not work and it is automatically disabled for single file deployment (NativeAOT is just one particular type of single file deployment) and it is also disabled on browser and mobile OSes. See IsSupported property in https://github.com/dotnet/arcade/blob/main/src/Microsoft.DotNet.RemoteExecutor/src/RemoteExecutor.cs. Fixing that is tracked by #5129.

@agocke
Copy link
Member

agocke commented May 24, 2021

Given the current project status, the patch that Vitek suggested to work-around issues in .NET core projects seems reasonable. In short, I wouldn't put in much more work than necessary to achieve the immediate needs.

@RussKie
Copy link
Member

RussKie commented May 24, 2021

the patch that Vitek suggested to work-around issues in .NET core projects seems reasonable.

Which one?

@agocke
Copy link
Member

agocke commented May 24, 2021

Looking around for a dotnet host as implemented in dotnet/runtime#53125

@RussKie
Copy link
Member

RussKie commented May 24, 2021

Ah, you mean #7426 :)
👍

@vitek-karas
Copy link
Member

This approach shifts the onus of bootstrapping to the consumer of the RE. Whilst this may fair to suggest this direction for custom scenarios you'll need to convince me why running a test with the RE from under VS is not a mainstream scenario.

I don't understand this to be honest. The proposal using startup hooks would be that RE implements its "client" as a startup hook assembly (as oppose to entry point assembly the way it's now). When it runs the client process it would setup the environment to load the client component as a startup hook (it already sets up other environment so this is not a big change). The startup hook would then perform the actual test run and exit the process. The reason why this approach is cleaner is that startup hooks are a fully supported feature to inject code into (almost) any application - lot less need for hacking things to load the test code into a new app.

There should be no visible change to the user of RE, it would just change the implementation of RE.

As for the rest of the comments - this is absolutely not a shipping product. And it's expected that certain things won't work, since we didn't need them to work yet.

@RussKie
Copy link
Member

RussKie commented May 24, 2021

I guess I misinterpreted your comment, as it wasn't particular obvious if the hook should be a part of the Arcade/RemoteExecutor, or something a consumer should do.

@ViktorHofer
Copy link
Member

As for the rest of the comments - this is absolutely not a shipping product. And it's expected that certain things won't work, since we didn't need them to work yet.

100% agreed. Even though it's an open source project, there is no support policy attached to it. When I ported this library over from CoreFx couple years ago, I only did so to share the implementation with ML.NET, aspnetcore and later winforms. We try our best to resolve reported issues but RemoteExecutor was never intended to support every possible variant like single file applications, mobile, etc. As an example I removed the UWP support as well as we stopped testing it in CoreFx's main branch.

@ericstj ericstj added the area-Infrastructure-libraries Area maintained by .NET libraries team: APICompat, AsmDiff, GenAPI, GenFacades, PkgProj, etc label Mar 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Infrastructure-libraries Area maintained by .NET libraries team: APICompat, AsmDiff, GenAPI, GenFacades, PkgProj, etc help wanted
Projects
None yet
Development

No branches or pull requests

10 participants