runtime-extra-platforms rolling build failure: Docker Internal Server Error - The container operating system does not match the host operating system. #67728

Closed
carlossanlop opened this issue Apr 8, 2022 · 19 comments · Fixed by #69480
Labels
area-Infrastructure blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms'

Comments

@carlossanlop
Member

Seen in the two latest rolling builds from runtime-extra-platforms:

Both rolling builds had a general problem causing most tests to output this to the log:

Console log: 'Microsoft.Extensions.Configuration.Xml.Tests' from job 6602cf93-0be3-4663-a60d-e81987ec3e49 (windows.amd64.server2022.open) using docker image mcr.microsoft.com/dotnet-buildtools/prereqs:windowsservercore-2004-helix-amd64-20200904200251-272704c on a001B54
running %HELIX_CORRELATION_PAYLOAD%\scripts\414866fc2fa14b85a4c4464ddf625f32\execute.cmd in C:\h\w\A56B0923\w\ABBF0971\e max 900 seconds

Output:

Exit Code:-900

The run_client.py log from the artifacts shows this:

2022-04-06T22:48:20.058Z	ERROR  	dockerhelper(233)	run	API Error attempting to talk to Docker Server:500 Server Error: Internal Server Error ("CreateComputeSystem 9098476f85a32e792494273ed6ffbdda89e2eaa58e6afaf0e6078dc89ebefbf4: The container operating system does not match the host operating system.
(extra info: {"SystemType":"Container","Name":"9098476f85a32e792494273ed6ffbdda89e2eaa58e6afaf0e6078dc89ebefbf4","Owner":"docker","VolumePath":"\\\\?\\Volume{657b1847-3ff3-4c95-89f0-1b5e5411847e}","IgnoreFlushesDuringBoot":true,"LayerFolderPath":"C:\\ProgramData\\docker\\windowsfilter\\9098476f85a32e792494273ed6ffbdda89e2eaa58e6afaf0e6078dc89ebefbf4","Layers":[{"ID":"274808d8-c4a3-528d-bf38-5bc4d461b846","Path":"C:\\ProgramData\\docker\\windowsfilter\\072f4aae481ffe6f2d48c9a39da42b0b39d1574d23b4a4fc82dc44acb2e9de2b"},{"ID":"99645de1-1409-5d0a-b552-4e1371e5a60e","Path":"C:\\ProgramData\\docker\\windowsfilter\\70f9fcabe746be1c425e0ffba76d8b982c1cdc8d2737cd60adc7fc1aaaa70d30"},{"ID":"7c6c0748-5ffd-5279-b129-e3c5ae923d8e","Path":"C:\\ProgramData\\docker\\windowsfilter\\c5d218a45573c713adfd6bbc30e7f06d517d3fcd04753f94cd7519af80774603"},{"ID":"b93eb741-0f60-5b94-8bb5-8ac866f67937","Path":"C:\\ProgramData\\docker\\windowsfilter\\390e650a559fcbf01ea9b6a2c0ab9c995e40c5bc3649ef7be740fc141d501baf"},{"ID":"aace227d-c83e-5055-9c4a-75cc06886df6","Path":"C:\\ProgramData\\docker\\windowsfilter\\80ae038a6f851f11826739f7c4eab1b25554a93d6294ad507429fd3ec2bea6cc"},{"ID":"4c612d25-6193-576e-b50c-4f2a075b4061","Path":"C:\\ProgramData\\docker\\windowsfilter\\e2c8aa54c24a31681076052d879682b5eb69c1b39b2a4455ffee9eedd6a9d4ac"}],"HostName":"Da001B53","MappedDirectories":[{"HostPath":"c:\\users\\runner\\appdata\\local\\temp\\66492a4b73304872927afce22db1de83","ContainerPath":"c:\\commands","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\scripts\\helix-scripts-no-deps","ContainerPath":"c:\\helix\\scripts","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\cores","ContainerPath":"c:\\helix\\cores","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\logs","ContainerPath":"c:\\helix\\logs","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUt
ilityVM":false},{"HostPath":"c:\\h\\config","ContainerPath":"c:\\helix\\config","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\a56b0923\\p","ContainerPath":"c:\\helix\\work\\correlation","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\a56b0923\\w\\b7140a28\\u","ContainerPath":"c:\\helix\\work\\workitem\\u","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\a56b0923\\w\\b7140a28\\e","ContainerPath":"c:\\helix\\work\\workitem\\e","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\a56b0923\\w\\b7140a28\\uploads","ContainerPath":"c:\\helix\\work\\workitem\\uploads","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false}],"HvPartition":false,"EndpointList":["5640acae-c836-4c9d-9308-33ce846942cf"],"AllowUnqualifiedDNSQuery":true})")
2022-04-06T22:48:20.060Z	ERROR  	dockerhelper(234)	run	Docker may not be configured correctly on this machine.  Contact dnceng for help.
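
For anyone triaging similar runs, here is a minimal sketch of how this failure could be spotted in a run_client.py log. The function name `find_os_mismatch` is hypothetical; only the error text is taken from the log above. As an assumption, it also reads the `HvPartition` flag from the extra info, which appears to indicate whether Hyper-V isolation was requested for the container.

```python
import re

# Error text copied from the dockerhelper log; everything else is illustrative.
MISMATCH = re.compile(
    r"CreateComputeSystem (?P<container>[0-9a-f]{64}): "
    r"The container operating system does not match the host operating system\."
)
HV_PARTITION = re.compile(r'"HvPartition":(true|false)')


def find_os_mismatch(log_text: str):
    """Return (container_id, hv_partition) for the first mismatch error, or None.

    hv_partition is True/False when the flag follows the error in the log,
    and None when the flag is not present.
    """
    match = MISMATCH.search(log_text)
    if match is None:
        return None
    # Only look for the flag after the error message itself.
    flag = HV_PARTITION.search(log_text, match.end())
    hv_partition = None if flag is None else flag.group(1) == "true"
    return match.group("container"), hv_partition
```

Run against the log above, this would report the container ID together with `False` for `HvPartition`.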

@MattGal @danmoseley @ilyas1974

@carlossanlop carlossanlop added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Apr 8, 2022
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Apr 8, 2022
@ghost

ghost commented Apr 8, 2022

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.


@ilyas1974

To keep you in the loop, this is what I found - "It depends on the isolation type being used. On Windows Server, process isolation is used by default. With process isolation, the host and container must be the same version. With Hyper-V isolation, the container can be the same version or a lower version than the host. This is described at https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility with some helpful tables."

I've created dotnet/arcade#8981 to investigate
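
The isolation rule quoted above can be sketched as a small check. This is only an illustration of the quoted sentence, not how Docker or Helix decides anything; the names `Isolation` and `container_can_start` are hypothetical, and the build numbers in the usage lines are illustrative values for a 2004 image and a Server 2022 host. It does not model any exceptions listed in the linked compatibility tables.

```python
from enum import Enum


class Isolation(Enum):
    PROCESS = "process"
    HYPERV = "hyperv"


def container_can_start(host_build: int, container_build: int,
                        isolation: Isolation) -> bool:
    """Apply the quoted rule to a host/container OS build pair."""
    if isolation is Isolation.PROCESS:
        # Process isolation: host and container must be the same version.
        return container_build == host_build
    # Hyper-V isolation: the container may be the same version or lower.
    return container_build <= host_build


# Illustrative builds: 19041 for a 2004 image, 20348 for a Server 2022 host.
print(container_can_start(20348, 19041, Isolation.PROCESS))  # False
print(container_can_start(20348, 19041, Isolation.HYPERV))   # True
```

Under this reading, a 2004 image on a Server 2022 queue would be rejected with process isolation, which matches the error in the original post.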

@carlossanlop
Member Author

Continues affecting the runtime-extra-platforms rolling builds from 2022/04/08:

308 failures: https://dev.azure.com/dnceng/public/_build/results?buildId=1706951&view=results
310 failures: https://dev.azure.com/dnceng/public/_build/results?buildId=1705858&view=results

@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Apr 8, 2022
@carlossanlop
Member Author

Continues happening on runtime-extra-platforms consistently:

@AaronRobinsonMSFT
Member

Seems to be occurring on #68615 regularly.

@elinor-fung
Member

Note that #68615 failures were not on Windows Server 2022:

Job        : b493b6c6-de4b-452c-95a1-8639afa964e1
QueueAlias : windows.nano.1809.amd64.open
QueueName  : windows.10.amd64.serverrs5.open.rt
DockerTag  : mcr.microsoft.com/dotnet-buildtools/prereqs:nanoserver-1809-helix-amd64-08e8e40-20200107182504

@MattGal
Member

MattGal commented Apr 28, 2022

Note that #68615 failures were not on Windows Server 2022:

Job        : b493b6c6-de4b-452c-95a1-8639afa964e1
QueueAlias : windows.nano.1809.amd64.open
QueueName  : windows.10.amd64.serverrs5.open.rt
DockerTag  : mcr.microsoft.com/dotnet-buildtools/prereqs:nanoserver-1809-helix-amd64-08e8e40-20200107182504

I think we have a bug from the other attempted fix; I'm trying some stuff and will reply back shortly.

@MattGal
Member

MattGal commented May 2, 2022

Rolling back the change that added hyper-v isolation calls seems to have unblocked this.

@jakobbotsch
Member

Closing as per above.

@elinor-fung
Member

Rolling back the change that added hyper-v isolation calls seems to have unblocked this.

@MattGal Is the original issue that the hyper-v isolation was trying to fix expected to still be occurring? I'm having trouble tracking if it was fixed again after the rollback.

A number of runs are erroring with "The container operating system does not match the host operating system."

From a recent test result:

Job        : 6cedab78-e2b9-43b4-831d-700eefabfc6f
QueueAlias : windows.server.core.2004.amd64.open
QueueName  : windows.10.amd64.server20h2.open.svc
DockerTag  : mcr.microsoft.com/dotnet-buildtools/prereqs:windowsservercore-2004-helix-amd64-20200904200251-272704c
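
To compare the queue and the image at a glance, a block like the one above can be parsed with a short sketch. The helper names `parse_job_info` and `image_os_version` are hypothetical, and the assumption that the token right after the first hyphen in the image tag is the OS version is inferred only from the two tags shown in this thread.

```python
import re


def parse_job_info(text: str) -> dict:
    """Parse 'Key : Value' lines such as Job, QueueAlias, QueueName, DockerTag."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so the colon in the image tag survives.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def image_os_version(docker_tag: str):
    """Return the version token from a tag like 'repo:windowsservercore-2004-helix-...'.

    Assumption: the token after the first hyphen in the tag is the OS version.
    Returns None when the tag does not have that shape.
    """
    _, _, tag = docker_tag.rpartition(":")
    match = re.match(r"[a-z]+-([0-9a-z]+)-", tag)
    return match.group(1) if match else None
```

For the block above this yields `2004` for the image, which can then be set against the `server20h2` queue name by eye.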

From run_client.py:

ERROR  	dockerhelper(233)	run	API Error attempting to talk to Docker Server:500 Server Error: Internal Server Error ("CreateComputeSystem 06eeca59f558c5d4b8c40863b6cc71f9fcc05543294df95d271488935be4e97a: The container operating system does not match the host operating system.
(extra info: {"SystemType":"Container","Name":"06eeca59f558c5d4b8c40863b6cc71f9fcc05543294df95d271488935be4e97a","Owner":"docker","VolumePath":"\\\\?\\Volume{6f9c7d85-53dd-4f9c-9fab-89179f027c2c}","IgnoreFlushesDuringBoot":true,"LayerFolderPath":"C:\\ProgramData\\docker\\windowsfilter\\06eeca59f558c5d4b8c40863b6cc71f9fcc05543294df95d271488935be4e97a","Layers":[{"ID":"7040f6e6-c65e-5f20-9e6d-f49368ce0963","Path":"C:\\ProgramData\\docker\\windowsfilter\\27579cfbeb584c2161d20520ef914433f9c1ad2e030f7324f072f830b42e39e2"},{"ID":"cc80e49a-214c-5ccd-b7d7-d76a09605290","Path":"C:\\ProgramData\\docker\\windowsfilter\\af62c8bea705bbd2a6ab1379d760c5013fbeaed9faa2d6c1bc2d0d1f195b3f5d"},{"ID":"7ebae5b6-83fd-506d-bd5a-42368d649402","Path":"C:\\ProgramData\\docker\\windowsfilter\\5a9c7fa04646ce82ed34c34c44812dd55feff1f1f89b8ada259f1fa9cbc89411"},{"ID":"7bb0fd93-1b02-5cce-85c6-6e3bc50454fc","Path":"C:\\ProgramData\\docker\\windowsfilter\\c9aafd9ad37fe541de20ad48bf0a50ba61aa81bd3245e3f490d1b7c525dd9703"},{"ID":"3da1fd06-2181-551e-964d-fe24aac6a71d","Path":"C:\\ProgramData\\docker\\windowsfilter\\49a481001e9e2b58b0121e936768bf19a9ddb72f1bfec5da6980c655a5ba773b"},{"ID":"7def8f70-3146-5bb1-be07-4e84e009e5d4","Path":"C:\\ProgramData\\docker\\windowsfilter\\9c412e788fa18cd46a021972babd56739fe85e19aa894924239fabc3ed602a13"}],"HostName":"Da000AR7","MappedDirectories":[{"HostPath":"c:\\users\\runner\\appdata\\local\\temp\\64dd19f6c9534a0889e72471aa52da68","ContainerPath":"c:\\commands","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\scripts\\helix-scripts-no-deps","ContainerPath":"c:\\helix\\scripts","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\logs","ContainerPath":"c:\\helix\\logs","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\cores","ContainerPath":"c:\\helix\\cores","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUt
ilityVM":false},{"HostPath":"c:\\h\\config","ContainerPath":"c:\\helix\\config","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\bc9a0a67\\p","ContainerPath":"c:\\helix\\work\\correlation","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\bc9a0a67\\w\\b38f09a3\\u","ContainerPath":"c:\\helix\\work\\workitem\\u","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\bc9a0a67\\w\\b38f09a3\\e","ContainerPath":"c:\\helix\\work\\workitem\\e","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false},{"HostPath":"c:\\h\\w\\bc9a0a67\\w\\b38f09a3\\uploads","ContainerPath":"c:\\helix\\work\\workitem\\uploads","ReadOnly":false,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false}],"HvPartition":false,"EndpointList":["475ec7b8-32d2-48ac-9929-a2339a9608c9"],"AllowUnqualifiedDNSQuery":true})")

@MattGal
Member

MattGal commented May 16, 2022

@elinor-fung I believe that IS the original issue it was trying to address; let me poke around and see if there's a less universally-breaking way to unblock you.

As a side note, did you know that Windows 20H1 (i.e. 2004) has been end of life since December 2021? I think just removing that might simplify things.

@elinor-fung
Member

As a side note, did you know that Windows 20H1 (i.e. 2004) has been end of life since December 2021? I think just removing that might simplify things.

Oh, that sounds like a thing we should do. @dotnet/runtime-infrastructure is there a specific reason we use this version (for libraries-coreclr outerloop and for runtime-extra-platforms)? Or should we be moving to some newer Windows Server Core?

@MattGal
Member

MattGal commented May 17, 2022

@elinor-fung I verified that Windows.10.Amd64.Server20H2/Windows.10.Amd64.Server20H2.Open works for your image. Note that this is only a short-term fix; 20H2 itself is end-of-life at the end of July 2022, so please pursue updating the image you use to work on a supported Server version.

@jakobbotsch
Member

Sorry, I think I misunderstood what @MattGal was saying above, and I can see that this issue is not actually fixed on runtime-extra-platforms. All win-x64 runs there are failing with the issue in the original post.

We also have the runtime-libraries-coreclr outerloop-windows pipeline that suffers from the same problem in its win-x64 jobs.

Oh, that sounds like a thing we should do. @dotnet/runtime-infrastructure is there a specific reason we use this version (for libraries-coreclr outerloop and for runtime-extra-platforms)? Or should we be moving to some newer Windows Server Core?

Based on the discussions above it seems to me that we should move to a newer OS.

@jakobbotsch jakobbotsch reopened this May 17, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label May 17, 2022
@agocke
Member

agocke commented May 17, 2022

Yup, moving to a newer OS sounds good. I don't know of any blockers for this, so we should go ahead and do it ASAP. I've started trying to build a list of the configurations we run on in https://github.com/dotnet/runtime/blob/main/docs/infra/test-configurations.md, so we can easily audit (and note any reasons why we're on that image, if necessary).

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label May 18, 2022
@premun
Member

premun commented May 18, 2022

I've started trying to build a list of the configurations we run on in https://github.com/dotnet/runtime/blob/main/docs/infra/test-configurations.md, so we can easily audit (and note any reasons why we're on that image, if necessary).

@ilyas1974 is this maybe something that Matrix of Truth could help with?

@ilyas1974

While I don't think this would be something we add to the Matrix of Truth, I think it is something that we want to include in https://github.com/dotnet/core-eng/issues/6708.

@mkhamoyan
Contributor

The issue continues to happen consistently on runtime-extra-platforms in all builds (last one here).

From the logs I can see "ERROR dockerhelper(233) run API Error attempting to talk to Docker Server:500 Server Error: Internal Server Error ("CreateComputeSystem 3eabe1175e89e82cc6ee266148d8b4c2d2281ef651a0bc2082e1200f573c43fd: The container operating system does not match the host operating system."

@ghost ghost removed untriaged New issue has not been triaged by the area owner in-pr There is an active PR which will close this issue when it is merged labels May 25, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jun 24, 2022
9 participants