Disable failing jobs in PR testing. #1283

Merged
sandreenko merged 2 commits into dotnet:master from makePRTestingVisiableGreen
Jan 6, 2020

sandreenko
Contributor

Disable PR testing jobs that are failing.
Revert when #1097 and #1282 are fixed.

Question: how can we match yml names to ADO names nowadays? I guess that "Checked JIT test executions" corresponds to "CoreCLR Pri0 Test Run checked", but there are no common words between them.

@sandreenko sandreenko added the api-approved, api-candidate-for-standard, disabled-test, and area-Infrastructure-coreclr labels Jan 3, 2020
@BruceForstall BruceForstall requested a review from trylek January 3, 2020 19:55
@BruceForstall
Member

@sandreenko what are all the api- tags for?

@sandreenko sandreenko removed the api-approved and api-candidate-for-standard labels Jan 3, 2020
@sandreenko
Contributor Author

> @sandreenko what are all the api- tags for?

Thanks for catching this. That was probably a misclick; I do not remember adding them. Sometimes GitHub does strange things when you use the browser's "go back" while opening a PR/issue.

@@ -171,7 +171,7 @@ jobs:
   jobTemplate: /eng/pipelines/coreclr/templates/crossgen-comparison-job.yml
   buildConfig: checked
   platforms:
-  - Linux_arm
+  # - Linux_arm Return this when https://github.com/dotnet/runtime/issues/1282 is fixed.
Contributor

I would suggest increasing the timeout, instead of disabling the submission entirely.

Contributor Author

Done, thanks

@sandreenko
Contributor Author

That did not help. If we are talking about a temporary solution, can we disable the whole coreclr/pr pipeline?
We would lose the crossgen comparison and a few platforms, but it would take load off our CI and we would see fewer timeouts.

Member

@trylek trylek left a comment

Given the current state of affairs, I suggest merging the first version of your PR, in other words disabling the two offending PR tests (Windows ARM32 and Crossgen comparison), since upping the timeout for the Crossgen comparison test as suggested by @jashook doesn't seem to be helping. I'll follow up on Monday with Ilya, based on Jarret's comments about the ARM queue, and we can re-enable the tests once the queues have been configured properly. Thanks for making this change!

@jashook
Contributor

jashook commented Jan 4, 2020

The log looks very strange.

2020-01-04T01:18:11.8002260Z ##[section]Starting: Run native crossgen and compare output to baseline (Unix)
2020-01-04T01:18:11.8005652Z ==============================================================================
2020-01-04T01:18:11.8006032Z Task         : Command line
2020-01-04T01:18:11.8006239Z Description  : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
2020-01-04T01:18:11.8006475Z Version      : 2.151.2
2020-01-04T01:18:11.8006938Z Author       : Microsoft Corporation
2020-01-04T01:18:11.8007417Z Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
2020-01-04T01:18:11.8007919Z ==============================================================================
2020-01-04T01:18:12.3141973Z Generating script.
2020-01-04T01:18:12.3167231Z Script contents:
2020-01-04T01:18:12.3169187Z $BUILD_SOURCESDIRECTORY/eng/common/msbuild.sh $BUILD_SOURCESDIRECTORY/eng/common/helixpublish.proj /restore /t:Test /bl:$BUILD_SOURCESDIRECTORY/artifacts/log/$BuildConfig/SendToHelix.binlog
2020-01-04T01:18:12.3172837Z ========================== Starting Command Output ===========================
2020-01-04T01:18:12.3224098Z [command]/bin/bash --noprofile --norc /__w/_temp/d1759f07-ddfb-42ac-bb7f-7a16135b254b.sh
2020-01-04T01:18:12.7394689Z /__w/7/s/.dotnet/sdk/5.0.100-alpha1-015772/MSBuild.dll /nologo -distributedlogger:Microsoft.DotNet.Tools.MSBuild.MSBuildLogger,/__w/7/s/.dotnet/sdk/5.0.100-alpha1-015772/dotnet.dll*Microsoft.DotNet.Tools.MSBuild.MSBuildForwardingLogger,/__w/7/s/.dotnet/sdk/5.0.100-alpha1-015772/dotnet.dll -maxcpucount /m -verbosity:m /v:minimal /bl:/__w/7/s/artifacts/log/Checked/SendToHelix.binlog /clp:Summary /nr:true /p:TreatWarningsAsErrors=true /p:ContinuousIntegrationBuild=false /restore /t:Test /warnaserror /__w/7/s/eng/common/helixpublish.proj
2020-01-04T01:18:15.3634818Z   Restore completed in 424.23 ms for /__w/7/s/eng/common/helixpublish.proj.
2020-01-04T01:18:15.4068408Z   Restore completed in 2.37 ms for /__w/7/s/eng/common/helixpublish.proj.
2020-01-04T01:18:15.5149699Z   Starting Azure Pipelines Test Run (Ubuntu.1804.Arm32.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm32v7-30f6673-20190814153226
2020-01-04T01:18:15.8181441Z   Uploading payloads for Job on (Ubuntu.1804.Arm32.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm32v7-30f6673-20190814153226...
2020-01-04T01:18:15.8274473Z   Finished uploading payloads for Job on (Ubuntu.1804.Arm32.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm32v7-30f6673-20190814153226...
2020-01-04T01:18:15.8280059Z   Sending Job to (Ubuntu.1804.Arm32.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm32v7-30f6673-20190814153226...
2020-01-04T01:20:46.2552218Z   Sent Helix Job 795e00ec-e3a7-4a15-81dc-1143c2857390
2020-01-04T01:20:47.2645401Z   Waiting for completion of job 795e00ec-e3a7-4a15-81dc-1143c2857390
2020-01-04T03:04:41.7090810Z   Job 795e00ec-e3a7-4a15-81dc-1143c2857390 is completed with 2 finished work items.
2020-01-04T03:04:42.0258919Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error : The response contained an invalid status code 404 Not Found [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.0261544Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error :  [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.0262774Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error : Body:  [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.2265999Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/azure-pipelines/AzurePipelines.MultiQueue.targets(30,5): error : Value cannot be null. (Parameter 'source') [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.2274816Z   Stopping Azure Pipelines Test Run (Ubuntu.1804.Arm32.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm32v7-30f6673-20190814153226
2020-01-04T03:04:42.3550828Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(75,5): error : Work item 795e00ec-e3a7-4a15-81dc-1143c2857390/WorkItem in job 795e00ec-e3a7-4a15-81dc-1143c2857390 has failed. [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3558444Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(75,5): error : Failure log: https://helix.dot.net/api/2019-06-17/jobs/795e00ec-e3a7-4a15-81dc-1143c2857390/workitems/WorkItem/console . [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3591800Z 
2020-01-04T03:04:42.3596411Z Build FAILED.
2020-01-04T03:04:42.3597120Z 
2020-01-04T03:04:42.3598598Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error : The response contained an invalid status code 404 Not Found [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3600475Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error :  [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3602126Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error : Body:  [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3603922Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/azure-pipelines/AzurePipelines.MultiQueue.targets(30,5): error : Value cannot be null. (Parameter 'source') [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3606554Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(75,5): error : Work item 795e00ec-e3a7-4a15-81dc-1143c2857390/WorkItem in job 795e00ec-e3a7-4a15-81dc-1143c2857390 has failed. [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3608409Z /home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(75,5): error : Failure log: https://helix.dot.net/api/2019-06-17/jobs/795e00ec-e3a7-4a15-81dc-1143c2857390/workitems/WorkItem/console . [/__w/7/s/eng/common/helixpublish.proj]
2020-01-04T03:04:42.3609249Z     0 Warning(s)
2020-01-04T03:04:42.3609853Z     4 Error(s)
2020-01-04T03:04:42.3610169Z 
2020-01-04T03:04:42.3610503Z Time Elapsed 01:46:29.55
2020-01-04T03:04:42.3883661Z Build failed (exit code '1').
2020-01-04T03:04:42.4019268Z ##[error]Bash exited with code '1'.
2020-01-04T03:04:42.4426946Z ##[section]Finishing: Run native crossgen and compare output to baseline (Unix)
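
The "Build FAILED." summary in the log above repeats each diagnostic, which makes it harder to see how many distinct failures there are. The following is a minimal Python sketch for collapsing the repeats and pulling out the Helix console link. It assumes the `file(line,col): error : message [project]` layout seen in this log; the function names are illustrative and not part of any Helix or Arcade tooling.

```python
import re

# MSBuild-style diagnostic as it appears in this log (an assumption about the layout):
#   [timestamp] <file>(<line>,<col>): error : <message> [<project>]
ERROR_LINE = re.compile(
    r"^(?:\S+Z\s+)?(?P<file>\S+)\((?P<line>\d+),(?P<col>\d+)\): error : "
    r"(?P<msg>.*?)\s*\[(?P<project>[^\]]+)\]\s*$"
)
URL = re.compile(r"https?://\S+")


def distinct_errors(lines):
    """Return (targets file, line number, message) tuples in first-seen order, without repeats."""
    seen = []
    for raw in lines:
        match = ERROR_LINE.match(raw)
        if match is None:
            continue
        message = match.group("msg")
        if not message:
            continue  # blank continuation line of a multi-line error
        entry = (match.group("file").rsplit("/", 1)[-1], int(match.group("line")), message)
        if entry not in seen:
            seen.append(entry)
    return seen


def failure_log_urls(lines):
    """Collect any URLs mentioned in the distinct error messages."""
    return [url for _, _, msg in distinct_errors(lines) for url in URL.findall(msg)]


# Lines copied from the log above; the first one appears twice, the way the
# summary section repeats it.
SDK = "/home/helixbot_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/5.0.0-beta.19617.1/tools/"
PROJ = "[/__w/7/s/eng/common/helixpublish.proj]"
SAMPLE = [
    f"2020-01-04T03:04:42.0258919Z {SDK}Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error : The response contained an invalid status code 404 Not Found {PROJ}",
    f"2020-01-04T03:04:42.2265999Z {SDK}azure-pipelines/AzurePipelines.MultiQueue.targets(30,5): error : Value cannot be null. (Parameter 'source') {PROJ}",
    f"2020-01-04T03:04:42.3558444Z {SDK}Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(75,5): error : Failure log: https://helix.dot.net/api/2019-06-17/jobs/795e00ec-e3a7-4a15-81dc-1143c2857390/workitems/WorkItem/console . {PROJ}",
    f"2020-01-04T03:04:42.3598598Z {SDK}Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(59,5): error : The response contained an invalid status code 404 Not Found {PROJ}",
]

for name, line_no, message in distinct_errors(SAMPLE):
    print(f"{name}:{line_no}: {message}")
print(failure_log_urls(SAMPLE))
```

Run on the sample, this leaves three distinct diagnostics and one console URL, which is the link to open first when a Helix work item is reported as failed.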

@jashook
Contributor

jashook commented Jan 4, 2020

Feel free to merge disabling the job to get to a good state, but this looks like it should be investigated.

@trylek
Member

trylek commented Jan 4, 2020

@jashook - I was originally under the impression that this is a secondary effect of the Helix run timing out, so that the result scan fails to find its output data. If this assumption is incorrect, we definitely need to figure out what's going on. Actually, even if my understanding is correct, we should improve the logging to make the cause more obvious. I completely agree this is pretty cryptic and hard to act on.

@jashook
Contributor

jashook commented Jan 4, 2020

I do not disagree with it being cryptic, but the thing that has me confused is that it looks like the jobs finished. See:

2020-01-04T01:20:47.2645401Z Waiting for completion of job 795e00ec-e3a7-4a15-81dc-1143c2857390
2020-01-04T03:04:41.7090810Z Job 795e00ec-e3a7-4a15-81dc-1143c2857390 is completed with 2 finished work items.
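
To put a number on the gap between those two lines, here is a small Python sketch that parses the Azure Pipelines timestamps and subtracts them. The helper names `parse_line` and `observed_wait` are hypothetical, and the message matching is an assumption based only on the wording of the two lines quoted here.

```python
import re
from datetime import datetime, timedelta

# Timestamps in this log carry seven fractional digits; datetime stores six,
# so the seventh digit is dropped when parsing.
STAMPED = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d{6})\d*Z\s+(.*)$")


def parse_line(line):
    """Split a log line into (UTC datetime, message), or return None if it has no timestamp."""
    match = STAMPED.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(f"{match.group(1)}.{match.group(2)}", "%Y-%m-%dT%H:%M:%S.%f")
    return stamp, match.group(3)


def observed_wait(lines, job_id):
    """Time between the 'Waiting for completion' and 'is completed' lines for job_id."""
    started = finished = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if job_id not in message:
            continue
        if message.startswith("Waiting for completion of job"):
            started = stamp
        elif "is completed with" in message:
            finished = stamp
    if started is None or finished is None:
        return None
    return finished - started


JOB = "795e00ec-e3a7-4a15-81dc-1143c2857390"
QUOTED = [
    "2020-01-04T01:20:47.2645401Z Waiting for completion of job 795e00ec-e3a7-4a15-81dc-1143c2857390",
    "2020-01-04T03:04:41.7090810Z Job 795e00ec-e3a7-4a15-81dc-1143c2857390 is completed with 2 finished work items.",
]
print(observed_wait(QUOTED, JOB))  # 1:43:54.444541
```

That is roughly 1 h 44 min of the 01:46:29.55 reported as "Time Elapsed". It only measures when the submitting step saw the job as completed, not when each work item actually ended on the Helix machine.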

@jashook
Contributor

jashook commented Jan 4, 2020

I just do not know when they actually finished, or how long it took to do whatever it was doing after that point.

@sandreenko sandreenko force-pushed the makePRTestingVisiableGreen branch from ec87b06 to 8211f08 Compare January 6, 2020 02:05
@sandreenko sandreenko merged commit 74fa5d9 into dotnet:master Jan 6, 2020
@sandreenko sandreenko deleted the makePRTestingVisiableGreen branch January 6, 2020 02:06
@@ -111,7 +111,7 @@ jobs:
   platforms:
   - OSX_x64
   - Linux_x64
-  - Linux_arm
+  # - Linux_arm return this when https://github.com/dotnet/runtime/issues/1097 is fixed.
Contributor

@jashook jashook Jan 6, 2020

Why was this commented out? The comment beside it seems to be for Windows arm, but this is Linux arm.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this pull request Dec 9, 2021
Fixes dotnet#1283.

Also, small improvement wrt warnings and IsGenericType.
Labels: area-Infrastructure-coreclr, disabled-test