Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT rolling build can build and upload non-JIT changes #64392

Closed
BruceForstall opened this issue Jan 27, 2022 · 5 comments · Fixed by #64480
Closed

JIT rolling build can build and upload non-JIT changes #64392

BruceForstall opened this issue Jan 27, 2022 · 5 comments · Fixed by #64480
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@BruceForstall
Copy link
Member

BruceForstall commented Jan 27, 2022

As described here:

The jitrollingbuild pipeline is set to build every time the JIT changes, and not batch changes:

trigger:
  batch: false
  branches:
    include:
    - main
  paths:
    include:
    - src/coreclr/jit/*
    - src/coreclr/inc/jiteeversionguid.h

However, it appears that AzDO doesn't kick off the build immediately; I noticed a delay of about 10 minutes for this particular change. In a few previous cases for JIT changes, there were also delays before the AzDO job kicked off, and in those cases, another change was merged before the kickoff, and AzDO built that change, not the JIT change that kicked off the job. Because of this, when we upload the JIT to Azure Storage, we use a git hash of a non-JIT change, and when superpmi.py looks for a baseline JIT, it doesn't find the hash of the JIT that should have been built.

Summary: there is an assumption that the pipeline builds every JIT change at the git hash of the JIT change. That doesn't appear to be the case with AzDO. I'm not sure if there's a config to force that to happen. E.g., what if 2 JIT changes were merged at almost exactly the same time? AzDO should build both of them, but does it?

Possibly we could change the jitrollingbuild.py upload script to walk back the git log for the built change and use the first hash that actually has JIT changes in it, to work around this problem.

@BruceForstall BruceForstall added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 27, 2022
@BruceForstall BruceForstall added this to the 7.0.0 milestone Jan 27, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 27, 2022
@ghost
Copy link

ghost commented Jan 27, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

As described here:

The jitrollingbuild pipeline is set to build every time the JIT changes, and not batch changes:

trigger:
  batch: false
  branches:
    include:
    - main
  paths:
    include:
    - src/coreclr/jit/*
    - src/coreclr/inc/jiteeversionguid.h

However, it appears that AzDO doesn't kick off the build immediately; I noticed a delay of about 10 minutes for this particular change. In a few previous cases for JIT changes, there were also delays before the AzDO job kicked off, and in those cases, another change was merged before the kickoff, and AzDO built that change, not the JIT change that kicked off the job. Because of this, when we upload the JIT to Azure Storage, we use a git hash of a non-JIT change, and when superpmi.py looks for a baseline JIT, it doesn't find the hash of the JIT that should have been built.

Summary: there is an assumption that the pipeline builds every JIT change at the git hash of the JIT change. That doesn't appear to be the case with AzDO. I'm not sure if there's a config to force that to happen. E.g., what if 2 JIT changes were merged at almost exactly the same time? AzDO should build both of them, but does it?

Possibly we could change the jitrollingbuild.py upload script to walk back the git log for the built change and use the first hash that actually has JIT changes in it, to work around this problem.

Author: BruceForstall
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: 7.0.0

@BruceForstall BruceForstall removed the untriaged New issue has not been triaged by the area owner label Jan 27, 2022
@BruceForstall BruceForstall self-assigned this Jan 27, 2022
@BruceForstall
Copy link
Member Author

@MattGal @hoyosjs @safern Do any of you understand how AzDO triggering works? In particular, why it triggers a build on a git hash after the thing I triggered.

BruceForstall added a commit to BruceForstall/runtime that referenced this issue Jan 29, 2022
When uploading a JIT rolling build, use the git hash of the most recent
JIT change, not the git hash that was actually built. This handles the case
where a JIT change kicked off an AzDO pipeline, but the pipeline didn't start
until after one or more additional changes were merged.

The AzDO pipelines appear to fetch with `-depth=20`, which should provide
enough history for almost any case.

Fixes dotnet#64392
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 29, 2022
BruceForstall added a commit that referenced this issue Jan 29, 2022
When uploading a JIT rolling build, use the git hash of the most recent
JIT change, not the git hash that was actually built. This handles the case
where a JIT change kicked off an AzDO pipeline, but the pipeline didn't start
until after one or more additional changes were merged.

The AzDO pipelines appear to fetch with `-depth=20`, which should provide
enough history for almost any case.

Fixes #64392
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 29, 2022
@MattGal
Copy link
Member

MattGal commented Jan 31, 2022

@MattGal @hoyosjs @safern Do any of you understand how AzDO triggering works? In particular, why it triggers a build on a git hash after the thing I triggered.

Apologies, I started parsing that sentence and got lost. In most cases though I believe it's just a webhook or equivalent from GH to AzDO, which sometimes can fail to be delivered, get multi-delivered, or be delayed. Can you give examples on the build that got triggered that you're curious about?

@BruceForstall
Copy link
Member Author

One example was:

https://dev.azure.com/dnceng/internal/_build/results?buildId=1577232&view=results

The triggering merge was https://dev.azure.com/dnceng/internal/_git/dotnet-runtime/commit/d55a1dfa4e39a26a8ea8d161f733c72805f31f7f at 4:18pm.

There was a closely following merge https://dev.azure.com/dnceng/internal/_git/dotnet-runtime/commit/9eb90511ff6d079fa02832936c693d36289fc02b at 4:25pm.

The pipeline didn't start running until 4:36pm, and picked up the change at 4:25pm.

We had code that depended on the git hash being built by the pipeline being the same as the triggering merge, but this wasn't true.

I added some compensating code in #64480 that will look backwards in the git history on the build to find the triggering change (specifically, to find the most recent change to the JIT source directory; this needs to match the YML file "trigger" path to be fully correct).

@MattGal
Copy link
Member

MattGal commented Jan 31, 2022

This surely does seem like a bug in the way non-batched pipeline triggers are working, so I started an email thread with the pipelines PM and I'll update back here if anything interesting falls out of that conversation.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants