Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable PR-merge mode everywhere #3474

Closed
jglick opened this issue Mar 28, 2023 · 43 comments
Closed

Disable PR-merge mode everywhere #3474

jglick opened this issue Mar 28, 2023 · 43 comments

Comments

@jglick
Copy link

jglick commented Mar 28, 2023

Service(s)

ci.jenkins.io

Summary

According to #1355 (comment) the org folder config for Plugins still sets PR-merge mode (build the head of the PR merged with the current base branch revision). This is no longer the default (jenkinsci/github-branch-source-plugin#636) for various reasons, including extra load on the CI system which could be quite significant for us. (For example, as of this writing https://ci.jenkins.io/job/Plugins/job/blueocean-plugin/view/change-requests/job/PR-2248/ has had 90 builds despite jenkinsci/blueocean-plugin#2248 having had only one human interaction, a year ago.)

A plugin maintainer who is afraid of semantic merge conflicts in head can add a branch restriction requesting that PRs be up to date before merging; but much better is to use the beta merge queue system (https://github.com/orgs/community/discussions/46757#discussioncomment-5392779), which seems to be working in https://github.com/jenkinsci/build-token-root-plugin/queue/master at least.

@jglick jglick added the triage Incoming issues that need review label Mar 28, 2023
@timja
Copy link
Member

timja commented Mar 28, 2023

Given we had no complains on Core and Tools I can't see an issue with enabling this on plugins as well?

@timja
Copy link
Member

timja commented Mar 28, 2023

Updated. (Can easily revert if any concerns).

@timja timja closed this as completed Mar 28, 2023
@timja timja self-assigned this Mar 28, 2023
@jglick
Copy link
Author

jglick commented Mar 28, 2023

https://ci.jenkins.io/job/Plugins/job/blueocean-plugin/indexing/console as of this writing still shows PR-merge mode; I wonder if something needs to happen to tickle children into switching config. (Without EXTENDED_READ permission I cannot see for myself what https://ci.jenkins.io/job/Plugins/job/blueocean-plugin/configure looks like.)

@dduportal dduportal added this to the infra-team-sync-2023-04-04 milestone Mar 28, 2023
@dduportal dduportal removed the triage Incoming issues that need review label Mar 28, 2023
@timja
Copy link
Member

timja commented Mar 28, 2023

Possibly something to do with the build queue currently?
Might sort itself out if that runs through?

Blue ocean page you linked:
image

@jglick
Copy link
Author

jglick commented Mar 28, 2023

You might need to trigger a full rescan of https://ci.jenkins.io/job/Plugins/computation/console. I thought that you have been automatic after saving changes to https://ci.jenkins.io/job/Plugins/configure but maybe not? I do not think the build queue is relevant since indexing “runs on” the built-in node and should never be blocked.

@jglick
Copy link
Author

jglick commented Mar 28, 2023

(Recommend reopening until it is actually observed to have taken effect.)

@timja timja reopened this Mar 28, 2023
@timja
Copy link
Member

timja commented Mar 28, 2023

It was scanned at:
[Tue Mar 28 19:37:16 UTC 2023] Consulting GitHub Organization

@jglick
Copy link
Author

jglick commented Mar 28, 2023

https://ci.jenkins.io/job/Plugins/computation/consoleFull does not mention blueocean-plugin. However it does mention e.g. claim-plugin and https://ci.jenkins.io/job/Plugins/job/claim-plugin/indexing/console suggests that it correctly switched branch projects accordingly:

Checking pull request [#218](https://github.com/jenkinsci/claim-plugin/pull/218)
      ‘Jenkinsfile’ found
    Met criteria
Changes detected: PR-218 (2f2b3eb467f28ef704188a99aaf8988830d2b9a2+701cfc3f260a595321f9b550cb9d9402b66bb562 (ee5340acb09b8789f1b4dd028df2eb2bdd726c49) → 2f2b3eb467f28ef704188a99aaf8988830d2b9a2)
Scheduled build for branch: PR-218

It just means there is going to be a one-time build storm like https://ci.jenkins.io/job/Plugins/job/claim-plugin/view/change-requests/job/PR-218/2/ as we switch from PR-merge to PR-head. Not sure why the org folder indexing halted; perhaps Jenkins was shut down while it was still running?

@dduportal
Copy link
Contributor

Not sure why the org folder indexing halted; perhaps Jenkins was shut down while it was still running?

Yes it was, as part of #3459 (comment) alas :|

@jglick
Copy link
Author

jglick commented Mar 29, 2023

Ah, OK. So it should suffice to just retrigger indexing on Plugins. (Typically it also happens automatically on some schedule, but again I cannot even view the configuration here.)

@timja
Copy link
Member

timja commented Mar 29, 2023

I did that this morning and it failed I retriggered 30mins ago

@dduportal
Copy link
Contributor

@timja When you trigger such a compute-consuming task, can you first sync with the infra team before? It would help us anticipate things (I had to restart the controller this morning, I assume we crossed path without knowing)

@dduportal
Copy link
Contributor

(anyway, thanks for taking care of this folks, that will decrease the amount of non valuable builds 👍 )

@dduportal
Copy link
Contributor

Looks like the Plugins Org scanning is putting a lot of pressure to ci.jenkins.io 😅
Pagerduty just warned us about timeouts on ci.jenkins.io homepage. Tooks 3min to load here, and the build queue is full with 1600 elements and increasing 😱

@NotMyFault
Copy link
Member

Do we rebuild all PRs and branches of all repositories in jenkinsci atm?

@jglick
Copy link
Author

jglick commented Mar 29, 2023

Yeah, unfortunately there is this one-time trigger of PR-head builds since the configuration has changed. I am not sure offhand if it is possible to suppress this.

@dduportal
Copy link
Contributor

short term: increased the Azure VM capacity (since EC2 agents are not available until April): jenkins-infra/jenkins-infra#2726. Won't fix easily the issue, but should help.

I'm surprise by the amount of plugin builds requiring Docker.

@jglick
Copy link
Author

jglick commented Mar 29, 2023

You can try temporarily setting Branch names to build automatically to none (i.e., disable) and/or setting Suppression strategy to For matching branches suppress builds triggered by indexing (continue to honor webhooks) until the org scan is complete. I am not sure what happens when you revert the setting however.

@dduportal
Copy link
Contributor

Let's patiently wait for the queue to be treated 🤞 if it is ok for you.

@dduportal
Copy link
Contributor

(Added more capacity: increase the limit of "normal" Linux and Windows machines to 80 each)

The builds is decreasing slowly

@basil
Copy link
Collaborator

basil commented Mar 29, 2023

Yeah, unfortunately […] I am not sure offhand […] I am not sure what happens […]

@jglick Since others are now impacted with a build queue of length 1,618 due to the actions you have taken, should you become sure?

@basil
Copy link
Collaborator

basil commented Mar 29, 2023

I'm surprise by the amount of plugin builds requiring Docker.

@dduportal Why would you be surprised? When container agents were introduced, nobody migrated existing builds to use containers. So in the case where no action was taken, those builds are still running on VMs, whether they need to be (because they are using Docker) or not.

@lemeurherve
Copy link
Member

I'm surprise by the amount of plugin builds requiring Docker.

@dduportal Why would you be surprised? When container agents were introduced, nobody migrated existing builds to use containers. So in the case where no action was taken, those builds are still running on VMs, whether they need to be (because they are using Docker) or not.

About this specific point, I plan to do a batch action to set useContainerAgent to true (and maybe more default values like the jdk version) on each plugin specifying only buildPlugin() in their Jenkinsfile, then fixing those needing Docker. After that, we'll be able to change the default value of useContainerAgent to true in the pipeline library.

@dduportal
Copy link
Contributor

The instance ci.jenkinsis irresponsive. The build queue wasn't able to decrease since the past hour: it's fluctuate between 1550 and 1650 elements.

Sounds like we won't be able to catch up and we are considering purging the build queue. Any objection?

@jglick
Copy link
Author

jglick commented Mar 29, 2023

I can try to set up a demo environment of an org folder to look for ways to suppress builds at the PR-merge → PR-head transition. Due to the way GitHub is licensed this is not simple to automate, unfortunately.

@MarkEWaite
Copy link

No objection from me to purging the build queue. If we don't purge it, we don't get things done (as far as I can tell).

@dduportal
Copy link
Contributor

  • Queue triggered. it's responsive
  • Ran a "Reload from filesystem" to ensure that the build status is checked and syncrhonized with the in-memory state (some builds are marked as "aborted" while still being in the "Built-In Node" as if a pipeline step was still trying to run)

@dduportal
Copy link
Contributor

@dduportal
Copy link
Contributor

  • instance seems clearly better
  • @jglick @timja my understanding is that the change about "PR merge" is applied, but the scan did not finish (since the queue was cleaned).
    • Does that mean that the next scan would have the same behavior? I fear an automatic daily scan.
    • Is there any action to perform (that would re-trigger another scan wether option we choose)?

@jglick
Copy link
Author

jglick commented Mar 29, 2023

Does that mean that the next scan would have the same behavior?

Yes, unless build triggering can be suppressed.

Is there any action to perform

Sorry, I am not following the question.

@jglick
Copy link
Author

jglick commented Mar 29, 2023

I fear an automatic daily scan.

You can of course disable that for now (if it is currently enabled).

@dduportal
Copy link
Contributor

Disabled the (once a week) automatic Scan for the "Plugins" GH Org:

  • Set ci.j in "shutdown" mode
  • Disabled and saved the option => no scan triggerred
  • Removed the "shutdown" mode

@jglick
Copy link
Author

jglick commented Mar 29, 2023

Tried out an analogous scenario in GitHub Enterprise. (Using origin PRs, no forks.) Setting Suppression strategy to For matching branches suppress builds triggered by indexing (continue to honor webhooks) on the org folder does indeed prevent builds from being triggered by saving the org folder configuration after switching from PR-merge to PR-head. (Branch names to build automatically can be left at the default of .*.) Build Now on PR branch projects does subsequently run a PR-head build as expected, and View Configuration on the repository folders does show the updated strategy.

You can also subsequently reset Suppression strategy to the default (suppress nothing) on the org folder and save it without immediately triggering builds. However any subsequent branch indexing on a repository folder will trigger corresponding PR-head builds, so if Child Scan Triggers has these being triggered frequently, you may wish to unset this, or at least ensure that the interval is long enough that we can spread out the load so there is not a sudden build storm. We can simply leave indexing-triggered builds suppressed indefinitely, as this exists just to ensure that code changes eventually get built even if webhooks are lost, but usually developers will anyway push a new empty commit or merge in the base branch or something if they are actually waiting for a build.

@dduportal
Copy link
Contributor

Btw, discover the following setup:

Capture d’écran 2023-03-29 à 18 36 07

I wonder if the For matching branches suppress builds triggered by indexing (continue to honor webhooks) option should be selected as hinted by @jglick earlier?

@dduportal
Copy link
Contributor

Gotta try tomorrow morning (EU time) this new option then. Estimated time: 07:00 UTC.

@jglick
Copy link
Author

jglick commented Mar 29, 2023

(continue to honor webhooks)

Note: I did not attempt to configure a webhook in my demo controller, so I am just presuming that this option does what it says in that regard.

@dduportal
Copy link
Contributor

(continue to honor webhooks)

Note: I did not attempt to configure a webhook in my demo controller, so I am just presuming that this option does what it says in that regard.

Good grief, thanks for emphasing it.

I'm adding the "Open a PR on the jenkins-infra-test plugin to see if the build is caught and taken" check item for tomorrow's operation: would that be covering this element of doubt? (I'm asking to be sure we understand correctly)

@jglick
Copy link
Author

jglick commented Mar 29, 2023

Yes that would certainly be prudent. I can try configuring a webhook and checking that this still works given For matching branches suppress builds triggered by indexing (continue to honor webhooks).

@jglick
Copy link
Author

jglick commented Mar 29, 2023

Confirmed that webhooks are still processed and trigger (PR-head) builds as expected.

@dduportal
Copy link
Contributor

Starting the operation

@dduportal
Copy link
Contributor

  • Setting applied
  • Removed shutdown mode
  • No Scan triggered 👍

@dduportal
Copy link
Contributor

  • Tested: webhook are sent and work as expected 👍
  • Kept the 1 week scan repo of the organization job "Plugins" disable: webhooks will take care of adding/removing repositories and child are scanned once a day

@dduportal
Copy link
Contributor

I believe we can close this issue: feel free to reopen if anything has been missed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants