All maven-11 agents are unavailable #3096

Closed
NotMyFault opened this issue Aug 11, 2022 · 5 comments
Comments

@NotMyFault
Member

NotMyFault commented Aug 11, 2022

Service(s)

ci.jenkins.io

Summary

Hey,

it appears that all maven-11 agents are either offline or unresponsive. The check-agent-availability job hasn't reported anything since noon, and jobs bound to maven-11 agents, like the RPU, have been stuck in the queue for almost 2h: https://ci.jenkins.io/job/Infra/job/repository-permissions-updater/job/PR-2714/1/console.

For reference, highmem nodes and windows nodes execute builds fine.

Reproduction steps

No response

@NotMyFault NotMyFault added the triage Incoming issues that need review label Aug 11, 2022
@dduportal dduportal added this to the infra-team-sync-2022-08-16 milestone Aug 11, 2022
@dduportal dduportal removed the triage Incoming issues that need review label Aug 11, 2022
@dduportal dduportal self-assigned this Aug 11, 2022
@dduportal
Contributor

Thanks @NotMyFault. This could be a side effect of #3090 (comment); I'm checking.

@dduportal
Contributor

I confirm that I see the maven-11 pods being started and failing immediately.

  • Disabling puppet on ci.jenkins.io
  • Enabling retention for maven-11 pods on cik8s

=> now debugging.

@dduportal
Contributor

This is definitely a side effect of #3090.

The error is /usr/local/bin/jenkins-agent: 121: exec: /opt/jdk-11/bin/java: not found.

Applying a hotfix to handle the build queue
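
For readers hitting the same "not found" error: it means the launcher tried to exec a Java path that does not exist in the image. A minimal sketch of a more forgiving lookup is below. It is not the actual jenkins-agent script (which is a shell script); `resolve_java_bin` is a hypothetical helper, and the fallback to `java` on PATH is an assumption about what a safe default could look like.

```python
import os
import shutil


def resolve_java_bin(env, which=shutil.which):
    """Return a usable Java executable path, or None if none is found.

    Prefers JENKINS_JAVA_BIN when it points at an executable file;
    otherwise falls back to looking up `java` on the PATH from `env`.
    """
    configured = env.get("JENKINS_JAVA_BIN")
    if configured and os.path.isfile(configured) and os.access(configured, os.X_OK):
        return configured
    # The configured path is missing or not executable: fall back to PATH
    # instead of failing on exec.
    return which("java", path=env.get("PATH"))
```

Calling `resolve_java_bin(dict(os.environ))` returns the configured binary when it exists, the PATH lookup result otherwise, and `None` when neither is available, so the caller can fail with a clearer message.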

@dduportal
Contributor

Had to monkey-patch the JCasC YAML on ci.jenkins.io to be safe until tomorrow, when a proper fix should land:

  • Removed the JENKINS_JAVA_BIN variable from the maven-11 container templates (cik8s, doks and windows aci)
  • Disabled + stopped puppet with a link to this issue
  • Reloaded JCasC

=> builds are flowing now.
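
The variable removal can be sketched in a few lines. This is an illustration only: the hotfix itself was a manual edit of the YAML, and the dictionary layout here (`containers`, `envVars`, `key`, `value`) is a simplified, assumed stand-in for the real JCasC schema. `drop_env_var`, the `jnlp` container name and the example values are hypothetical.

```python
import copy


def drop_env_var(templates, name):
    """Return a copy of `templates` with env var `name` removed from every container."""
    cleaned = copy.deepcopy(templates)  # leave the caller's data untouched
    for template in cleaned:
        for container in template.get("containers", []):
            container["envVars"] = [
                var for var in container.get("envVars", []) if var.get("key") != name
            ]
    return cleaned


# Hypothetical, simplified template data (not the real ci.jenkins.io configuration).
templates = [
    {
        "name": "maven-11",
        "containers": [
            {
                "name": "jnlp",
                "envVars": [
                    {"key": "JENKINS_JAVA_BIN", "value": "/opt/jdk-11/bin/java"},
                    {"key": "PATH", "value": "/usr/local/bin:/usr/bin:/bin"},
                ],
            }
        ],
    }
]

patched = drop_env_var(templates, "JENKINS_JAVA_BIN")
```

With the variable gone, the agent presumably falls back to the image's default Java, which matches the "default java installation" wording in the follow-up commits.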

Thanks @NotMyFault for raising the issue

dduportal added a commit to dduportal/jenkins-infra that referenced this issue Aug 12, 2022
…lt java installation.

Related to jenkins-infra/helpdesk#3096

Signed-off-by: Damien Duportal <damien.duportal@gmail.com>
dduportal added a commit to jenkins-infra/jenkins-infra that referenced this issue Aug 12, 2022
…lt java installation. (#2322)

Related to jenkins-infra/helpdesk#3096

Signed-off-by: Damien Duportal <damien.duportal@gmail.com>

@dduportal
Contributor

  • Configuration persisted on ci.jenkins.io as code (see attached PRs)
  • Tested both Linux and Windows containers

=> looks good!
