[CI] Jenkins is down #5061

Closed
hcho3 opened this issue Nov 23, 2019 · 8 comments
Comments

@hcho3
Collaborator

hcho3 commented Nov 23, 2019

The Jenkins master node (xgboost-ci.net) has been down (*) for a few days, according to @trivialfis. Impact: a substantial part of the test suite is unavailable. I'm rebuilding the master from scratch. Sorry for the inconvenience. @dmlc/xgboost-committer

PS. This seems to be a good opportunity to document all steps it takes to set up the Jenkins master node from scratch. See hcho3/xgboost-devops#6. Also #4958

(*) I can neither ping nor SSH into the master instance.

@trivialfis
Member

Travis also seems to be down, as there is a permission error. I will try to resolve it later.

@trivialfis
Member

trivialfis commented Nov 24, 2019

Travis is fixed in #5062. The disabled test should be related to #5063. As in our previous discussion on the threaded iterator, I plan to make that fix later.

@hcho3
Collaborator Author

hcho3 commented Nov 24, 2019

@trivialfis Thanks for fixing Travis! I'll get Jenkins back up very soon.

@hcho3
Collaborator Author

hcho3 commented Nov 26, 2019

Update: Linux pipeline is now back up. I still need to work on the Windows pipeline.

@trivialfis
Member

trivialfis commented Nov 28, 2019

Okay. I messed up a little. Out of curiosity I created a different pipeline with Script Path pointing to Jenkinsfile-win64, but somehow it was still running the pipeline defined in Jenkinsfile. I deleted the new pipeline, as I failed to find a way to get it to work. I also added a worker from the Windows CPU AMI, which has since been deleted.

@hcho3
Collaborator Author

hcho3 commented Nov 28, 2019

@trivialfis I updated hcho3/xgboost-devops#6 to include some details about the Windows pipeline. I'm also closing this issue, as Jenkins is now fully back up.

@hcho3 hcho3 closed this as completed Nov 28, 2019
@hcho3 hcho3 reopened this Nov 29, 2019
@hcho3
Collaborator Author

hcho3 commented Nov 29, 2019

Sorry, looks like I called it too early. The Windows pipeline is still broken. It looks like I have to rebuild the AMIs for all Windows workers. (There are 5 AMIs.)

@hcho3
Collaborator Author

hcho3 commented Dec 2, 2019

Fixed in #5078 and hcho3/xgboost-devops#8

@lock lock bot locked as resolved and limited conversation to collaborators Mar 10, 2020