Almost all the windows machines are DOWN #3435

Closed
1 task
UlisesGascon opened this issue Jul 30, 2023 · 6 comments
Comments

@UlisesGascon
Member

UlisesGascon commented Jul 30, 2023

Overview

This incident started last Friday. It seems that all the Windows test machines are currently down, while some Windows machines in the release CI are still running.

Slack discussion

Release machines

Screenshot 2023-07-30 at 19 33 15

Test machines

Screenshot 2023-07-30 at 19 33 52
Screenshot 2023-07-30 at 19 34 54

Next steps

  • Reboot all the machines
@UlisesGascon
Member Author

The root cause seems to be a connectivity issue. I will check Azure and Rackspace.

Screenshot 2023-07-30 at 19 50 38

@UlisesGascon UlisesGascon self-assigned this Jul 30, 2023
@UlisesGascon
Member Author

I rebooted the machines in Azure and Rackspace. Overall, around 20 machines are back. Most of the machines that are still down are from Rackspace.

Release

Screenshot 2023-07-30 at 20 31 38

Testing

Screenshot 2023-07-30 at 20 30 09
Screenshot 2023-07-30 at 20 30 34

@nodejs/build: I suggest investigating the root cause in more detail and checking the logs from the machines to find out why the connection with Jenkins is not being properly established.
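
As a starting point for that investigation, here is a minimal sketch (not something that was run during this incident) of how the offline Windows agents and their reported offline reasons could be listed. It assumes the JSON payload has the shape returned by Jenkins' `/computer/api/json` endpoint, with the `displayName`, `offline` and `offlineCauseReason` fields. The agent names and reasons in `SAMPLE` are hypothetical, and matching on `win` in the name is an assumption about how the agents are named.

```python
import json


def offline_agents(payload, name_contains="win"):
    """Return (name, reason) pairs for offline agents whose name contains a substring."""
    result = []
    for agent in payload.get("computer", []):
        name = agent.get("displayName", "")
        if name_contains.lower() in name.lower() and agent.get("offline"):
            # Jenkins may report an empty reason; fall back to "unknown".
            result.append((name, agent.get("offlineCauseReason") or "unknown"))
    return result


# Hypothetical sample payload (made-up agent names and reasons),
# shaped like the response of <jenkins-url>/computer/api/json.
SAMPLE = json.loads("""
{
  "computer": [
    {"displayName": "test-azure-win2016-x64-1", "offline": false, "offlineCauseReason": ""},
    {"displayName": "test-rackspace-win2019-x64-1", "offline": true,
     "offlineCauseReason": "Connection was broken"},
    {"displayName": "test-example-ubuntu2204-x64-1", "offline": true,
     "offlineCauseReason": "Disconnected by admin"}
  ]
}
""")

if __name__ == "__main__":
    for name, reason in offline_agents(SAMPLE):
        print(f"{name}: {reason}")
```

Grouping the reported reasons per provider (Azure vs. Rackspace) might help to tell a provider-side network problem apart from something on the Jenkins side.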

@UlisesGascon UlisesGascon removed their assignment Jul 30, 2023
@rluvaton
Member

This may be related to the recent Jenkins upgrade (cc @targos)

@StefanStojanovic
Contributor

A quick update: I continued what @UlisesGascon started, went through the machines, and got all but a few of them back online. It looks like the worst has passed, but this should be investigated so that we know the root cause.

@UlisesGascon
Member Author

Thanks a lot @StefanStojanovic!

@mhdawson
Member

Discussed in the Build WG meeting and agreed to close. Will reopen if we get another similar outage.
