Jenkins is unable to keep the slaves online after they are started #621
Comments
Same issue here, after updating from 0.16.2 to 1.1.3. In the last 24 hours, almost 10K containers were killed prematurely. Containers usually start after a few minutes, but the consequence is that jobs sit in the build queue for too long. |
I think that #623 may be related here. I've also noticed (during my own testing) that |
Thanks @pjdarton, your hint helped me a lot! After increasing the idleTimeout from 0 to 1, the containers start normally again. Another question: I'm still not quite sure whether there is a way to limit the number of executors of one docker slave node. Currently, my impression is that only one job can be executed within one started container, which is also my goal. But is this really true? |
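For anyone else hitting this, here is a minimal Script Console sketch for spotting templates whose idle timeout is still 0. It assumes the docker-plugin 1.1.x class and getter names (DockerCloud, DockerTemplate.getRetentionStrategy(), DockerOnceRetentionStrategy.getIdleMinutes()); please verify against your installed plugin version before relying on it.

```groovy
// Sketch: list Docker templates whose retention idle timeout is 0
// (assumes docker-plugin 1.1.x class/getter names; verify locally before use).
import com.nirima.jenkins.plugins.docker.DockerCloud
import com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy
import jenkins.model.Jenkins

Jenkins.instance.clouds.findAll { it instanceof DockerCloud }.each { cloud ->
    cloud.templates.each { template ->
        def strategy = template.retentionStrategy
        if (strategy instanceof DockerOnceRetentionStrategy) {
            // idleMinutes == 0 is the setting that caused containers to be killed right away here
            println "${cloud.name} / ${template.labelString}: idleMinutes=${strategy.idleMinutes}"
        }
    }
}
```

Run it from Manage Jenkins > Script Console; any line printing idleMinutes=0 is a candidate for bumping to 1 in the cloud configuration.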
One job per container is standard behaviour, yes. |
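To confirm that each template really hands out a single executor and that the agent is discarded after one build, a similar sketch can print the executor count and retention strategy per template. The getNumExecutors() getter is an assumption based on the plugin's template fields in 1.1.x; check your version.

```groovy
// Sketch: confirm one-build-per-container behaviour per template.
// DockerOnceRetentionStrategy terminates the agent after its build finishes;
// getNumExecutors() is assumed to exist on DockerTemplate in 1.1.x.
import com.nirima.jenkins.plugins.docker.DockerCloud
import com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy
import jenkins.model.Jenkins

Jenkins.instance.clouds.findAll { it instanceof DockerCloud }.each { cloud ->
    cloud.templates.each { template ->
        boolean singleUse = template.retentionStrategy instanceof DockerOnceRetentionStrategy
        println "${template.labelString}: executors=${template.numExecutors}, singleUse=${singleUse}"
    }
}
```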
The problem is that the containers are killed shortly after they are started and the jobs get stuck in the build queue. Furthermore, after the update from 1.1.2 to 1.1.3, all the "killed" container configurations are put into the ${JENKINS_HOME}/config-history/nodes/ folder as already deleted nodes, e.g.
These 822 deleted nodes were created within only 3 hours...
Here is also the relevant part of the Jenkins log...
I'm not really sure where the problem is, but after a number of retries (I don't know how many, but really many, many retries) a container is started and the job is executed. Unfortunately this takes a lot of time and leaves a lot of "deleted" slaves... I would really appreciate any help.