"failed to propose on members [https://127.0.0.1:24001]" #6447
Comments
Another occurrence here: https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/8043/consoleText
|
flake here:
|
Another
https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_origin/4448/consoleFull |
When I run ./hack/test-end-to-end.sh with the latest origin, I see a similar issue: [INFO] Running a CLI command in a container using the service account [FAIL] !!!!! Test Failed !!!! |
It looks like this is being caused by sudden latency in disk IO. See #6542 (comment) for details. Since this seems to be an environmental problem, we're currently working around this problem by using a ramdisk. This frees the merge and test queue for now. |
Something's back
|
@smarterclayton happened after the CI flow changed. I wonder if it's exceeding the pre-allocated space now. Since I'm rebasing and @liggitt already tried, you want a go? |
Jordan's looking at it, but it's very likely the builds are moved. On Mon, Feb 1, 2016 at 8:03 AM, David Eads notifications@github.com wrote:
|
I think this is resolved now. |
Master startup can fail when ec2 transparently reallocates the block storage, causing etcd writes to temporarily fail. Retry failures blindly just once to allow time for this transient condition to resolve and for systemd to restart the master (which will eventually succeed). etcd-io/etcd#3864 openshift/origin#6065 openshift/origin#6447
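For illustration only, the "retry blindly just once" idea from the commit message above could look roughly like the sketch below. It is not the code from the referenced commit; `retry_once` and `delay_seconds` are hypothetical names made up for this example.

```python
import time


def retry_once(operation, delay_seconds=0.0):
    """Call operation(); if it raises, wait and try exactly one more time.

    Hypothetical helper: the error is not inspected (a "blind" retry).
    The pause only gives a transient condition time to clear.
    """
    try:
        return operation()
    except Exception:
        time.sleep(delay_seconds)
        # A second failure propagates to the caller unchanged.
        return operation()
```

A blind retry like this is only safe when repeating the operation is harmless; if the first attempt actually took effect, the second attempt can fail for a different reason.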
This bug happens because the etcd server can successfully apply a write while the sync to the WAL is super slow. The server then replies with a 500. The etcd client automatically retries the call, and the retry fails because the action was already taken.
This can manifest as:
https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/8038/consoleText
See #6065 for more details.
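The failure mode described above can be reproduced with a toy model. This is a simplified sketch, not etcd's server or client code: `FakeStore`, `ServerError`, `AlreadyExists`, and `create_with_auto_retry` are all hypothetical names, and the slow sync is simulated by raising an error after the write has been applied.

```python
class ServerError(Exception):
    """Stands in for the 500 reply."""


class AlreadyExists(Exception):
    """Stands in for the error returned when the key is already present."""


class FakeStore:
    def __init__(self, slow_syncs=1):
        self.data = {}
        self.slow_syncs = slow_syncs  # how many writes report a failed sync

    def create(self, key, value):
        if key in self.data:
            raise AlreadyExists(key)
        self.data[key] = value  # the write itself is applied
        if self.slow_syncs > 0:
            self.slow_syncs -= 1
            # The write took effect, but the caller is told it failed.
            raise ServerError("500: sync took too long")
        return value


def create_with_auto_retry(store, key, value):
    """Client that retries automatically on a server error."""
    try:
        return store.create(key, value)
    except ServerError:
        # The retry hits the key written by the first attempt.
        return store.create(key, value)
```

In this model the caller sees `AlreadyExists` even though it only issued one logical create, and the key ends up in the store with the intended value.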