
Bug 1755073: docs/user/*/install_upi: Drop compute replicas zeroing #2402

Closed · wants to merge 1 commit

Conversation

wking (Member) commented Sep 24, 2019

We grew this in c22d042 (docs/user/aws/install_upi: Add 'sed' call
to zero compute replicas, 2019-05-02, openshift#1649) to set the stage for
changing the 'replicas: 0' semantics from "we'll make you some dummy
MachineSets" to "we won't make you MachineSets".  But that hasn't
happened yet, and since 64f96df (scheduler: Use schedulable masters
if no compute hosts defined, 2019-07-16, openshift#2004) 'replicas: 0' for
compute has also meant "add the 'worker' role to control-plane nodes".
That leads to racy problems when ingress comes through a load
balancer, because Kubernetes load balancers exclude control-plane
nodes from their target set [1,2] (although this may get relaxed
soonish [3]).  If the router pods get scheduled on the control plane
machines due to the 'worker' role, they are not reachable from the
load balancer and ingress routing breaks [4].  Seth says:

> pod nodeSelectors are not like taints/tolerations.  They only have
> effect at scheduling time.  They are not continually enforced.

which means that attempting to address this issue as a day-2 operation
would mean removing the 'worker' role from the control-plane nodes and
then manually evicting the router pods to force rescheduling.  So
until we get the changes from [3], it's easier to just drop this
section and keep the 'worker' role off the control-plane machines
entirely.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: kubernetes/kubernetes#65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
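
For concreteness, the zeroing step being dropped, and the day-2 cleanup it can force, look roughly like the following sketch. The exact sed expression from c22d042, the placeholder node name, and the router-pod label selector are assumptions, and GNU sed's `0,/addr/` address form is assumed.

```sh
# The step this PR drops from the UPI docs: zero the compute pool in
# install-config.yaml.  'compute' sorts before 'controlPlane' in the
# generated YAML, so the first 'replicas:' match is the compute pool.
sed -i '0,/replicas:/ s/replicas: .*/replicas: 0/' install-config.yaml

# Day-2 cleanup once the 'worker' role has already landed on the control
# plane: nodeSelectors are only checked at scheduling time, so after
# removing the role the router pods must be evicted by hand.
oc label node <control-plane-node> node-role.kubernetes.io/worker-
oc -n openshift-ingress delete pods \
  -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller=default
```
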
openshift-ci-robot (Contributor) commented:

@wking: This pull request references Bugzilla bug 1755073, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

> Bug 1755073: docs/user/*/install_upi: Drop compute replicas zeroing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot added the bugzilla/valid-bug and size/S labels Sep 24, 2019
openshift-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot added the approved label Sep 24, 2019
wking (Member, Author) commented Sep 24, 2019

I dunno what to do about this CI code vs. this PR. Are we ok setting replicas: 2 in our install-config.yaml on all versions, or do we need to have version-specific CI logic to run our 4.1.z recommendations on 4.1.z code, 4.2.z recommendations on 4.2.z code, etc.?
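
A version-agnostic variant of that CI tweak might look like the following sketch (the sed expression is an assumption, reusing GNU sed's `0,/addr/` form; only the compute pool's count changes):

```sh
# Hypothetical CI step: give the compute pool two replicas instead of
# zeroing it.  The first 'replicas:' match in install-config.yaml is
# the compute pool, since 'compute' sorts before 'controlPlane'.
sed -i '0,/replicas:/ s/replicas: .*/replicas: 2/' install-config.yaml
```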

wking (Member, Author) commented Sep 24, 2019

@kalexand-rh: if this lands, we'll want to update openshift-docs for 4.2 as well.

abhinavdahiya (Contributor) commented:

I don't think this is the right path forward for BZ 1755073. I would rather see us document the extra step of setting .spec.mastersSchedulable to false in the manifests/cluster-scheduler-02-config.yml file in the AWS and GCP UPI docs, as we can't have the control plane running all of the workload.
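
A sketch of that alternative, assuming the default manifest content and a hypothetical asset directory name (the sed expression is mine, not from the thread):

```sh
# Generate the manifests, then mark the control plane unschedulable so
# 'replicas: 0' no longer puts the 'worker' role on control-plane nodes.
openshift-install create manifests --dir=upi-assets
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' \
  upi-assets/manifests/cluster-scheduler-02-config.yml
```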

openshift-ci-robot (Contributor) commented:

@wking: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-scaleup-rhel7 | cb31b68 | link | `/test e2e-aws-scaleup-rhel7` |
| ci/prow/e2e-aws-upgrade | cb31b68 | link | `/test e2e-aws-upgrade` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


wking added a commit to wking/openshift-installer that referenced this pull request Oct 1, 2019
We grew replicas-zeroing in c22d042 (docs/user/aws/install_upi: Add
'sed' call to zero compute replicas, 2019-05-02, openshift#1649) to set the
stage for changing the 'replicas: 0' semantics from "we'll make you
some dummy MachineSets" to "we won't make you MachineSets".  But that
hasn't happened yet, and since 64f96df (scheduler: Use schedulable
masters if no compute hosts defined, 2019-07-16, openshift#2004) 'replicas: 0'
for compute has also meant "add the 'worker' role to control-plane
nodes".  That leads to racy problems when ingress comes through a load
balancer, because Kubernetes load balancers exclude control-plane
nodes from their target set [1,2] (although this may get relaxed
soonish [3]).  If the router pods get scheduled on the control plane
machines due to the 'worker' role, they are not reachable from the
load balancer and ingress routing breaks [4].  Seth says:

> pod nodeSelectors are not like taints/tolerations.  They only have
> effect at scheduling time.  They are not continually enforced.

which means that attempting to address this issue as a day-2 operation
would mean removing the 'worker' role from the control-plane nodes and
then manually evicting the router pods to force rescheduling.  So
until we get the changes from [3], we can either drop the zeroing [5]
or adjust the scheduler configuration to remove the effect of the
zeroing.  In both cases, this is a change we'll want to revert later
once we bump Kubernetes to pick up a fix for the service load-balancer
targets.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: kubernetes/kubernetes#65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
[5]: openshift#2402
wking (Member, Author) commented Oct 1, 2019

> I would rather see us document the extra step of setting .spec.mastersSchedulable to false in the manifests/cluster-scheduler-02-config.yml file in the AWS and GCP UPI docs, as we can't have the control plane running all of the workload.

I've filed #2440 with that approach, but I prefer this one because:

Still, either way should work for 4.2, so land whichever.

sdodson (Member) commented Oct 2, 2019

/close
We've gone with #2440

openshift-ci-robot (Contributor) commented:

@sdodson: Closed this PR.

In response to this:

> /close
> We've gone with #2440


wking added a commit to wking/openshift-installer that referenced this pull request Oct 2, 2019

wking added a commit to wking/openshift-installer that referenced this pull request Oct 2, 2019

alaypatel07 pushed a commit to alaypatel07/installer that referenced this pull request Nov 13, 2019

jhixson74 pushed a commit to jhixson74/installer that referenced this pull request Dec 6, 2019