
scheduler: Use schedulable masters if no compute hosts defined. #2004

Merged: 2 commits merged into openshift:master on Jul 30, 2019

Conversation

@russellb (Member)

This change makes use of a new configuration item on the scheduler CR
that specifies that control plane hosts should be able to run
workloads. This option is off by default, but will now be turned on
if there are no compute machine pools with non-zero replicas defined.

This change also removes a validation and warning when no compute
hosts are defined, as an install with this configuration will now
complete successfully.
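
For reference, a minimal sketch of the cluster-scoped Scheduler config this toggles, assuming the resource shape referenced later in this thread (a config.openshift.io/v1 Scheduler named "cluster" with a mastersSchedulable field); the exact manifest the installer renders may differ:

    # Sketch of the Scheduler config resource; the installer flips the flag below
    # when no compute machine pools have non-zero replicas.
    apiVersion: config.openshift.io/v1
    kind: Scheduler
    metadata:
      name: cluster
    spec:
      mastersSchedulable: true   # off by default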

@openshift-ci-robot added the size/L label Jul 16, 2019
@russellb requested review from abhinavdahiya and eparis and removed request for vrutkovs and mtnbikenc, Jul 16, 2019 20:26
@cgwalters (Member)

/approve
I have some uncertainty, though, about whether we want to support this across all platforms. I guess a 3-node AWS cluster, for example, is at least useful for cheaper testing.

@eparis (Member) left a comment

I'm not sure where it should live, but I would also like to see an info-level alert configured that fires when this is true and you do have worker nodes, telling you that security could be increased by unsetting it.

We should also fix the docs so UPI users stop explicitly setting replicas=0. As I understand it, if we left it unset, replicas would harmlessly default to 3 (and this code would not trigger).
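
For illustration, a hedged sketch of the install-config compute stanza being discussed (an excerpt only; other required install-config fields are omitted):

    # install-config.yaml excerpt (illustrative sketch)
    compute:
    - name: worker
      replicas: 0   # explicit zero: with this change, the installer also makes masters schedulable
    # If the pool is left unset, replicas defaults (to 3, per the comment above)
    # and the schedulable-masters logic does not trigger.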

pkg/asset/manifests/scheduler.go (review thread; outdated, resolved)
@abhinavdahiya (Contributor)

> if there are no compute machine pools with non-zero replicas defined.

Zero compute replicas is not a great indication that the control plane should be made schedulable.

Cases where that's not true:

  1. When users want a UPI install, compute 0 only indicates that the installer should not create any compute machines.

  2. When users use the installer to provision the cluster but bring no RHCOS compute via MachineSets.

The heuristic is too vague to turn on a potential security hole for users by default.

@eparis (Member) commented Jul 16, 2019

There is no known security hole. This makes the installer do what it should: give users working clusters. And since this is a day-2 tunable, they can then tune their cluster however they want. But on day 1 you get something that works. That's the whole point of the installer.

@abhinavdahiya (Contributor)

> There is no known security hole. This makes the installer do what it should: give users working clusters. And since this is a day-2 tunable, they can then tune their cluster however they want. But on day 1 you get something that works. That's the whole point of the installer.

But the user didn't ask for the control plane to be schedulable, only that they will take care of the compute.

And running workloads on masters is definitely a security hole (maybe not a CVE): master instances have elevated cloud API access.

@eparis (Member) commented Jul 16, 2019

Customers didn't ask us to keep their workloads away from masters. That's just something we did, and something we are trying to unwind as quickly as we can. This gives people working clusters, and with proper alerting, no surprises.

@russellb (Member, Author)

The primary alternative I considered was adding an explicit config item under controlPlane for this in install-config. The major downside is expanding the install config. Maybe it makes some sense since it's related to ensuring a successful and complete install, though?
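
Purely as an illustration of that alternative, a hypothetical install-config excerpt; the schedulable field below is invented for this sketch and was not implemented:

    # Hypothetical install-config.yaml excerpt -- 'schedulable' is an invented field name
    controlPlane:
      name: master
      replicas: 3
      schedulable: true   # explicit opt-in, instead of inferring intent from compute replicas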

@ravisantoshgudimetla (Contributor)

Shouldn't this be specific to baremetal installation? It's true that on AWS and other cloud providers, having workloads on master nodes is probably less than ideal.

@russellb (Member, Author)

> Shouldn't this be specific to baremetal installation? It's true that on AWS and other cloud providers, having workloads on master nodes is probably less than ideal.

Yes, that's another option. We could make this specific to the baremetal platform. It doesn't really seem baremetal-specific, though.

@eparis (Member) commented Jul 16, 2019

No. If a customer makes a 3-node cluster out of m5.metal instances, it should "just work". The same goes for large VMware or OpenStack instances. A 3-node cluster should just work.

@ravisantoshgudimetla (Contributor)

> It doesn't really seem baremetal-specific, though.

Yeah, I agree, but that's the only use case we have currently, isn't it? If someone says it's needed on another cloud provider, then perhaps we can think about adding another option in install-config.

@abhinavdahiya (Contributor)

> No. If a customer makes a 3-node cluster out of m5.metal instances, it should "just work". The same goes for large VMware or OpenStack instances. A 3-node cluster should just work.

How does your comment square with "the user asked for no compute, so I should make masters schedulable"?

And how do you differentiate them from users (like UPI) who don't want their masters to be schedulable but only want to create their own compute nodes?

@ravisantoshgudimetla (Contributor)

> No. If a customer makes a 3-node cluster out of m5.metal instances, it should "just work". The same goes for large VMware or OpenStack instances. A 3-node cluster should just work.

I think everyone agrees if the cluster size is 3 nodes (during and after the install). The ambiguity comes in situations where the installer is not sure of the user's intention: whether they're trying to do UPI or want to add workers later.

@eparis (Member) commented Jul 16, 2019

And people who add workers afterward will get an alert and can then change the config. But they get a working cluster, which is the goal.

@abhinavdahiya (Contributor)

> And people who add workers afterward will get an alert and can then change the config. But they get a working cluster, which is the goal.

What alert are we talking about? An INFO message that masters will be schedulable... that's not an alert.

> But they get a working cluster, which is the goal.

But it's a working cluster at the expense of making their control plane schedulable, when they were completely ready and capable of bringing up the compute nodes.

@openshift-ci-robot added the needs-rebase label Jul 17, 2019
@eparis (Member) commented Jul 17, 2019

I've asked in #2004 (review) for an alert, not just an install-time message. I expect it'll live with the scheduling operator, since that's where the tunable lives.

The point of the installer is to give people a working cluster. We fully support 3-node clusters, and we support people running workloads on masters. That is not an expense. It is the system doing what it should.

@crawford @smarterclayton we already talked this through and both of you agreed. Would you mind commenting here as well?

@russellb (Member, Author)

I've posted an alternative for discussion here: #2024

The benefit of #2024 is that it provides a way to make the intent much clearer.

@openshift-ci-robot added the lgtm label Jul 30, 2019
@openshift-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, crawford, russellb, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot (Contributor)

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot merged commit 208a3be into openshift:master Jul 30, 2019
allErrs = append(allErrs, ValidateMachinePool(platform, &p, poolFldPath)...)
}
if !foundPositiveReplicas {
logrus.Warnf("There are no compute nodes specified. The cluster will not fully initialize without compute nodes.")
(Member)

Can we really remove this warning? Does this PR allow the ingress controller to be scheduled on control-plane nodes (e.g. see here and kubernetes/kubernetes#65618)? Are we running CI on zero-compute clusters?

@russellb (Member, Author)

Yes, when you set mastersSchedulable to true, the worker role gets added to the master nodes as well.

There is nothing in OpenShift CI for this yet.
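
For context, the "worker role" corresponds to a node-role label; a hedged sketch of a control-plane Node carrying both roles (example node name, abridged metadata):

    # Abridged Node sketch; the example name is not from this PR.
    apiVersion: v1
    kind: Node
    metadata:
      name: master-0
      labels:
        node-role.kubernetes.io/master: ""
        node-role.kubernetes.io/worker: ""   # added when mastersSchedulable is true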

@ironcladlou (Contributor)

On cloud platforms, while adding the worker role to masters will allow ingress controllers to be scheduled on the masters, the master nodes will remain excluded from load-balancer target pools, breaking the feature in practice.

As to whether Kube will support including masters in LB target pools, the jury is still out and there is still no KEP AFAICT.

kubernetes/kubernetes#65618
https://groups.google.com/d/msg/kubernetes-sig-network/erULVdDkLNw/hvCukZ3CBQAJ

(Member)

@russellb, can you restore the warning?

(Member)

Also, with a single compute replica, we won't trigger this schedulable-control-plane logic, so won't ingress hang with "I can't get replicas up to two"?

@russellb (Member, Author)

The warning wouldn't be appropriate for a 3-node bare metal cluster. I guess it could be restored and only emitted if the platform type is not baremetal or none, because of the load-balancer issue mentioned by @ironcladlou?

russellb added a commit to russellb/dev-scripts that referenced this pull request Jul 31, 2019
Prior to this change, our install config listed a number of workers.
In practice though, we don't support deploying workers at install
time.  We require scripts post-install to set up the
baremetal-operator before workers can be deployed.  This change makes
it so we explicitly install a 3-node cluster, and then scale it out as
a post-install step if dev-scripts was configured to do so.

A side effect of this change is that the installer will automatically
adjust the scheduler configuration to make master nodes schedulable.
For more details on that change, see:
openshift/installer#2004

This also means that we can drop the custom IngressController
manifest.  When the cluster is configured with "mastersSchedulable:
true" in the scheduler config, the master Nodes will have both the
master and worker roles set on them, making them candidates for
running the default Ingress controller, without any changes.

Closes issue openshift-metal3#705
hardys pushed a commit to hardys/dev-scripts that referenced this pull request Aug 13, 2019
russellb added a commit to russellb/dev-scripts that referenced this pull request Aug 19, 2019
This is no longer required, as we install a 3 node cluster by default.
The installer will automatically adjust the scheduler configuration to
make master nodes schedulable. For more details on that change, see:
openshift/installer#2004

Closes issue openshift-metal3#705
@wking (Member)

wking commented Sep 24, 2019

> And since this is a day-2 tunable...

Has anyone worked out the commands to run to tune this on day 2? I didn't see them when rereading the PR thread.

wking added a commit to wking/openshift-installer that referenced this pull request Sep 24, 2019
We grew this in c22d042 (docs/user/aws/install_upi: Add 'sed' call
to zero compute replicas, 2019-05-02, openshift#1649) to set the stage for
changing the 'replicas: 0' semantics from "we'll make you some dummy
MachineSets" to "we won't make you MachineSets".  But that hasn't
happened yet, and since 64f96df (scheduler: Use schedulable masters
if no compute hosts defined, 2019-07-16, openshift#2004) 'replicas: 0' for
compute has also meant "add the 'worker' role to control-plane nodes".
That leads to racy problems when ingress comes through a load
balancer, because Kubernetes load balancers exclude control-plane
nodes from their target set [1,2] (although this may get relaxed
soonish [3]).  If the router pods get scheduled on the control plane
machines due to the 'worker' role, they are not reachable from the
load balancer and ingress routing breaks [4].  Seth says:

> pod nodeSelectors are not like taints/tolerations.  They only have
> effect at scheduling time.  They are not continually enforced.

which means that attempting to address this issue as a day-2 operation
would mean removing the 'worker' role from the control-plane nodes and
then manually evicting the router pods to force rescheduling.  So
until we get the changes from [3], it's easier to just drop this
section and keep the 'worker' role off the control-plane machines
entirely.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: kubernetes/kubernetes#65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
@russellb (Member, Author)

> And since this is a day-2 tunable...
>
> Has anyone worked out the commands to run to tune this on day 2? I didn't see them when rereading the PR thread.

Run oc edit scheduler cluster and change mastersSchedulable to true or false.
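
As a sketch, the relevant portion of the object that oc edit opens looks roughly like this (same Scheduler resource shape as sketched near the top of the thread):

    # Abridged view of the object opened by oc edit scheduler cluster
    spec:
      mastersSchedulable: true   # set to false to keep workloads off the control plane

Note that, per the commit message quoted above, node selectors only take effect at scheduling time, so turning this off on day 2 would also require evicting any pods (for example, the routers) already placed on the control-plane nodes.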

wking added a commit to wking/openshift-installer that referenced this pull request Oct 1, 2019
We grew replicas-zeroing in c22d042 (docs/user/aws/install_upi: Add
'sed' call to zero compute replicas, 2019-05-02, openshift#1649) to set the
stage for changing the 'replicas: 0' semantics from "we'll make you
some dummy MachineSets" to "we won't make you MachineSets".  But that
hasn't happened yet, and since 64f96df (scheduler: Use schedulable
masters if no compute hosts defined, 2019-07-16, openshift#2004) 'replicas: 0'
for compute has also meant "add the 'worker' role to control-plane
nodes".  That leads to racy problems when ingress comes through a load
balancer, because Kubernetes load balancers exclude control-plane
nodes from their target set [1,2] (although this may get relaxed
soonish [3]).  If the router pods get scheduled on the control plane
machines due to the 'worker' role, they are not reachable from the
load balancer and ingress routing breaks [4].  Seth says:

> pod nodeSelectors are not like taints/tolerations.  They only have
> effect at scheduling time.  They are not continually enforced.

which means that attempting to address this issue as a day-2 operation
would mean removing the 'worker' role from the control-plane nodes and
then manually evicting the router pods to force rescheduling.  So
until we get the changes from [3], we can either drop the zeroing [5]
or adjust the scheduler configuration to remove the effect of the
zeroing.  In both cases, this is a change we'll want to revert later
once we bump Kubernetes to pick up a fix for the service load-balancer
targets.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1671136#c1
[2]: kubernetes/kubernetes#65618
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1744370#c6
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1755073
[5]: openshift#2402
wking added a commit to wking/openshift-installer that referenced this pull request Oct 2, 2019
russellb added a commit to russellb/machine-config-operator that referenced this pull request Oct 16, 2019
The purpose of this change was to make masters able to run workloads
by default.  This is needed to complete a successful deployment of a
3-node bare metal install.  This particular approach was only short
term, while better interfaces were developed to control this behavior.

The scheduler configuration resource now includes a
"mastersSchedulable" boolean, enabled here:

openshift#937

This installer PR made it the default behavior if no workers were
defined at install time:

openshift/installer#2004

With these changes in place, the custom kubelet config for the
baremetal platform is no longer necessary.
alaypatel07 pushed a commit to alaypatel07/installer that referenced this pull request Nov 13, 2019
jhixson74 pushed a commit to jhixson74/installer that referenced this pull request Dec 6, 2019
zaneb added a commit to zaneb/assisted-service that referenced this pull request Sep 19, 2022
When generating the Ignition files, the installer already sets
schedulableMasters to true when there are no worker nodes (i.e. in the
SNO and compact cluster topologies; see
openshift/installer#2004). Therefore it is
unnecessary to override it here (though it may be preferred to avoid a
warning log from the installer).

Since openshift/installer#6247, attempting to
override the schedulableMasters setting causes installation to fail,
because there are two manifests of the same type and name that conflict.

Since we don't need to set this override when the installer would
already do it, avoid doing so and triggering the error when the value is
determined by the number of hosts rather than explicitly set by the
user.

The conflict still needs to be resolved so that the user can enable
schedulableMasters, but this at least allows the SNO and compact
topologies to install OpenShift 4.12 again.

This partially reverts commit c45f369.
openshift-merge-robot pushed a commit to openshift/assisted-service that referenced this pull request Sep 20, 2022
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/assisted-service that referenced this pull request Sep 28, 2022
openshift-merge-robot pushed a commit to openshift/assisted-service that referenced this pull request Sep 29, 2022
danielerez pushed a commit to danielerez/assisted-service that referenced this pull request Oct 15, 2023