-
Notifications
You must be signed in to change notification settings - Fork 39.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scheduler can recieve its policy configuration from a ConfigMap #43892
Conversation
@kubernetes/sig-scheduling-pr-reviews |
Looks like I need to manually add some conversion functions. I will add them soon. |
054a7a1
to
9c16267
Compare
@@ -63,7 +63,9 @@ func (s *SchedulerServer) AddFlags(fs *pflag.FlagSet) { | |||
fs.Int32Var(&s.Port, "port", s.Port, "The port that the scheduler's http service runs on") | |||
fs.StringVar(&s.Address, "address", s.Address, "The IP address to serve on (set to 0.0.0.0 for all interfaces)") | |||
fs.StringVar(&s.AlgorithmProvider, "algorithm-provider", s.AlgorithmProvider, "The scheduling algorithm provider to use, one of: "+factory.ListAlgorithmProviders()) | |||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration") | |||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration. This file is used if policy ConfigMap object does not exists or scheduler is using legacy policy config.") | |||
fs.StringVar(&s.PolicyConfigMapName, "policy-configmap", s.PolicyConfigMapName, "Name of the ConfigMap object that contains scheduler's policy configuration. It must exist in the system namespace before scheduler initialization.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/It must exist in the system namespace before scheduler initialization./It must exist in the system namespace before scheduler initialization if scheduler is not using legacy policy config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration") | ||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration. This file is used if policy ConfigMap object does not exists or scheduler is using legacy policy config.") | ||
fs.StringVar(&s.PolicyConfigMapName, "policy-configmap", s.PolicyConfigMapName, "Name of the ConfigMap object that contains scheduler's policy configuration. It must exist in the system namespace before scheduler initialization.") | ||
fs.BoolVar(&s.UseLegacyPolicyConfig, "use-legacy-policy-config", false, "When set to true, scheduler will ignore PolicyConfigMapName and uses PolicyConfigFile") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the PolicyConfigMapName
and PolicyConfigFile
are some internal used words, how about s/PolicyConfigMapName and uses PolicyConfigFile/policy config map and uses policy config file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Done.
pkg/apis/componentconfig/types.go
Outdated
// uses PolicyConfigFile. If UseLegacyPolicyConfig is false and | ||
// PolicyConfigMapName is not empty, the ConfigMap object with this name must | ||
// exist before scheduler initialization. | ||
PolicyConfigMapName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If our component config reduces to a config map, should all other params shuffle to config map and be removed from the component config?
Also we need a means to generate a config map with all the defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also do we need to know it's a configmap? Isn't it just loading a file and the config map is an implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timothysc This PR is not going to make any changes to scheduler's component config. This is only about scheduler's policy config. Policy config used to be a file whose path is provided in the component config. With this PR we support the file and also a ConfigMap to configure scheduler's policy.
This PR implements part of "Approach 2" as described in:
https://docs.google.com/document/d/19AKH6V6ejOeIvyGtIPNvRMR4Yi_X8U3Q1zz2fgTNhvM
This is super important we get this right, and @k8s-mirror-cluster-lifecycle-misc will need to be involved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some chicken-egg questions that need to be resolved 1st.
return nil, fmt.Errorf("ConfigMap %v has %v entries in its 'Data'. It must have only one.", len(policyConfigMap.Data), sc.policyConfigMap) | ||
} | ||
policyConfigMapFound = true | ||
// This loop should iterate only once, as we have already checked the length of Data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just get data directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key
can be any arbitrary string and we don't know it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yes. never mind :).
stop := make(chan struct{}) | ||
informerFactory.Start(stop) | ||
|
||
defer close(stop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
informerFactory.Start(stop) | ||
|
||
sched.Run() | ||
defer close(stop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
informerFactory.Start(stop) | ||
|
||
sched.Run() | ||
defer close(stop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer just after "stop".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any style guide or recommendation on this? Other tests in the file use this pattern and I think consistency is more important, unless there is a strong reason to use a different pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, the style is from golang, prefer to call "defer" close to where the object was build, e.g. for "sync.Mutex", we'll call
l.Lock()
defer l.Unlock()
// other codes.
For the consistency, I'd like to correct other lines in this file :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, which line do you mean "Other tests in the file use this pattern"? Here's an example about defer
.
56 func TestUnschedulableNodes(t *testing.T) {
57 _, s := framework.RunAMaster(nil)
58 defer s.Close()
59
60 ns := framework.CreateTestingNamespace("unschedulable-nodes", s, t)
61 defer framework.DeleteTestingNamespace(ns, s, t)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, I didn't understand what you meant by your first comment. Done.
sched.Run() | ||
defer close(stop) | ||
|
||
DoTestUnschedulableNodes(t, clientSet, ns, informerFactory.Core().V1().Nodes().Lister()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This verification can not make sure the config from ConfigMap was took place; default config can also pass this test, right?
Similar to the following test, whether return err
can not make sure the config was updated as expected; I'd like suggest to find a way to check the data/config with the scheduler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scheduler does not expose its internal configuration for testing. I could change things in scheduler to make it testable, but but I am not changing any logic around applying the config to the scheduler in this PR. That's why I thought those tests may not belong to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably fine to do in a different PR, but I think as part of this overall feature it's important to test that the values read from the ConfigMap were actually applied by the scheduler. (I realize we had no useful testing for the file case previously.) I think you were mentioning the other day that the factory test could be adapted to do this? Or something simple like what @k82cn suggested in the next comment (just tweak one policy and see that it is respected in how a pod is scheduled).
} | ||
policyConfigMapFound = true | ||
// This loop should iterate only once, as we have already checked the length of Data. | ||
for _, val := range policyConfigMap.Data { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to get data directly and remove the comments :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my previous answer.
informerFactory.Start(stop) | ||
|
||
sched.Run() | ||
defer close(stop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer just after "stop:=..."
sched.Run() | ||
defer close(stop) | ||
|
||
DoTestUnschedulableNodes(t, clientSet, ns, informerFactory.Core().V1().Nodes().Lister()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The verification can not check whether the config from ConfigMap was set, as the default config can also pass it.
Similar comments to the following cases, the err
from app.CreateScheduler
can not make sure the config was set in sched
; prefer to check sched
's data directly or only enable part of predicates
/priorities
to check the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is the same point as above. I will try to add a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, similar point; used to close and re-open browser, the previous comments was "lost" but it's here when I submit new comments :(. Sorry for the confuse.
The sched's did not expose data to check; but I think we can only enable part of predicates/priorities
to check, e.g. disable "PortFit" predicates and dispatch two pods with same port to the same host , the second one should NOT be Unscheduable
. But I think you will get a better test than that :).
stop := make(chan struct{}) | ||
informerFactory.Start(stop) | ||
|
||
defer close(stop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
// PolicyConfigMapName is the name of the ConfigMap object that specifies | ||
// the scheduler's policy config. If UseLegacyPolicyConfig is true, scheduler | ||
// uses PolicyConfigFile. If UseLegacyPolicyConfig is false and | ||
// PolicyConfigMapName is not empty, the ConfigMap object with this name must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
7d7c7c6
to
66ba4c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only skimmed it, looks good overall, just a few comments.
configData, err := ioutil.ReadFile(sc.policyFile) | ||
if err != nil { | ||
return nil, fmt.Errorf("unable to read policy config: %v", err) | ||
// If there we are in legacy mode or ConfigMap name is empty, try to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/there we/we/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
// Create implements the interface for the Configurator, hence it is exported | ||
// even through the struct is not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/through/though/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -63,7 +63,9 @@ func (s *SchedulerServer) AddFlags(fs *pflag.FlagSet) { | |||
fs.Int32Var(&s.Port, "port", s.Port, "The port that the scheduler's http service runs on") | |||
fs.StringVar(&s.Address, "address", s.Address, "The IP address to serve on (set to 0.0.0.0 for all interfaces)") | |||
fs.StringVar(&s.AlgorithmProvider, "algorithm-provider", s.AlgorithmProvider, "The scheduling algorithm provider to use, one of: "+factory.ListAlgorithmProviders()) | |||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration") | |||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration. This file is used if policy ConfigMap is not provided or scheduler is using legacy policy config.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say "or --use-legacy-policy-config==true" instead of "or scheduler is using legacy policy config"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -63,7 +63,9 @@ func (s *SchedulerServer) AddFlags(fs *pflag.FlagSet) { | |||
fs.Int32Var(&s.Port, "port", s.Port, "The port that the scheduler's http service runs on") | |||
fs.StringVar(&s.Address, "address", s.Address, "The IP address to serve on (set to 0.0.0.0 for all interfaces)") | |||
fs.StringVar(&s.AlgorithmProvider, "algorithm-provider", s.AlgorithmProvider, "The scheduling algorithm provider to use, one of: "+factory.ListAlgorithmProviders()) | |||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration") | |||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration. This file is used if policy ConfigMap is not provided or scheduler is using legacy policy config.") | |||
fs.StringVar(&s.PolicyConfigMapName, "policy-configmap", s.PolicyConfigMapName, "Name of the ConfigMap object that contains scheduler's policy configuration. It must exist in the system namespace before scheduler initialization if scheduler is not using legacy policy config.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say "if --use-legacy-policy-config==false" instead of "if scheduler is not using legacy policy config"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
sched.Run() | ||
defer close(stop) | ||
|
||
DoTestUnschedulableNodes(t, clientSet, ns, informerFactory.Core().V1().Nodes().Lister()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably fine to do in a different PR, but I think as part of this overall feature it's important to test that the values read from the ConfigMap were actually applied by the scheduler. (I realize we had no useful testing for the file case previously.) I think you were mentioning the other day that the factory test could be adapted to do this? Or something simple like what @k82cn suggested in the next comment (just tweak one policy and see that it is respected in how a pod is scheduled).
pkg/apis/componentconfig/types.go
Outdated
// the scheduler's policy config. If UseLegacyPolicyConfig is true, scheduler | ||
// uses PolicyConfigFile. If UseLegacyPolicyConfig is false and | ||
// PolicyConfigMapName is not empty, the ConfigMap object with this name must | ||
// exist in the default system namespace ("kube-system") before scheduler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given multi-schedulers and #42961, I think we should probably allow for the flexibility of specifying another namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@bsalamat I think maybe you should update the rbac bootstrap policy for the scheduler as you involve new resource type to access. |
var policy schedulerapi.Policy | ||
|
||
namespace := sc.policyConfigMapNamespace | ||
if namespace == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better to add to the defaults.go ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Done.
} | ||
// If not in legacy mode, try to find policy ConfigMap. | ||
if !sc.useLegacyPolicyConfig && len(sc.policyConfigMap) != 0 { | ||
policyConfigMap, err := sc.GetClient().CoreV1().ConfigMaps(namespace).Get(sc.policyConfigMap, metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have default policyConfigMapName ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't. If it is empty, we ignore it.
@wanghaoran1988 is the rbac permissions needed to let scheduler access its ConfigMap? |
lgtm to me now, if someone else wants to once over. |
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration") | ||
fs.StringVar(&s.PolicyConfigFile, "policy-config-file", s.PolicyConfigFile, "File with scheduler policy configuration. This file is used if policy ConfigMap is not provided or --use-legacy-policy-config==true") | ||
fs.StringVar(&s.PolicyConfigMapName, "policy-configmap", s.PolicyConfigMapName, "Name of the ConfigMap object that contains scheduler's policy configuration. It must exist in the system namespace before scheduler initialization if --use-legacy-policy-config==false") | ||
fs.StringVar(&s.PolicyConfigMapNamespace, "policy-configmap-namespace", s.PolicyConfigMapNamespace, "The namespace where policy ConfigMap is located. The system namespace will be used if this is not provided or is empty.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value should be "kube-system"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, we do not need that if we have that set in the defaults.go @k82cn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the default.go
will execute? AFAIK, the default will be set by apiserver; not sure when the componentconfig
is update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
En, it's set in NewSchedulerServer
manually. Also prefer to the set the default value to kube-system
so the help message [default=kube-system]
will be included.
Anyway, not a block comments.
@@ -41,4 +41,10 @@ type SchedulerExtender interface { | |||
// onto machines. | |||
type ScheduleAlgorithm interface { | |||
Schedule(*v1.Pod, NodeLister) (selectedMachine string, err error) | |||
// Predicates() returns a pointer to a map of predicate functions. This is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d/a pointer to/
or
s/pointer/reference/
LGTM, minor comments. |
@liggitt @jayunit100 I am not sure whether openshift will enable user to use configMap change the policy config, but I know that openshift are try to reuse the controller rolebindings, if so, we need update the bootstrap policy for scheduler controller, and I prefer we have an default policy-configmap if we enabled them by default. |
This is a very similar discussion to the one kube-dns had, which wanted to make use of dynamic config, and started by making the pod watch a config map. That was modified to allow reading config from a file instead, which allowed mounting the config map holding the config into the pod. This meant no API entanglement or permissions needed for purposes of reading the config, and no modifications to the dns pod whether it was reading from a static file or one mounted from a config map. |
/lgtm |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bsalamat, k82cn
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue |
I would expect these comments to be fixed - is there an issue for that?
…On Tue, Apr 11, 2017 at 10:41 PM, Jordan Liggitt ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In plugin/cmd/kube-scheduler/app/configurator.go
<#43892 (comment)>
:
> +func (sc schedulerConfigurator) getSchedulerPolicyConfig() (*schedulerapi.Policy, error) {
+ var configData []byte
+ var policyConfigMapFound bool
+ var policy schedulerapi.Policy
+
+ // If not in legacy mode, try to find policy ConfigMap.
+ if !sc.useLegacyPolicyConfig && len(sc.policyConfigMap) != 0 {
+ namespace := sc.policyConfigMapNamespace
+ policyConfigMap, err := sc.GetClient().CoreV1().ConfigMaps(namespace).Get(sc.policyConfigMap, metav1.GetOptions{})
+ if err != nil {
+ return nil, fmt.Errorf("Error getting scheduler policy ConfigMap: %v.", err)
+ }
+ if policyConfigMap != nil {
+ // We expect the first element in the Data member of the ConfigMap to
+ // contain the policy config.
+ if len(policyConfigMap.Data) != 1 {
the scheduler should either look for a fixed key or be able to be told
which key to use
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#43892 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABG_p9YR9BBzSps-l6qLiJ6L1n339Qquks5rvDm-gaJpZM4MvS1C>
.
|
|
||
// TestSchedulerCreationInLegacyMode ensures that creation of the scheduler | ||
// works fine when legacy mode is enabled. | ||
func TestSchedulerCreationInLegacyMode(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing test failures on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ref #41600 |
What this PR does / why we need it: This PR adds the ability to scheduler to receive its policy configuration from a ConfigMap. Before this, scheduler could receive its policy config only from a file. The logic to watch the ConfigMap object will be added in a subsequent PR.
Which issue this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close that issue when PR gets merged): fixes #Special notes for your reviewer:
Release note: