-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libcontainer: intelrdt: add update command support #1590
libcontainer: intelrdt: add update command support #1590
Conversation
update.go
Outdated
@@ -254,6 +259,18 @@ other options are ignored. | |||
config.Cgroups.Resources.MemorySwap = *r.Memory.Swap | |||
config.Cgroups.Resources.PidsLimit = r.Pids.Limit | |||
|
|||
if intelrdt.IsEnabled() && config.IntelRdt != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to just set the value and let the libcontainer rtd code validate and return an error if its not enabled? Right now, it would just fail silently and no one would know if its set or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made changes according to your suggestion to avoid silent failure.
There are several cases in error paths:
(1) IsEnabled == false && intelRdt is specified in config
(2) IsEnabled == true && intelRdt is not specified in config
(3) IsEnabled == true && intelRdt is specified in config && update value is invalid
(1): It is handled by validate.Validator(). runc create/run will fail. It never pass to update command.
(2): We'd better return an error that the update command is failed before container.Set(). Just passing through container.Set() can't handle this case:
(3): We just set the value, intelrdt.Set() will return an error that the update value is invalid.
5c51b38
to
6056263
Compare
You are making things wayyy more complicated than they need to be. if val := context.String("l3-cache-schema"); val != "" {
config.IntelRtd = &configs.IntelRdt{
L3CacheSchema: val,
}
} or if you have to check if it already exists in the config: if val := context.String("l3-cache-schema"); val != "" {
if config.IntelRtd == nil {
config.IntelRtd = &configs.IntelRtd{}
}
config.IntelRtd.L3CacheSchema = val
} |
6056263
to
1fe4b4e
Compare
Thank you. I have updated the code according to your suggestion. |
update.go
Outdated
@@ -254,6 +260,17 @@ other options are ignored. | |||
config.Cgroups.Resources.MemorySwap = *r.Memory.Swap | |||
config.Cgroups.Resources.PidsLimit = r.Pids.Limit | |||
|
|||
if val := context.String("l3-cache-schema"); val != "" { | |||
// Avoid silent failure in container.Set() | |||
if !intelrdt.IsEnabled() || config.IntelRdt == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would config.IntelRdt == nil
be a case for an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would config.IntelRdt == nil be a case for an error?
Sorry, this is not straightforward.
- If linux.intelRdt is not specified in config.json, always config.IntelRdt == nil (https://github.com/opencontainers/runc/blob/master/libcontainer/specconv/spec_linux.go#L253-L258)
- In update.go, config := container.Config() (https://github.com/opencontainers/runc/blob/master/update.go#L150), we get
config.IntelRdt == nil
, which implies linux.intelRdt is not specified in config (even though intelrdt.IsEnabled() == true). - In my understanding, this is an error case in updateCommand because intelRdt is not managed in container processes management life cycle - no Apply(), Destroy() and etc.,
- If it passes to container.Set(), when linux.intelRdt is not specified in config.json, intelRdtManager == nil, IntelRdtManager.Set() will never be invoked to update config or to check the input (https://github.com/opencontainers/runc/blob/master/libcontainer/container_linux.go#L208-L216), and this is a silent failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a bug to me. If someone wants to add an intelrdt limit after the fact (where they didn't specify any options in the original configuration) that should be possible -- is this a kernel bug or just a limitation of runc
at the moment? We allow you to do that with all of the other cgroups from memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a bug to me. If someone wants to add an intelrdt limit after the fact (where they didn't specify any options in the original configuration) that should be possible -- is this a kernel bug or just a limitation of
runc
at the moment? We allow you to do that with all of the other cgroups from memory.
This is what I intended to do for Intel RDT/CAT support in runc
as a tradeoff. In my opinion, this is neither a kernel bug nor a runc limitation. Let me explain here:
Unlike cgroups' hierarchy, Intel RDT resource control filesystem in kernel supports only single level filesystem layout. (see details in section "Intel RDT "resource control" filesystem hierarchy" in #433 ). For Cache Allocation Technology (CAT) feature, it only supports limited number of groups. It is hardware limitation by nature. The limitation is indicated in /sys/fs/resctrl/info/L3/num_closids.
In runc
case, we could only create limited number <container_id> directories under /sys/fs/resctrl via Apply()
. So we need a tradeoff: we Apply()
intelrdt constraint on a container only if hardware and kernel support Intel RDT feature and intelRdt is specified in original configuration. Otherwise, if hardware and kernel support the feature, we will run out of the limitation quickly when started a lot of containers instances, even though the user doesn't care about Intel RDT/CAT feature or doesn't want to add intelrdt constraint for some container instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiaochenshen Alright, that makes sense (as in why it's not that way at the moment) -- but that just sounds like we need code to create a new rdtgroup
if we're doing an update? I think just using Apply
might be enough, but I would have to check the code again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that just sounds like we need code to create a new rdtgroup if we're doing an update? I think just using Apply might be enough, but I would have to check the code again.
I will have a look how to do it in update command. Thank you for the suggestion.
update.go
Outdated
} | ||
if config.IntelRdt == nil { | ||
config.IntelRdt = &configs.IntelRdt{} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crosbymichael
This if condition if config.IntelRdt == nil
is always false, because error returns above when config.IntelRdt == nil
. I will remove the if condition lines.
Then the code will look like:
Action: func(context *cli.Context) error {
...
if val := context.String("l3-cache-schema"); val != "" {
// Avoid silent failure in container.Set()
if !intelrdt.IsEnabled() || config.IntelRdt == nil {
return fmt.Errorf("l3 cache schema is not enabled or not specified in config, update value '%s' is discarded", val)
}
config.IntelRdt.L3CacheSchema = val
}
return container.Set(config)
},
1fe4b4e
to
b699e7e
Compare
update.go
Outdated
if val := context.String("l3-cache-schema"); val != "" { | ||
// Avoid silent failure in container.Set() | ||
if !intelrdt.IsEnabled() || config.IntelRdt == nil { | ||
return fmt.Errorf("l3 cache schema is not enabled or not specified in config, update value '%s' is discarded", val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error should be shorter, just remove the last bit , update value '%s' is discarded
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for review. This sounds fine to me.
In current implementation: Either Intel RDT is not enabled by hardware and kernel, or intelRdt is not specified in original config, we don't init IntelRdtManager in the container to handle intelrdt constraint. It is a tradeoff that Intel RDT has hardware limitation to support only limited number of groups. This patch makes a minor change to support update command: Whether or not intelRdt is specified in config, we always init IntelRdtManager in the container if Intel RDT is enabled. If intelRdt is not specified in original config, we just don't Apply() to create intelrdt group or attach tasks for this container. In update command, we could re-enable through IntelRdtManager.Apply() and then update intelrdt constraint. Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Add runc update command support for Intel RDT/CAT. for example: runc update --l3-cache-schema "L3:0=f;1=f" <container-id> Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
b699e7e
to
65918b0
Compare
@crosbymichael @cyphar NOTE: The first patch makes a minor change to support update command: In the second patch, in update command, we could re-enable through IntelRdtManager.Apply() and then update intelrdt constraint. |
Ping @crosbymichael @cyphar Could you have a look at your convenience? Thank you. |
1 similar comment
This fixes a GetStats() issue introduced in opencontainers#1590: If Intel RDT is enabled by hardware and kernel, but intelRdt is not specified in original config, GetStats() will return error unexpectedly because we haven't called Apply() to create intelrdt group or attach tasks for this container. As a result, runc events command will have no output. Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Add runc update command support for Intel RDT/CAT.
for example:
runc update --l3-cache-schema "L3:0=f;1=f" container-id
For more information about Intel Resource Director Technology (RDT)
and Cache Allocation Technology (CAT), please refer to #1279
Signed-off-by: Xiaochen Shen xiaochen.shen@intel.com