libct/cg: support hugetlb rsvd #4073
This adds support for hugetlb.<pagesize>.rsvd limiting and accounting.

The previous non-rsvd max/limit_in_bytes does not account for reserved huge page memory, making it possible for a process to reserve all the huge page memory without being able to allocate it (due to cgroup restrictions).

In practice this makes it possible to successfully mmap more huge page memory than allowed via the cgroup settings, but when using the memory the process will get a SIGBUS and crash. This is bad for applications that mmap at startup: the mmap succeeds, but the program crashes once it starts using the memory. For example, postgres does this by default.

This also keeps writing to the old max/limit_in_bytes, for backward compatibility.

More info can be found here: https://lkml.org/lkml/2020/2/3/1153

(commit message mostly written by Odin Ugedal)

Co-authored-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
I'm on the verge of backporting this to 1.1 -- there is not much code, and it helps a lot with fixing issues with e.g. postgres (see #3859 (comment)).
    if err := cgroups.WriteFile(path, prefix+".rsvd"+suffix, val); err != nil {
        if errors.Is(err, os.ErrNotExist) {
            skipRsvd = true
The idea here is that those `.rsvd.` files either exist or not, so if the first such file doesn't exist, we set `skipRsvd = true` and do not try to use `.rsvd.` files any more.
    if rsvd != "" && errors.Is(err, os.ErrNotExist) {
        rsvd = ""
        goto again
    }
For getting the stats, we prefer `.rsvd.` files, if they exist.
        return err
    }
    if skipRsvd {
        continue
This can be a `break` instead, so the loop stops.
We don't want to stop the loop here. We want to write all `hugetlb.XXX.limit_in_bytes` and, if available, all `hugetlb.XXX.rsvd.limit_in_bytes` as well (and we're looping over XXX).
Thanks for pushing this @kolyshkin! Life and work happened, and I haven't had much time to spend on runc and kernel stuff since graduating from university. Very nice to see you pick this up and push it over the edge!
This adds support for `hugetlb.<pagesize>.rsvd` limiting and accounting.

The previous non-rsvd max/limit_in_bytes does not account for reserved
huge page memory, making it possible for a process to reserve all the
huge page memory without being able to allocate it (due to cgroup
restrictions).

In practice this makes it possible to successfully mmap more huge page
memory than allowed via the cgroup settings, but when using the memory
the process will get a SIGBUS and crash. This is bad for applications
that mmap at startup: the mmap succeeds, but the program crashes once it
starts using the memory. For example, postgres does this by default.

This also keeps writing to the old max/limit_in_bytes, for backward
compatibility.

More info can be found here: https://lkml.org/lkml/2020/2/3/1153

(commit message mostly written by @odinuge)

Fixes: #3859.