Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error messages in validation cgroup tests #605

Merged

Conversation

alban
Copy link
Contributor

@alban alban commented Mar 14, 2018

Symptoms:

 ok 3 - blkio weight is set correctly
 # expect: 500, actual: 842352267696

After the patch:

 ok 3 - blkio weight is set correctly
 # expect: 500, actual: 500

I found those issues by patching runc to introduce "off-by-one" bugs and
checking if they are correctly caught by the validation tests:
https://github.com/kinvolk/runc/commits/alban/off-by-one-bugs

Signed-off-by: Alban Crequy alban@kinvolk.io

Symptoms:
```
 ok 3 - blkio weight is set correctly
 # expect: 500, actual: 842352267696
```

After the patch:
```
 ok 3 - blkio weight is set correctly
 # expect: 500, actual: 500
```

I found those issues by patching runc to introduce "off-by-one" bugs and
checking if they are correctly caught by the validation tests:
https://github.com/kinvolk/runc/commits/alban/off-by-one-bugs

Signed-off-by: Alban Crequy <alban@kinvolk.io>
@liangchenye
Copy link
Member

liangchenye commented Mar 15, 2018

LGTM

Approved with PullApprove

@liangchenye liangchenye merged commit 4492c1d into opencontainers:master Mar 15, 2018
wking added a commit to wking/ocitools-v2 that referenced this pull request Apr 6, 2018
The previous values were giving me:

  container_linux.go:348: starting container process caused
    "process_linux.go:402: container init caused
    \"process_linux.go:367: setting cgroup config for procHooks
    process caused \\\"failed to write 56892210544640 to
    hugetlb.1GB.limit_in_bytes: open
    /sys/fs/cgroup/hugetlb/.../hugetlb.1GB.limit_in_bytes: permission
    denied\\\"\""

The previous values are originally from 432615a (add cgroup hugetlb
test for runtime, 2017-12-05, opencontainers#93), which doesn't motivate their
choice.  The new values are copy/pasted from the spec [1] (which
doesn't motivate its choice either ;).  I've kept something like
Alban's comment from 984dbc8 (Fix error messages in validation cgroup
tests, 2018-03-14, opencontainers#605) to at least explain where the number comes
from.

In testing with my local system, the issue seems to be pageSize and
not the limit value.  That seems to be supported by the kernel docs,
which have [2]:

	hugepages=	[HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
	hugepagesz=	[HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
			On x86-64 and powerpc, this option can be specified
			multiple times interleaved with hugepages= to reserve
			huge pages of different sizes. Valid pages sizes on
			x86-64 are 2M (when the CPU supports "pse") and 1G
			(when the CPU supports the "pdpe1gb" cpuinfo flag).

My CPU supports both:

  $ cat /proc/cpuinfo | grep '^flags' | head -n1 | grep -o ' \(pse\|pdpe1gb\) '
   pse
   pdpe1gb

but I don't set hugepagesz, and I seem to only get 2M by default.  I
can get 1GB entries by booting with hugepagesz=1GB.  Longer-term, we
may want to auto-detect the value(s) currently enabled by the host
system, but for this commit I'm hard-coding 2MB.

[1]: https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config-linux.md#example-8
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v4.16#n1336

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/ocitools-v2 that referenced this pull request Apr 6, 2018
The previous values were giving me:

  container_linux.go:348: starting container process caused
    "process_linux.go:402: container init caused
    \"process_linux.go:367: setting cgroup config for procHooks
    process caused \\\"failed to write 56892210544640 to
    hugetlb.1GB.limit_in_bytes: open
    /sys/fs/cgroup/hugetlb/.../hugetlb.1GB.limit_in_bytes: permission
    denied\\\"\""

The previous values are originally from 432615a (add cgroup hugetlb
test for runtime, 2017-12-05, opencontainers#93), which doesn't motivate their
choice.  The new values are copy/pasted from the spec [1] (which
doesn't motivate its choice either ;).  I've kept something like
Alban's comment from 984dbc8 (Fix error messages in validation cgroup
tests, 2018-03-14, opencontainers#605) to at least explain how the limit breaks
down.

In testing with my local system, the issue seems to be pageSize and
not the limit value.  That seems to be supported by the kernel docs,
which have [2]:

  hugepages=  [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
  hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
      On x86-64 and powerpc, this option can be specified
      multiple times interleaved with hugepages= to reserve
      huge pages of different sizes. Valid pages sizes on
      x86-64 are 2M (when the CPU supports "pse") and 1G
      (when the CPU supports the "pdpe1gb" cpuinfo flag).

My CPU supports both:

  $ cat /proc/cpuinfo | grep '^flags' | head -n1 | grep -o ' \(pse\|pdpe1gb\) '
   pse
   pdpe1gb

but I don't set hugepagesz, and I seem to only get 2M by default.  I
can get 1GB entries by booting with hugepagesz=1GB.  Longer-term, we
may want to auto-detect the value(s) currently enabled by the host
system, but for this commit I'm hard-coding 2MB.

[1]: https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config-linux.md#example-8
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v4.16#n1336

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/ocitools-v2 that referenced this pull request Apr 7, 2018
The previous values were giving me:

  container_linux.go:348: starting container process caused
    "process_linux.go:402: container init caused
    \"process_linux.go:367: setting cgroup config for procHooks
    process caused \\\"failed to write 56892210544640 to
    hugetlb.1GB.limit_in_bytes: open
    /sys/fs/cgroup/hugetlb/.../hugetlb.1GB.limit_in_bytes: permission
    denied\\\"\""

The previous values are originally from 432615a (add cgroup hugetlb
test for runtime, 2017-12-05, opencontainers#93), which doesn't motivate their
choice.  The new values are copy/pasted from the spec [1] (which
doesn't motivate its choice either ;).  I've kept something like
Alban's comment from 984dbc8 (Fix error messages in validation cgroup
tests, 2018-03-14, opencontainers#605) to at least explain how the limit breaks
down.

In testing with my local system, the issue seems to be pageSize and
not the limit value.  That seems to be supported by the kernel docs,
which have [2]:

  hugepages=  [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
  hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
      On x86-64 and powerpc, this option can be specified
      multiple times interleaved with hugepages= to reserve
      huge pages of different sizes. Valid pages sizes on
      x86-64 are 2M (when the CPU supports "pse") and 1G
      (when the CPU supports the "pdpe1gb" cpuinfo flag).

My CPU supports both:

  $ cat /proc/cpuinfo | grep '^flags' | head -n1 | grep -o ' \(pse\|pdpe1gb\) '
   pse
   pdpe1gb

but I don't set hugepagesz, and I seem to only get 2M by default.  I
can get 1GB entries by booting with hugepagesz=1GB.  Longer-term, we
may want to auto-detect the value(s) currently enabled by the host
system, but for this commit I'm hard-coding 2MB.

[1]: https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config-linux.md#example-8
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt?h=v4.16#n1336

Signed-off-by: W. Trevor King <wking@tremily.us>
@dongsupark dongsupark deleted the alban/errors-typos branch May 22, 2018 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants