isolate is significantly slower with CG enabled #126

Open
sadfun opened this issue Jun 14, 2023 · 10 comments

sadfun (Contributor) commented Jun 14, 2023

On a fresh Ubuntu 22.04 installation, working with isolate in cgroups mode is dramatically slower than without it. This happens on both the master and cg2 branches.

# time isolate --init
/var/local/lib/isolate/0

real	0m0.001s
user	0m0.000s
sys	0m0.000s
# time isolate --init --cg
/var/local/lib/isolate/0

real	0m0.014s
user	0m0.001s
sys	0m0.000s

Same for run:

time isolate --run /usr/bin/echo

OK (0.000 sec real, 0.000 sec wall)

real	0m0.001s
user	0m0.001s
sys	0m0.000s
time isolate --run --cg /usr/bin/echo

OK (0.000 sec real, 0.023 sec wall)

real	0m0.025s
user	0m0.002s
sys	0m0.000s

Is there any workaround for this, maybe changes to isolate's code or some OS tweaking?

sadfun changed the title from "isolate is significantly slow with CG enableb" to "isolate is significantly slow with CG enabled" on Jun 14, 2023
sadfun changed the title from "isolate is significantly slow with CG enabled" to "isolate is significantly slower with CG enabled" on Jun 14, 2023

fushar (Member) commented Jun 14, 2023

I am using Isolate with CG on Ubuntu 22.04 and do not observe such a slowdown.

I disabled cgroups v2, though, using GRUB's systemd.unified_cgroup_hierarchy=false config. Can you try disabling it and see if you still have the issue?
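
For reference, on Ubuntu that boot parameter is usually added via /etc/default/grub followed by update-grub; a sketch (your existing GRUB_CMDLINE_LINUX_DEFAULT options will differ):

# /etc/default/grub  (illustrative; keep whatever options you already have)
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash systemd.unified_cgroup_hierarchy=false"

# regenerate the GRUB config and reboot
sudo update-grub
sudo reboot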

Note: I am just an Isolate user, not a maintainer. I honestly do not know how cgroups v1 vs v2 affects the correctness of execution sandboxing, at least for my project (an online judge for competitive programming).

sadfun (Contributor, Author) commented Jun 14, 2023

Yep, I tried isolate in two modes:

  • binary from master with GRUB options cgroup_enable=memory systemd.unified_cgroup_hierarchy=0
  • binary from cg2 without GRUB options

Both of them add 15-25 ms to initializing the sandbox or running a command in --cg mode; both do it quickly (~1 ms) without CG.

gollux (Member) commented Jun 18, 2023

Can you use strace -T to find out in which system calls the time is spent?
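
For example, something along these lines (output file names are arbitrary) records per-syscall timings while following child processes:

strace -f -T -o isolate-init.trace isolate --init --cg
strace -f -T -o isolate-run.trace isolate --run --cg /usr/bin/echo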

BTW do you have an application where this makes a difference?

sadfun (Contributor, Author) commented Jun 18, 2023

Here it is: isolate-init.txt, isolate-run.txt. This is the version from the master branch.

As far as I can see, during --init the longest syscall is mkdir("/sys/fs/cgroup/cpuset/box-0/", 0777) = 0 <0.013080> (13 ms).
And during --run it is wait4(53003, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={tv_sec=0, tv_usec=815}, ru_stime={tv_sec=0, tv_usec=0}, ...}) = 53003 <0.019579> (19 ms).

An application where this really makes a difference is a high-load online judge: with many small tests on which solutions typically run for 1-2 ms each, increasing the judging time by 10-20 times becomes the bottleneck :(

sadfun (Contributor, Author) commented Jun 26, 2023

It seems that this slowdown is not caused by isolate itself, but by the kernel's cgroups v1/v2 implementation.

One idea to improve the situation is to add a soft cleanup that does not remove the entire sandbox, but resets it to its original state without re-creating the control group. That would at least eliminate the --init overhead for the online-judge use case.

@fushar, you said that you do not experience such slowdowns, but it seems that these delays are present on every Linux system. Maybe the 30-40 ms difference is just not noticeable in your case? Could you please measure it?

gollux (Member) commented Jul 4, 2023

I will try profiling it using a system-wide profiler, but at the moment, stabilizing and merging support for cgroup2 has higher priority.

sadfun (Contributor, Author) commented Jul 4, 2023

Sure! For future reference: as profiled by @purplesyringa with strace -ff -T, the heaviest operation in --run is actually moving the process into the cgroup:

[pid 119538] openat(AT_FDCWD, "/sys/fs/cgroup/memory/box-0/tasks", O_WRONLY|O_TRUNC) = 3 <0.000010>
[pid 119538] write(3, "2\n", 2)         = 2 <0.016667>
...
[pid 119538] openat(AT_FDCWD, "/sys/fs/cgroup/cpuset/box-0/tasks", O_WRONLY|O_TRUNC) = 3 <0.000007>
[pid 119538] write(3, "2\n", 2)         = 2 <0.015644>

AlexVasiluta (Contributor) commented Dec 16, 2023

Hello! I wanted to chime in and suggest that the clone3 syscall with CLONE_INTO_CGROUP (cg2 branch only, though that isn't a problem) might yield a small performance improvement in adding the process to the cgroup, since the child would be created directly inside the target cgroup instead of being moved there afterwards. I have not tested this idea, but logically it would make sense.

Although it has no official glibc wrapper function, the clone3 syscall has been available since kernel 5.2 (5.7 with CLONE_INTO_CGROUP). Since isolate requires 5.19 (as stated in the manual) for properly reporting memory usage, I think we could make use of this feature.

AlexVasiluta (Contributor) commented:

Update: according to the man pages, using the clone3 syscall with CLONE_INTO_CGROUP would fix this issue:

Furthermore, spawning the child process directly into a target cgroup is significantly cheaper than moving the child process into the target cgroup after it has been created.

https://www.man7.org/linux/man-pages/man2/clone.2.html
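
For illustration, here is a minimal, untested sketch (not isolate's actual code) of spawning a child directly into a cgroup v2 directory with clone3() and CLONE_INTO_CGROUP, calling the syscall directly since glibc provides no wrapper. The cgroup path is just an example, and the caller needs permission to attach to it:

/* clone3_into_cgroup.c - illustrative sketch, requires Linux >= 5.7 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/sched.h>     /* struct clone_args, CLONE_INTO_CGROUP */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Open the target cgroup directory on the unified (v2) hierarchy. */
    int cgroup_fd = open("/sys/fs/cgroup/box-0", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (cgroup_fd < 0) {
        perror("open cgroup");
        return 1;
    }

    struct clone_args args;
    memset(&args, 0, sizeof(args));
    args.flags = CLONE_INTO_CGROUP;   /* create the child inside cgroup_fd */
    args.cgroup = cgroup_fd;
    args.exit_signal = SIGCHLD;

    long pid = syscall(SYS_clone3, &args, sizeof(args));
    if (pid < 0) {
        perror("clone3");
        return 1;
    }
    if (pid == 0) {
        /* Child: already a member of box-0, no write to cgroup.procs needed. */
        execl("/usr/bin/echo", "echo", "hello from inside the cgroup", (char *) NULL);
        _exit(127);
    }

    waitpid(pid, NULL, 0);
    close(cgroup_fd);
    return 0;
}

The point is that the parent never writes the child's PID into cgroup.procs, which is exactly the write that shows up as the slow part in the trace above.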

gollux (Member) commented Feb 28, 2024

Thanks for the idea, but I'm going to postpone it for a while, because I have been sitting on the cgroup v2 version for too long and I would like to release it soon. Also, CLONE_INTO_CGROUP is not supported by glibc yet, and calling syscalls directly could be non-portable ... need to check.
