-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial checkpoint support #641
Conversation
5c4fca1
to
7ae3c3c
Compare
Codecov Report
@@ Coverage Diff @@
## main #641 +/- ##
==========================================
- Coverage 69.69% 69.21% -0.49%
==========================================
Files 84 85 +1
Lines 10983 11059 +76
==========================================
- Hits 7655 7654 -1
- Misses 3328 3405 +77 |
7ae3c3c
to
e62e325
Compare
Hmm, the test succeeds locally with runc. Need to figure out why it does not work in CI. |
78cde5e
to
e9b0ed9
Compare
Tests are still failing as I only tested on a cgroup v2 system. v1 needs some additional code. |
1dd0f57
to
63105b0
Compare
} | ||
|
||
let directory = std::fs::File::open(&opts.image_path)?; | ||
criu.set_images_dir_fd(directory.as_raw_fd()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to create the directory specified with image_path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to create the directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to create it, but we could if it does not exist. I think in Podman we create it to make sure it gets created with 0600. We could do it here. Or not. I have not strong opinion. If it does not exist the user will see an error. By creating the directory we could avoid that error. On the other hand, if it does not exist then the user has to make a conscious decision where the checkpoint directory is. I have a couple of checkpoint directories all around my development system because the directory is automatically created. Both options have advantages and disadvantages.
We could leave it for now and if users are complaining then it can still be added later. Or if one of the container engines needs it, it can also be added later.
63105b0
to
087fa93
Compare
Ready for review. All tests are successful. |
Thanks for your PR. But I don't have the time to review this PR on my weekday. So I'll check on next holiday, please wait just a moment 🙏 |
No problem. |
} | ||
|
||
let directory = std::fs::File::open(&opts.image_path)?; | ||
criu.set_images_dir_fd(directory.as_raw_fd()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to create the directory?
Did not manage to go through everything today, so feel free to wait until I am finished. |
087fa93
to
8d44706
Compare
Changed as suggested by @Furisto |
8d44706
to
605766b
Compare
pub fn checkpoint(args: Checkpoint, root_path: PathBuf) -> Result<()> { | ||
log::debug!("start checkpointing container {}", args.container_id); | ||
let mut container = load_container(root_path, &args.container_id)?; | ||
let opts = libcontainer::container::CheckpointOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you implement impl From<CheckpointOption> from Checkpoint
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this question. Maybe my rust knowledge is not good enough. Can you be more specific about what needs to change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrianreber I'm sorry for my inadequate explanation. Why don't you implement this trait? I feel it would be more Rust-like if the conversion from Checkpoint
to CheckpointOptions
could be done using from()
. WDYT?
https://doc.rust-lang.org/std/convert/trait.From.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay. I tried it, but I am not sure where. If I try to implement it in the youki create I get:
impl doesn't use only types from inside the current crate
If I add it to the liboci_cli crate I need to access information from libcontainer and also the other way around. Not sure if that is desired or even possible.
Any recommendations where I need to add the corresponding implementation?
Great idea. I think we can represent the WIP by adding it to the feature table in the README.md, WDYT?
|
@@ -36,9 +36,14 @@ jobs: | |||
uses: Swatinem/rust-cache@v1 | |||
- run: sudo apt-get -y update | |||
- run: sudo apt-get install -y pkg-config libsystemd-dev libdbus-glib-1-dev libelf-dev libseccomp-dev | |||
- name: Install runc 1.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to install this version of runc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runc 1.1.0 changed how it handles --work-dir
(or --work-path
). Before 1.1.0 runc would create a directory called criu-work
if the user does not specify a working directory. CRIU's default behaviour is to put the log and statistics files in the image path if work dir is not set. With runc 1.1.0 runc behaves like CRIU. That is also what I did for this PR and therefore the tests do not pass with older versions of runc.
In this PR or in another PR? |
@adrianreber Can I ask for this PR? |
605766b
to
a0aef0e
Compare
Did most of the suggested changes (besides the one with |
0..2 does not include 2. Change it to 0..3 to include 2. Signed-off-by: Adrian Reber <areber@redhat.com>
This adds the first code to checkpoint a container. The checkpoint command is name 'checkpointt' (with two 't's at the end) so that container engines like Podman do not think to use this not yet finished checkpoint restore implementation. For Podman it is still necessary to tell CRIU that the network namespace is external at least and restoring needs special handling to support '--console-socket'. Signed-off-by: Adrian Reber <areber@redhat.com>
If stdin is not pointing to /dev/null checkpointing fails for now. Just point it to /dev/null. Signed-off-by: Adrian Reber <areber@redhat.com>
This still uses the subcommand 'checkpointt' until it works in combination with Podman. Signed-off-by: Adrian Reber <areber@redhat.com>
a0aef0e
to
0547fc2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrianreber LGTM 💯
Thanks!!!
This is the initial draft for checkpoint support. No restore support yet.
Tests are still missing and it relies on the not yet officially release Rust CRIU bindings. For now it is just a draft.
It depends on #632
I called the checkpoint command ´checkpointt' with two
t
s so that Podman does not think youki already fully supports checkpoint/restore. Once everything works I will rename it.