Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

checkpoint: check if system supports pre-dumping #1371

Merged
merged 1 commit into from
Apr 12, 2017

Conversation

adrianreber
Copy link
Contributor

Instead of relying on version numbers it is possible to check if CRIU
actually supports certain features. This introduces an initial
implementation to check if CRIU and the underlying kernel actually
support dirty memory tracking for memory pre-dumping.

Upstream CRIU also supports the lazy-page migration feature check and
additional feature checks can be included in CRIU to reduce the version
number parsing. There are also certain CRIU features which depend on one
side on the CRIU version but also require certain kernel versions to
actually work. CRIU knows if it can do certain things on the kernel it
is running on and using the feature check RPC interface makes it easier
for runc to decide if the criu+kernel combination will support that
feature.

Signed-off-by: Adrian Reber areber@redhat.com

@rh-atomic-bot
Copy link

257/264 passed on RHEL - Failed.
253/262 passed on CentOS - Failed.
262/263 passed on Fedora - Failed.
Log - https://aos-ci.s3.amazonaws.com/opencontainers/runc/runc-integration-tests-prs/293/fullresults.xml

@cyphar
Copy link
Member

cyphar commented Mar 17, 2017

/cc @avagin


logrus.Debugf("Feature check says: %s", c.criuFeat)
result := reflect.ValueOf(c.criuFeat)
request := reflect.ValueOf(criuFeat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need all this magic with reflect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I do not know what the RPC answer from CRIU will include. It depends on the CRIU version which features it can support and what the answer will include. To not hardcode all possible values in this function I am using reflection to understand which parameters are in the answer and if they match what I wanted to know from CRIU.

It is not really the most readable code, but I do not know how to flexible react on RPC answers without reflect.

The following is an example how it works. Not realistic but good enough to demonstrate how it should work. Instead of asking only for memory_tracking I am also asking for lazy_pages support in a CRIU version which does not know about the lazy RPC feature:

DEBU[0000] Using CRIU 20300 at: /usr/sbin/criu
DEBU[0000] Using CRIU with following args: [swrk 3]
DEBU[0000] Using CRIU in FEATURE_CHECK mode
DEBU[0000] Feature check says: type:FEATURE_CHECK success:true features:<mem_track:true >
DEBU[0000] Feature check says: mem_track:true
DEBU[0000] CRIU does not support requested feature LazyPages
DEBU[0000] CRIU request option LazyPages with value true
DEBU[0000] CRIU result option LazyPages with value false
ERRO[0000] container is not destroyed
ERRO[0000] CRIU is missing features

CRIU is missing features

So, although I am asking for MemTrack and LazyPages, CRIU answers only with MemTrack (as it does not know anything about LazyPages) and the checkFeature code can handle the answer, even if not all expected fields are filled out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrianreber I'm not sure about this way. I would prefer to update criurpc and access known fields directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrianreber Another idea is to change a criu rpc format how these features are reported. We can return array of strings with all available features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, thinking about this.

If running with an older CRIU version which does have fewer feature check options... What will happen if I try to access those non-existing fields? I guess that is something which will just work thanks to protobuf.

Not sure about your array of strings idea. That sounds not like something which should be done as protobuf should do it for us.

I will try to change it to explicitly only verify features it actually knows about and remove the reflection code.

@adrianreber
Copy link
Contributor Author

Hmm, I pushed more commits than I wanted. Something was wrong with my rebase.

@adrianreber
Copy link
Contributor Author

Another git push removed the additional commits. Now it is correct.

@adrianreber
Copy link
Contributor Author

@avagin: This version is much simpler and readable. No more reflection used.

@rh-atomic-bot
Copy link

268/275 passed on RHEL - Failed.
259/273 passed on CentOS - Failed.
273/274 passed on Fedora - Failed.
Log - https://aos-ci.s3.amazonaws.com/opencontainers/runc/runc-integration-tests-prs/313/fullresults.xml

@@ -43,6 +43,7 @@ type linuxContainer struct {
criuVersion int
state containerState
created time.Time
criuFeat *criurpc.CriuFeatures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

criuFeat isn't a container property, so I think we can have a global variable for it. What do you think about this idea?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does not really belong in the linuxContainer struct. Let me try it as a global variable.

Instead of relying on version numbers it is possible to check if CRIU
actually supports certain features. This introduces an initial
implementation to check if CRIU and the underlying kernel actually
support dirty memory tracking for memory pre-dumping.

Upstream CRIU also supports the lazy-page migration feature check and
additional feature checks can be included in CRIU to reduce the version
number parsing. There are also certain CRIU features which depend on one
side on the CRIU version but also require certain kernel versions to
actually work. CRIU knows if it can do certain things on the kernel it
is running on and using the feature check RPC interface makes it easier
for runc to decide if the criu+kernel combination will support that
feature.

Feature checking was introduced with CRIU 1.8. Running with older CRIU
versions will ignore the feature check functionality and behave just
like it used to.

v2:
 - Do not use reflection to compare requested and responded
   features. Checking which feature is available is now hardcoded
   and needs to be adapted for every new feature check. The code
   is now much more readable and simpler.

v3:
 - Move the variable criuFeat out of the linuxContainer struct,
   as it is not container specific. Now it is a global variable.

Signed-off-by: Adrian Reber <areber@redhat.com>
@adrianreber
Copy link
Contributor Author

@avagin : now with a global variable

@avagin
Copy link
Contributor

avagin commented Apr 6, 2017

LGTM

Approved with PullApprove

@adrianreber
Copy link
Contributor Author

ping - any other reviewers?

@crosbymichael
Copy link
Member

crosbymichael commented Apr 12, 2017

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit f8ce01d into opencontainers:master Apr 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants