Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There is a race between firecracker-containerd replacing the stub drive with the actual drive and mounting this drive. When the disk is replaced the kernel will schedule asynchronous work in virtblk_config_changed. In the meantime firecracker-containerd can proceed and already send a mount command to the agent running in the guest. This mount operation will, however, fail because the guest kernel still sees the stub drive with only 512 bytes in size. The resulting error code is a ENOMEM in this case. This commit therefore adds this as an retryable error code to accommodate for this situation.
The issue can be reproduced when an artificial msleep(1000) is added in virtblk_config_changed_work. This produced the following error:
error="failed to get stub drive for task "test": failed to mount
drive inside vm: failed to mount newly patched drive: rpc error: code
= Unknown desc = non-retryable failure mounting drive from
"/dev/vdb" to "/container/test/rootfs": cannot allocate memory"
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.