'atomic images list' can break if system container is missing files/partially installed #1191
Comments
Thanks for the report, /me currently investigating it :-).
Update: it seems we trigger the failure when we mark images as used.
The way we check whether an image is in use is to look for the image name in each container's info file (e.g. /var/lib/containers/atomic/kube-apiserver/info for the example above). If it shows up there, we mark the image as used. Therefore, when the info file is not there, a failure happens. So yea, I believe it is an error on our end.

The expected behavior should be the following (I think):
1: When images are pulled to the host, we should always be able to list them, no matter whether their containers are valid or not. However, we could output a warning about the missing info file for containers.
2: When marking images as used, we can skip the "marking used" step if a container is found to be non-valid? (But that would mean we need to verify containers first, which would lead to performance issues...)

So, WDYT? @giuseppe

P.S: I don't mind working on the fix, just wanted to ask for opinions before doing so :-P.
@peterbaouoft if the info file cannot be opened then we need to skip the container, i.e. we need to catch the exception raised when opening it. If the image cannot be marked as used, then it is not a big problem, as the container is not correctly installed.
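A minimal sketch of the "skip the container" idea, not the actual atomic code. It assumes the info file is JSON with an "image" key; the function name `images_in_use` and the `containers_root` parameter are hypothetical, standing in for a directory such as /var/lib/containers/atomic.

```python
import json
import os
import warnings


def images_in_use(containers_root):
    """Collect image names referenced by readable container info files.

    Hypothetical helper: containers_root holds one subdirectory per
    container, each expected (by assumption) to contain a JSON "info" file.
    """
    used = set()
    for name in sorted(os.listdir(containers_root)):
        info_path = os.path.join(containers_root, name, "info")
        try:
            with open(info_path) as f:
                info = json.load(f)
        except (OSError, ValueError) as e:
            # Missing or unreadable info file: warn and skip this container
            # instead of failing the whole listing.
            warnings.warn("skipping container %s: %s" % (name, e))
            continue
        image = info.get("image") if isinstance(info, dict) else None
        if image:
            used.add(image)
    return used
```

With this shape, a partially installed container only costs a warning; images referenced by the remaining containers are still marked as used and the listing itself never aborts.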
As reported by projectatomic#1191, when info file is missing, atomic images list will unexpectedly fail. This patch is therefore to catch the error and instead output a warning to notify user.
Hmm, thinking more about it tho, this removal of info file error needs to be carefully handled. It will impact all other container-related actions as well. I found 3 atm, and I think at least Below are the examples of possible failures introduced with the above error:
yes, I agree with you that
As reported by projectatomic#1191, when info file is missing, atomic images list will unexpectedly fail. This patch is therefore to catch the error and instead output a warning to notify user.
This patch also adds an error handling case for following commands:
1: atomic uninstall $container
2: atomic containers update $container
3: atomic containers rollback $container
Now instead of erroring out, a warning is added to notify the user about the failure.
So yea... the issue can't seem to be easily solved without changing the architecture of the code (not sure if that is the right way to say it). I briefly mentioned the approaches I came up with in the PR. TL;DR, to make users to be able to perform The following is the content: There seemed to be an easy method IMO (haven't tried it yet). In However, as mentioned in the PR, the locations of other installed files are tracked inside the info file as well, so we are unable to remove those. To address that, we might want persistent storage to track those file locations? (not sure if it is worth tracking this corner case tho). Happy to hear your thoughts and maybe suggestions on this, and thanks in advance! :)
@peterbaouoft can you just ignore the issue when info is not found during "uninstall" and let the uninstall continue (i.e. disable/remove the systemd files + drop the checkouts)?
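The suggestion above could look roughly like the following sketch. Everything here is illustrative: `uninstall_best_effort`, its parameters, and the "installed-files" key are assumptions, not the atomic implementation. It only shows the control flow of continuing when the info file is missing.

```python
import json
import os
import shutil


def uninstall_best_effort(container_dir, unit_file):
    """Remove a container checkout and its unit file, tolerating a missing info file.

    Returns a list of warning strings instead of raising, so the caller
    can print them and carry on. All names here are hypothetical.
    """
    notes = []
    installed_files = []
    try:
        with open(os.path.join(container_dir, "info")) as f:
            # Assumption: extra installed file paths are listed in the info file.
            installed_files = json.load(f).get("installed-files", [])
    except (OSError, ValueError) as e:
        # Without info we cannot know about extra installed files; note it and continue.
        notes.append("info file unavailable (%s); extra installed files not removed" % e)
    for path in installed_files + [unit_file]:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already gone, nothing to do
    # Drop the checkout directory itself.
    shutil.rmtree(container_dir, ignore_errors=True)
    return notes
```

The trade-off matches the concern raised earlier in the thread: when the info file is gone, files installed outside the checkout cannot be located, so they are left behind and only reported.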
Hi, thanks for the reply. IIUC, both As a result, the container will not be included and I can start with the latter implementation first; it doesn't seem like it will take long to do.
In the error situation covered by RHBZ#1542144, it is possible to end up with a system container that is only partially installed.
In this partial-install state, the atomic images list functionality breaks and is unable to show any of the images on the host.
This can be reproduced more simply by just installing a system container and then removing the info file from the directory: