image-pruner: prune images in their own jobs #19468
Conversation
Publishing this in an early state to gather some feedback. @dmage, @legionus, @bparees, @coreydaley PTAL
A dynamically updated graph will not help to achieve greater consistency. You wrote very complicated code and it will speed up the work of the pruner, but it does not help with maintaining the integrity of the database. Removing an image is not atomic, and in case of an error you will break the database. In light of that, I believe this approach is not effective. I would suggest a different approach to pruning:
return
}
out <- *w.prune(job)
}
for job := range in {
out <- *w.prune(job)
}
In this case you don't need nil sentinels and you need just to close the channel to stop processing.
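For illustration, a minimal, self-contained sketch of the range-over-channel pattern being suggested; the Job, JobResult and worker types and the prune body are placeholders, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// Job and JobResult stand in for the pruner's real types.
type Job struct{ Image string }
type JobResult struct{ Image string }

type worker struct{}

func (w *worker) prune(job Job) *JobResult {
	// Real work would delete the image's layers, config and manifest here.
	return &JobResult{Image: job.Image}
}

// run consumes jobs until the in channel is closed; no nil sentinel needed.
func (w *worker) run(in <-chan Job, out chan<- JobResult, wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range in {
		out <- *w.prune(job)
	}
}

func main() {
	in := make(chan Job)
	out := make(chan JobResult)
	var wg sync.WaitGroup

	for i := 0; i < 3; i++ {
		wg.Add(1)
		go (&worker{}).run(in, out, &wg)
	}
	go func() {
		for _, img := range []string{"sha256:aaa", "sha256:bbb"} {
			in <- Job{Image: img}
		}
		close(in) // closing the channel stops all workers
		wg.Wait()
		close(out)
	}()
	for res := range out {
		fmt.Println("pruned", res.Image)
	}
}
```

Closing the in channel is the only termination signal the workers need, which is what makes the nil sentinel unnecessary.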
Good idea, I'll check that out.
Update: rewritten
return true
select {
Why do you need select there?
It's supposed to handle events either from workers, image streams listener or image listener (which are not there yet).
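As an illustration of why a select is needed once several event sources feed the same loop, a minimal sketch (channel and type names are made up, not the PR's actual ones):

```go
package main

import "fmt"

type JobResult struct{ Image string }

// runLoop drains three event sources until the results channel is closed.
func runLoop(results <-chan JobResult, isEvents, imgEvents <-chan string) {
	for {
		select {
		case res, ok := <-results:
			if !ok {
				return // all jobs finished
			}
			fmt.Println("job done:", res.Image)
		case name := <-isEvents:
			// An image stream changed while pruning; re-check its references.
			fmt.Println("image stream updated:", name)
		case name := <-imgEvents:
			fmt.Println("image updated:", name)
		}
	}
}

func main() {
	results := make(chan JobResult, 1)
	results <- JobResult{Image: "sha256:abc"}
	close(results)
	runLoop(results, make(chan string), make(chan string))
}
```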
// UnreferencedImageComponentEdgeKind is an edge from an ImageNode to an ImageComponentNode denoting that
// the component is currently being unreferenced in a running job.
UnreferencedImageComponentEdgeKind = "UnreferencedImageComponentToDelete"
This feels very artificial. Why do we need these "negative" references?
Can we structure the pruner such a way that we remove only nodes which don't have any references?
Yeah, I see now that it makes the algorithm more complex without adding value. I wanted to somehow track the blobs being deleted (in the jobs running right now) inside the graph, but it just adds inefficiency and complexity. The tracking can easily be done outside of the graph.
If it weren't for the parallelism, the tracking wouldn't be necessary at all.
Can we structure the pruner such a way that we remove only nodes which don't have any references?
To answer that, we need to first answer https://github.com/openshift/origin/pull/19468/files/99829b95497e6c39d0bdafc4fa00b6f017e23a6e#r183420937. If we continue to stick with the current behaviour of keeping the image when any error happens for its components, then the unreferencing won't happen until the objects are deleted.
But I agree it would be more natural and it would simplify the algorithm.
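For illustration, the "remove only nodes without references" idea amounts to checking whether anything still refers to a component before deleting it; a minimal sketch with a plain map instead of the PR's genericgraph package:

```go
package main

import "fmt"

// referrers maps a component (layer/config/manifest digest) to the set of
// images and image streams that still reference it. Illustrative only.
type referrers map[string]map[string]struct{}

// unreference drops one reference and reports whether the component became
// prunable, i.e. nothing references it any more.
func (r referrers) unreference(component, owner string) bool {
	delete(r[component], owner)
	return len(r[component]) == 0
}

func main() {
	refs := referrers{
		"sha256:layer1": {"image-a": {}, "image-b": {}},
		"sha256:layer2": {"image-a": {}},
	}
	for _, c := range []string{"sha256:layer1", "sha256:layer2"} {
		if refs.unreference(c, "image-a") {
			// Only layer2 becomes prunable; layer1 is still used by image-b.
			fmt.Println("prunable:", c)
		}
	}
}
```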
+1 on tracking this elsewhere.
Reworked and simplified a bit.
return
}

func strenghtenReferencesFromFailedImageStreams(g genericgraph.Graph, failures []Failure) { |
s/strenghten/strengthen/
You're not going to believe it, but my offline dictionary contains both 😄. I cannot find it online though so I'm going to trust you.
Fixed
p.g.AddEdge(imageStreamNode, s, ReferencedImageManifestEdgeKind)
break
default:
panic(fmt.Sprintf("unhandeled image component type %q", cn.Type))
Can we return this error?
Well, I don't expect this to ever fire. But sure.
Also, it should be "unhandled".
But yeah, I don't think we ever want to panic, even in the CLI.
Fixed.
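A minimal sketch of the agreed shape of the fix, returning an error instead of panicking on an unexpected component type; the kinds and edge-kind strings below are simplified placeholders, not the PR's actual imagegraph types:

```go
package main

import "fmt"

// Placeholder component kinds standing in for the PR's imagegraph node types.
type componentKind string

const (
	kindLayer    componentKind = "layer"
	kindConfig   componentKind = "config"
	kindManifest componentKind = "manifest"
)

// edgeKindFor maps a component kind to an edge kind, reporting unknown kinds
// as an error instead of panicking.
func edgeKindFor(kind componentKind) (string, error) {
	switch kind {
	case kindLayer:
		return "ReferencedImageLayer", nil
	case kindConfig:
		return "ReferencedImageConfig", nil
	case kindManifest:
		return "ReferencedImageManifest", nil
	default:
		return "", fmt.Errorf("unhandled image component type %q", kind)
	}
}

func main() {
	if _, err := edgeKindFor("bogus"); err != nil {
		fmt.Println(err)
	}
}
```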
res := &JobResult{Job: job}

// If namespace is specified prune only ImageStreams and nothing more.
if len(w.algorithm.namespace) > 0 {
When may this condition become true?
Good catch, the statement was misplaced. It's removed now.
@legionus I don't see how. The aim of this PR is not to make the pruning safer. The purpose is to tolerate errors that prevent customers from pruning anything, by doing the pruning iteratively, image after image, which brings a vague atomicity to the pruning. I don't think this change prevents us from making the pruning safer in the future. It does not prevent us from implementing
@@ -333,6 +356,7 @@ func (p *pruner) addImagesToGraph(images *imageapi.ImageList) []error {
//
// addImageStreamsToGraph also adds references from each stream to all the
// layers it references (via each image a stream references).
// TODO: identify streams with non-existing images for later cleanup
func (p *pruner) addImageStreamsToGraph(streams *imageapi.ImageStreamList, limits map[string][]*kapi.LimitRange) []error {
This function always returns nil. In case of an error, it panics.
Good catch. I hadn't even realized that. I lean more towards panics for errors aimed not at the end user but at a developer (saying, your code is buggy, go fix it), such as this one. But I don't feel that strongly about it, so I'll look into returning it as an error.
Error reported now without a panic.
case *imagegraph.ImageStreamNode:
// ignore
default:
panic(fmt.Sprintf("unhandeled graph node %t", d.Node))
"it just attempts to make it error tolerable"? :)
I can return the error for sure. This was meant rather as a debug statement until all the types got handled.
Not panicking any more.
if len(res.Failures) > 0 {
// TODO: include image as a failure as well for the sake of summary's completness
return res
If w.algorithm.pruneRegistry == true, then here you will end up with a broken Image. This image will not have blobs, but the registry will assume that these objects are there. This is a very bad condition. It would be more correct to delete the image and then, if there are no errors, try to delete the objects.
True. I'm keeping the current behaviour that preserves image objects for future prunes so that blobs that failed to be pruned are recognized on subsequent prunes. If we remove the image, we lose the option to prune its blobs using this kind of pruner.
But since we have a hard-pruner now, we could do the removal. I just hesitate to make it the default behaviour. Maybe a flag --keep-broken-images could be useful here. WDYT?
A scenario speaking against deleting the image regardless of prior failures is that the registry lost write permission to the storage and the deletion of each blob fails (this has already happened to some customers). In this case all the images are deleted but all the blobs remain on the storage.
Is it worse than broken images? Hard to tell. It depends on what the customer wants to achieve: either they're running out of storage space, or etcd is too big and slow, or even both.
Therefore I'm more inclined towards having a flag like --keep-broken-images-on-failure defaulting to true (current behaviour) so that the customer can choose what they prefer.
@bparees thoughts?
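For illustration only, a hypothetical wiring of such a flag using the standard library's flag package (the flag was ultimately not added, as the rest of the thread shows):

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Hypothetical flag discussed above; it never landed in the PR.
	keepBroken := flag.Bool("keep-broken-images-on-failure", true,
		"keep the image object when some of its blobs fail to be deleted")
	flag.Parse()
	fmt.Println("keep broken images:", *keepBroken)
}
```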
As you note, hard-prune can be used to resolve that scenario (blobs being left in storage). I don't think we need to introduce new flags for something that I would hope is not a typical scenario.
I'm just afraid that hard-prune is not a very popular solution to this problem and am reluctant to make it a mandatory post-routine.
It won't be mandatory unless there are other issues in their registry. If we can't delete layers due to storage issues, they're going to have to fix their storage issues anyway... running hard prune again after doing so seems like a small additional burden.
Is there some other case where you think it would be better to leave the image data in place if we cannot remove the blob data?
Is there some other case where you think it would be better to leave the image data in place if we cannot remove the blob data?
Just a few similar scenarios where the user mainly wants to prune the storage, such as connection issues to the registry, or the registry not considering the user authorized to do the prune.
What about keeping the image only if there were no successful blob deletions? So that broken images get pruned, but healthy images can either be reused or pruned next time.
In this case all images are deleted but all the blobs remain on the storage.
Is it worse than broken images?
@miminar No, broken images are much worse. We have a few ways to remove blobs completely or even restore images in etcd, but we do not have tools to restore blobs in the storage to fix images. We have only a diagnostic tool which can help to find such broken images. The user has to remove the image from etcd and re-push it by hand to fix it. Until they do this, these images will be broken and there will be errors on push and pull. It seems to me that this is a much worse scenario, because the cluster goes into an inconsistent state and the correction requires manual intervention.
What about keeping the image only if there were no successful blob deletions?
Went for this one. Please let me know if you have any objections.
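A minimal sketch of the rule settled on here, assuming a made-up blobResult type rather than the PR's actual JobResult: the image object is kept only when none of its blob deletions succeeded, so intact images can be retried later while partially pruned ones do not linger as broken images.

```go
package main

import "fmt"

// blobResult records whether deleting one of an image's blobs succeeded.
type blobResult struct {
	Digest  string
	Deleted bool
}

// shouldDeleteImage implements the behaviour the thread settles on: once any
// blob of the image has been deleted, keeping the image object would leave it
// broken, so the image is deleted too; if nothing was deleted, the image is
// kept intact and can be retried by a later prune.
func shouldDeleteImage(results []blobResult, anyFailure bool) bool {
	if !anyFailure {
		return true // everything pruned cleanly
	}
	for _, r := range results {
		if r.Deleted {
			return true // partially pruned: keeping the image would leave it broken
		}
	}
	return false // nothing was deleted: keep the image for a future prune
}

func main() {
	fmt.Println(shouldDeleteImage([]blobResult{{"sha256:a", true}, {"sha256:b", false}}, true))  // true
	fmt.Println(shouldDeleteImage([]blobResult{{"sha256:a", false}, {"sha256:b", false}}, true)) // false
}
```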
return
}

func strenghtenReferencesFromFailedImageStreams(g genericgraph.Graph, failures []Failure) {
needs godoc
Godoc'd
Rebased and added imagestream event handling.
Added image event handling.
/retest
Ready for review.
g.internal.SetEdge(t, 1.0)
g.internal.SetEdge(t)
}
case simple.Edge:
It seems that before changes this function was able to handle only genericgraph.Edge. Is this case ever executed?
concrete.WeightedEdge used to be handled before; simple.Edge is its equivalent. But no, we don't use upstream edges in our code directly. Nothing prevents us from doing that in the future though.
resultChan <-chan JobResult,
) (deletions []Deletion, failures []Failure) {
imgUpdateChan := p.imageWatcher.ResultChan()
isUpdateChan := p.imageStreamWatcher.ResultChan()
what's the purpose of adding these watches? Trying to handle the case that a reference to a layer is added to an image or imagestream while we're in the middle of pruning? It seems unrelated to the fundamental purpose of the PR (to parallelize the image pruning operations)
what's the purpose of adding these watches? Trying to handle the case that a reference to a layer is added to an image or imagestream while we're in the middle of pruning?
exactly
It seems unrelated to the fundamental purpose of the PR
Yes, it's unrelated, but the way the code is structured, it's now easy to add that safety mechanism. From the past we know that pruning can take hours to complete, and that's far too big a window for changes and inconsistencies to happen.
For the sake of simplicity (or rather lesser complexity), I can extract that and move to a follow-up if desired.
It's fine, I just wanted to understand why it was being done here. Thanks.
imagegraph "github.com/openshift/origin/pkg/oc/graph/imagegraph/nodes"
)

// ComponentRetention knows all the places where image componenet needs to be pruned (e.g. global blob store
s/componenet/component/
fixed
/approve
Rebased. /test extended_image_registry
/hold cancel
/retest Please review the full test history for this PR and help us cut down flakes.
Bumped to the kube level. Signed-off-by: Michal Minář <miminar@redhat.com>
Signed-off-by: Michal Minář <miminar@redhat.com>
Instead of pruning in phases:
all streams -> all layers -> all blobs -> manifests -> images
Prune individual images in parallel jobs:
all streams -> parallel [ image1's layers -> image1's blobs -> ... -> image1, image2's layers -> image2's blobs -> ... -> image2, ... ]
A failure in the streams prune phase is not fatal anymore.
Signed-off-by: Michal Minář <miminar@redhat.com>
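A rough sketch of the restructured flow the commit message describes, with stream references handled first and each image then pruned in its own parallel job; all names below are illustrative placeholders, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// pruneImage stands in for one job: delete the image's layers, config and
// manifest from the registry, then the image object itself.
func pruneImage(name string) error {
	fmt.Println("pruned", name)
	return nil
}

func main() {
	images := []string{"image-1", "image-2", "image-3"}

	// Phase 1: prune image stream references. A failure here is recorded
	// but no longer aborts the whole run.
	fmt.Println("pruned stream references")

	// Phase 2: prune each image in its own job, a few at a time.
	jobs := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range jobs {
				if err := pruneImage(name); err != nil {
					fmt.Println("failed to prune", name, err)
				}
			}
		}()
	}
	for _, name := range images {
		jobs <- name
	}
	close(jobs)
	wg.Wait()
}
```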
Rebased
@deads2k any concerns w/ the bump commit here?
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bparees, miminar
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Instead of pruning in phases:
Prune individual images in parallel jobs:
A failure in the streams prune phase is not fatal anymore.
Resolves: rhbz#1567657
Additionally, manifest blobs previously weren't removed from the blob store. This PR removes the manifests of deleted images from the blob store as well.
Remarks:
TODOs: