Warehouses show in progress for whole time of "deployment"? #2868

ddeath · 2024-10-29T10:44:14Z

ddeath
Oct 29, 2024

Hi, I noticed that currently, when a Warehouse hits its scheduled time, it seems that it not only pulls new artifacts but also "promotes" them while showing a loading indicator on the Warehouse:

I would expect a Warehouse refresh to be really quick (isn't it just one API call to the container registry?), and then for it to "progress" to the Stage, with the Stage becoming "in-progress".

Currently, the refresh above takes 52 seconds, and once it is done the new version is instantly shown as "promoted"?

I am not sure if this is expected but it confuses me so I asked.

krancour · 2024-10-29T13:54:30Z

krancour
Oct 29, 2024
Maintainer

> Hi, I noticed that currently, when a Warehouse hits its scheduled time, it seems that it not only pulls new artifacts but also "promotes" them while showing a loading indicator on the Warehouse

They're not really connected. Warehouses find artifacts; when they find new ones, they produce new Freight, if configured to do so. Any new Freight produced by a Warehouse comes at the end of its reconciliation process.

Stages, if configured to auto-promote, will promote the latest Freight they find.

What you've observed is two different processes that are only indirectly related and just happen to be running at the same time.

> I would expect a Warehouse refresh to be really quick (isn't it just one API call to the container registry?)

This is not true, unfortunately. When container images are involved, Warehouse reconciliations are some of the slowest and most network-heavy operations in Kargo, but that's heavily dependent on choices you make in configuring your Warehouses.

When a Warehouse with an image subscription reconciles, it does not find the one image that best matches your selection criteria. It finds the last n (n is configurable). The reason for this is that it's very possible that in between two Warehouse reconciliations, multiple new versions of an image that fit your selection criteria have been pushed to the registry. Imagine a scenario where you select the latest 1.0.x image, and in the time since the last reconciliation, 1.0.1 and 1.0.2 were both pushed to the registry. If n were a hard-coded 1, the Warehouse would produce a piece of Freight containing version 1.0.2, and even if you wanted to move 1.0.1 through your pipeline, you'd be unable to. Finding n > 1 (again, configurable) enables a feature called the "Freight Lab" where you can build a piece of Freight yourself by selecting from available artifacts. That is, even though the Warehouse wouldn't have automatically produced Freight containing image 1.0.1 in this scenario, you have a way to go create it.
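To make the 1.0.x scenario concrete, here is a minimal sketch of "keep the newest n matches" over a tag list. It is not Kargo's implementation: the `version` type and the `parseVersion` and `newestMatching` helpers are hypothetical names invented for illustration, and the parser only understands plain MAJOR.MINOR.PATCH tags.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// version is a minimal MAJOR.MINOR.PATCH representation used only for this sketch.
type version struct {
	major, minor, patch int
	tag                 string
}

// parseVersion parses tags such as "1.0.2"; anything else is reported as not ok.
func parseVersion(tag string) (version, bool) {
	parts := strings.Split(tag, ".")
	if len(parts) != 3 {
		return version{}, false
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return version{}, false
		}
		nums[i] = n
	}
	return version{nums[0], nums[1], nums[2], tag}, true
}

// newestMatching returns up to limit tags from the given major.minor line, newest first.
func newestMatching(tags []string, major, minor, limit int) []string {
	var matches []version
	for _, t := range tags {
		v, ok := parseVersion(t)
		if ok && v.major == major && v.minor == minor {
			matches = append(matches, v)
		}
	}
	sort.Slice(matches, func(i, j int) bool { return matches[i].patch > matches[j].patch })
	if len(matches) > limit {
		matches = matches[:limit]
	}
	out := make([]string, 0, len(matches))
	for _, v := range matches {
		out = append(out, v.tag)
	}
	return out
}

func main() {
	tags := []string{"1.0.0", "1.0.1", "1.0.2", "1.1.0", "latest"}
	fmt.Println(newestMatching(tags, 1, 0, 1)) // [1.0.2]
	fmt.Println(newestMatching(tags, 1, 0, 3)) // [1.0.2 1.0.1 1.0.0]
}
```

With a limit of 1, only 1.0.2 is ever visible; with a larger limit, 1.0.1 remains available to pick by hand.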

So discovery of n images meeting your selection criteria is one reason that it's not just one call to the registry. There are others, and it's heavily dependent on what sort of selection criteria you use. If, for instance, your selection criteria were that you wanted the most recently pushed image from your image repo, regardless of how it's tagged, Kargo has no choice but to pull down the metadata for possibly many, many images (often two API calls each) to find date information that is not present in the tag list that's retrieved up front. The number of images Kargo needs to pull metadata for can be pared down up front by other constraints you put on the selection, such as using a regex to narrow down eligible images by tag.
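The effect of narrowing by tag can be sketched in a few lines. This is only an illustration of the idea, not Kargo code: `filterTags`, the sample tags, and the `^main-[0-9a-f]+$` pattern are all made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterTags keeps only the tags that match the pattern. In this sketch it stands in
// for the narrowing step that happens before any per-image metadata would be fetched.
func filterTags(tags []string, pattern string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var kept []string
	for _, t := range tags {
		if re.MatchString(t) {
			kept = append(kept, t)
		}
	}
	return kept, nil
}

func main() {
	// Hypothetical tag list as it might come back from a single tag-list call.
	tags := []string{"main-a1b2c3", "main-d4e5f6", "pr-101", "pr-102", "nightly", "v1.0.0"}
	kept, err := filterTags(tags, `^main-[0-9a-f]+$`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d of %d tags still need metadata lookups: %v\n", len(kept), len(tags), kept)
}
```

Every tag dropped here is an image whose metadata never has to be requested from the registry.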

Still another factor is whether you have included any platform constraints in your selection criteria. If, for instance, you know there are (non-multiarch) images in your repo for both amd64 and arm64 architectures and you wish to be certain that you're selecting one or the other and have specified so in your selection criteria, this is another case where Kargo will be forced to pull down additional metadata (again, often two calls each).

What all of this adds up to is: if you're not careful about your selection criteria, or you set n high, a Warehouse can be very, very chatty with a registry. Even under ideal conditions where n is set low, and your selection criteria allow n to be chosen from the top of the sorted tag list, there are still at least n + 1 calls to the registry because metadata needs to be pulled to get digest information for each tag.
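As a back-of-envelope illustration of those numbers, the sketch below models one reconciliation as one tag-list call plus either n digest lookups (the ideal case) or roughly two metadata calls per candidate image (when date or platform information is needed). The `estimateCalls` function and its parameters are hypothetical, and the model is a deliberate simplification of the description above, not a measurement of Kargo's behavior.

```go
package main

import "fmt"

// estimateCalls gives a rough count of registry calls for one reconciliation.
// candidates is the number of tags left after any up-front filtering, n is the
// number of images to discover, and needsMetadata says whether the selection
// depends on date or platform information that is absent from the tag list.
func estimateCalls(candidates, n int, needsMetadata bool) int {
	const tagListCalls = 1
	if n > candidates {
		n = candidates
	}
	if needsMetadata {
		// Assumption taken from the explanation above: often two calls per image.
		return tagListCalls + 2*candidates
	}
	// Ideal case: one lookup per selected tag to obtain its digest.
	return tagListCalls + n
}

func main() {
	fmt.Println(estimateCalls(500, 5, false)) // 6
	fmt.Println(estimateCalls(500, 5, true))  // 1001
	fmt.Println(estimateCalls(40, 5, true))   // 81
}
```

The last two lines show why narrowing the candidate set matters far more than n once date or platform metadata is involved.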

Exacerbating all of this, with Warehouses being as chatty as they are, it's easy to hit your registry's rate limit, which can slow things down further, especially if, cluster-wide, you've got dozens or hundreds of Warehouses all talking to the same registry.

The bottom line is:

- Expect Warehouses to take some time to reconcile, even under ideal conditions.
- Heavily favor:
  - A less frequent Warehouse reconciliation interval
  - A low n
  - Selection criteria that do not rely on date or platform metadata
- Use a paid plan for your registry, when applicable/possible, to increase your rate limit.

We do have some additional things we're considering to further reduce chattiness with registries in the scenarios that are already the most ideal, but they're going to be opt-in things like swearing a solemn oath (this is tongue-in-cheek) that you never use mutable tags and that it is therefore safe for Kargo to make fewer calls to the registry by ignoring digests.

ddeath · 2024-10-29T14:12:14Z

ddeath
Oct 29, 2024
Author

Thank you for the detailed explanation. Now it makes more sense.

I think it would be beneficial to have a section in the docs for this, or at least an FAQ entry that mentions it. I noticed that ImageSelectionStrategy is not mentioned at all in the docs. A selection strategy is mentioned, but only as a side comment. Also, the CRD descriptions do not list all the allowed values for this field; the place where I found them is here:

kargo/ui/src/gen/schema/warehouses.kargo.akuity.io_v1alpha1.json

Line 184 in b2be266

"Digest",

3 replies

krancour Oct 29, 2024
Maintainer

You're right that this needs a more detailed explanation in docs. We are actively working on improving documentation, but fwiw, detailed CRD docs have always been available. There's a link to them at the bottom of the left sidebar.

https://doc.crds.dev/github.com/akuity/kargo

ddeath Oct 29, 2024
Author

Yes, but they don't list the values for ImageSelectionStrategy. It is marked there as String only.

krancour Oct 29, 2024
Maintainer

Yeah... after I linked you to it, I was looking at it and was surprised to see that info was missing.

That field is, in fact, not directly a string. It is defined like this:

// +kubebuilder:validation:Enum={Digest,Lexical,NewestBuild,SemVer}
type ImageSelectionStrategy string

I am surprised to learn that the enum constraint on that type is not showing up in these docs.

Noted.
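For readers who want to see what that enum marker implies, here is a small sketch that mirrors the four allowed values in plain Go. The type name and the values come from the snippet quoted above; the constant names and the `Valid` method are hypothetical helpers written for this illustration and are not claimed to match Kargo's source.

```go
package main

import "fmt"

// ImageSelectionStrategy mirrors the type quoted above. Everything else in
// this file is a hypothetical helper for illustration.
type ImageSelectionStrategy string

const (
	StrategyDigest      ImageSelectionStrategy = "Digest"
	StrategyLexical     ImageSelectionStrategy = "Lexical"
	StrategyNewestBuild ImageSelectionStrategy = "NewestBuild"
	StrategySemVer      ImageSelectionStrategy = "SemVer"
)

// Valid reports whether s is one of the four values named in the enum marker.
// The comparison is case-sensitive, as enum validation is.
func (s ImageSelectionStrategy) Valid() bool {
	switch s {
	case StrategyDigest, StrategyLexical, StrategyNewestBuild, StrategySemVer:
		return true
	}
	return false
}

func main() {
	for _, s := range []ImageSelectionStrategy{"SemVer", "semver", "Newest"} {
		fmt.Printf("%q valid: %t\n", s, s.Valid())
	}
}
```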
