Issues with "software version consistency" feature #2394
Comments
same |
FWIW today I've encountered a production incident after updating to
This was a surprising error to see, given that the only change on our side we can attribute it to is the agent version upgrade 🤷, and it feels similar enough to be worth a mention given the digest in the error message. This seemed to be isolated to a small fraction of our cluster instances (all running 1.83.0), and tasks from the same task revisions that yielded the error eventually phased in without intervention. I've also happened to notice that aws/amazon-ecs-agent#4181 intends to help augment these kinds of errors with some more useful context and made it into agent release EDIT: didn't touch the |
This has also caused production issues for my org. We use the since 1.83.0 |
I still see this error on ecs-agent 1.84.0. |
We have production issues with the change too, when the tag is re-used for a new image layer and the old image is deleted. |
I'm also seeing the issue where a newly pushed and tagged "latest" image is being ignored and the agent will only use the older untagged instance. This needs to be fixed ASAP or at least give us a workaround. I'm seeing this behavior on agent 1.83.0. This was not happening on 1.82.1. |
We are also seeing this issue in our environment. |
FWIW, this also impacts the ECS APIs, specifically https://www.reddit.com/r/aws/comments/1dtgc4b/mismatching_image_uris_tag_vs_sha256_in_listtasks/ It's unclear whether the source of truth (and the root cause) is the agent or the APIs themselves, but I thought it was worth noting. |
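One way to observe the mismatch described in that Reddit thread is to compare the image fields that DescribeTasks returns for a running task; a minimal boto3 sketch, with placeholder cluster and task identifiers:

```python
import boto3

ecs = boto3.client("ecs")

# Compare what the ECS API reports for a running task's containers:
# `image` carries the reference from the task definition (tag or digest),
# while `imageDigest` carries the digest the agent reported after the pull.
resp = ecs.describe_tasks(
    cluster="my-cluster",  # placeholder cluster name and task ARN
    tasks=["arn:aws:ecs:us-east-1:123456789012:task/my-cluster/0123456789abcdef"],
)
for task in resp["tasks"]:
    for container in task["containers"]:
        print(container["name"], container.get("image"), container.get("imageDigest"))
```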
Found this issue after an internal investigation of an incident that seems likely related to this. If it helps anyone else, here's my analysis of how this impacted a service that was referencing an ECR image by a persistent image tag that we were regularly rebuilding and overwriting, with automation in place for deleting the older untagged images. I have an open support case with AWS to confirm this behaviour and have included a link to this GitHub issue.
```mermaid
sequenceDiagram
participant jenkins as Jenkins
participant cloudformation as Cloudformation
participant ecs-service as ECS Service
participant ec2-instances as EC2 Instances
participant ecr-registry as ECR Registry
participant docker-base-images as Docker Base Images<br />firelens sidecar image
participant ecr-lifecycle-policy as ECR Lifecycle Policy
jenkins ->> cloudformation: regular deployment
cloudformation ->> ecs-service: creates a new "deployment" for the service
activate ecs-service
note right of ecs-service: ECS resolves the image hash<br />at time of "deployment" creation
ecs-service ->> ec2-instances: starts tasks with resolved image hashes
ec2-instances ->> ecr-registry: pulls latest image from ECR
docker-base-images ->> ecr-registry: rebuild and push image regularly
ecr-lifecycle-policy ->> ecr-registry: deletes older images periodically
note right of ecs-service: periodically, new tasks need to start
ecs-service ->> ec2-instances: starts tasks with previously resolved image hashes
ec2-instances ->> ecr-registry: attempts to run the same image hash from earlier<br />if the image already exists on the instance, it's fine<br />otherwise, it needs to pull from ECR again and may fail
ec2-instances ->> ecs-service: tasks fail to launch due to missing image
note right of ecs-service: at this point, the service is unstable<br />might have existing running tasks<br /> but it can't launch new ones
create actor incident as Incident responders
ecs-service ->> incident: begin investigation
note left of incident: "didn't this happen the other day<br />for another service?" *checks slack*
note left of incident: Yeah, it did happen, and the outcome<br />was that we disabled the ECR lifecycle<br />policy, but services were left with<br />the potential to fail when tasks cycle
incident ->> jenkins: trigger replay of latest production deployment early and hope that fixes the issue
jenkins ->> cloudformation: deploy
cloudformation ->> incident: "there are no changes in the template"
incident ->> jenkins: disable the sidecar to get the service up and running again quickly and buy more time for investigation
jenkins ->> cloudformation: deploy with sidecar disabled
deactivate ecs-service
cloudformation ->> ecs-service: create new deployment without sidecar
activate ecs-service
note right of ecs-service: no longer cares about firelens sidecar image
ecs-service ->> ec2-instances: starts new tasks
ec2-instances ->> ecs-service: success
ecs-service ->> incident: service is up and running again, everyone is happy
note left of incident: "but we're not done yet"
incident ->> jenkins: re-enable the sidecar
jenkins ->> cloudformation: deploy with sidecar enabled
deactivate ecs-service
cloudformation ->> ecs-service: create new deployment with sidecar
activate ecs-service
note right of ecs-service: ECS resolves the image hash<br />at time of "deployment" creation
ecs-service ->> ec2-instances: start new tasks
ec2-instances ->> ecr-registry: pulls new images with updated hash
ec2-instances ->> ecs-service: success
ecs-service ->> incident: service is stable again
note left of incident: This service looks good again now<br />but other services might still have a problem
deactivate ecs-service
incident ->> ecs-service: work through "Force New Deployment" for all services in all ecs clusters & accounts
note left of incident: all services are now expected to be<br />stable, as everything should be<br />referencing the latest firelens image<br />hash, and the lifecycle policy<br />to delete older ones is disabled
```
|
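For context, the ECR lifecycle-policy automation shown in the diagram is typically a rule along these lines; the repository name and retention window below are illustrative assumptions, not details from the incident:

```python
import json

import boto3

ecr = boto3.client("ecr")

# A common cleanup rule: expire untagged images a week after they are pushed.
# Combined with a mutable tag and ECS's new digest pinning, this is exactly the
# interaction that can leave a service revision pointing at a deleted image.
lifecycle_policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Expire untagged images after 7 days",
            "selection": {
                "tagStatus": "untagged",
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": 7,
            },
            "action": {"type": "expire"},
        }
    ]
}

ecr.put_lifecycle_policy(
    repositoryName="firelens-sidecar",  # hypothetical repository name
    lifecyclePolicyText=json.dumps(lifecycle_policy),
)
```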
This issue most probably comes from aws/amazon-ecs-agent#4177 merged in
|
Downgrading to 1.82.4 in our case does not make the issue go away, indicating that, even if it was related to the agent, the digest information is now somehow cached by ECS. We are currently using a DAEMON ECS service. According to a recent case opened with AWS support, "ECS now tracks the digest of each image for every service deployment of an ECS service revision. This allows ECS to ensure that for every task used in the service, either in the initial deployment, or later as part of a scale-up operation, the exact same set of container images are used." They added that this is part of a rollout that started in the last few days of June and is supposed to complete by Monday. Their suggested solution is to update the ECS service with "Force new deployment" to "invalidate" the cache. If you have AWS support, try to open a case including this information to see how they evaluate your issue. |
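For anyone reaching for that workaround, it boils down to a single API call (the CLI equivalent is `aws ecs update-service --force-new-deployment`); a minimal boto3 sketch with placeholder names:

```python
import boto3

ecs = boto3.client("ecs")

# Forcing a new deployment creates a fresh deployment (and TaskSet), which
# re-resolves any mutable tags to the digests currently in the registry.
ecs.update_service(
    cluster="my-cluster",    # placeholder cluster and service names
    service="my-service",
    forceNewDeployment=True,
)
```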
I got a similar response to @sjmisterm in my support case, confirming the new behaviour is expected and stating that we should no longer delete images from ECR until we're certain they are no longer in use by any deployment. This change effectively means that ECR lifecycle policies which delete untagged images can be expected to cause outages unless, every time an image is deleted, additional steps are taken immediately to redeploy every deployment that was referencing a mutable tag. This is particularly problematic for my specific use-case, where we were referencing a mutable tag for a sidecar container that we include in many services. I've asked whether there are any future roadmap plans to make this use-case easier to manage, and requested a comment from AWS on this github issue 😄 |
AWS has confirmed this is definitely caused by them, and they consider it a good feature, as the link (made available yesterday) shows: https://aws.amazon.com/about-aws/whats-new/2024/07/amazon-ecs-software-version-consistency-containerized-applications/ There's no way to turn off this new behaviour, which completely breaks the easiest workflow for blue-green deployments - I'm sure plenty of people have other cases that need or benefit from the old one. I suggest that everyone who has AWS support file a case and request an API to turn this off by service / cluster / account. |
Hello. I am from the AWS ECS Agent team. As shared by @sjmisterm above, the behavior change that customers are seeing is because of the recently released Software Version Consistency feature. The feature guarantees that the same images are used for a service deployment by recording the image manifest digests reported by the first launched task and then overriding tags with digests for all subsequent tasks of the service deployment. Currently there is no way to turn off this feature. ECS Agent v1.83.0 included a change to expedite the reporting of image manifest digests, but older Agent versions also report digests, and the ECS backend will override tags with digests in both cases. We are actively working on solutions to fix the regressions our customers are facing due to this feature. |
One of the patches we are considering is - instead of overriding |
@amogh09 , I can't see how this would address the blue-green scenario. Could you explain it, please? |
@sjmisterm Can you please share more details on how this change is breaking blue-green deployments for you? |
@amogh09 , sure. Our blue-green deployments work by deploying a new image to the ECR repo tagged with latest and then launching a new EC2 instance (from the ECS-optimized image, properly configured for the cluster) while we make sure the new version works as expected in production. Then we start to progressively drain the old tasks until only new tasks remain. |
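For readers unfamiliar with that pattern, the "progressively drain the old tasks" step is typically driven by marking the old container instances DRAINING; a rough boto3 sketch, with a hypothetical cluster name and the instance-selection logic left out:

```python
import boto3

ecs = boto3.client("ecs")
cluster = "my-cluster"  # hypothetical cluster name

# Identify the container instances running the old version (the filtering logic
# is omitted here) and mark them DRAINING so ECS stops placing new tasks on them
# and replaces their existing service tasks elsewhere.
old_instances = ecs.list_container_instances(cluster=cluster)["containerInstanceArns"]

# UpdateContainerInstancesState accepts at most 10 instances per call.
for i in range(0, len(old_instances), 10):
    ecs.update_container_instances_state(
        cluster=cluster,
        containerInstances=old_instances[i : i + 10],
        status="DRAINING",
    )
```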
@amogh09 in summary: the software version "inconsistency" is what makes blue green a breeze with ECS. Should we want consistency, we'd use a digest or a version tag. |
@sjmisterm The deployment unit for an ECS service is a TaskSet. The software version consistency feature guarantees image consistency at the TaskSet level. In your case, how do you get a new task placed on the new EC2 instance? The new task needs to be part of a new TaskSet to get the newer image version; if it belongs to the existing TaskSet, it will use the same image version as its TaskSet. ECS supports blue-green deployments natively at the service level if the service is behind an Application Load Balancer. You can also use the External deployment type for even greater control over the deployment process. The Software Version Consistency feature is compatible with both of these. |
@amogh09 I use a network load balancer and the LDAP container instances I'm running will not respond well to this new model. If I can't maintain the ability to pull the tagged latest image, I will have to stop using ECS and manage my own EC2s, which frankly would be painful. Looking at the ECS API, what would happen if I called DeregisterTaskDefinition and then RegisterTaskDefinition? Would that have the effect of forcing ECS to resolve the digest from the new latest image without killing the running tasks? |
@amogh09 , I think we're talking about different things. Until the ECS change, launching a new ECS instance properly configured for an ECS daemon service whose taskdef is tagged with :latest would launch the new task with, well, the image tagged latest. Now it launches it using the digest resolved by the first task unless you force a new deployment of your service. Our deployment scripts pre-date CodeDeploy and the other features, so all your suggestions require rewriting deployment code because of a feature we can't simply opt out of. |
I understand the frustrations you all are sharing regarding this change. I request you to contact AWS Support for your issues. Our support team will be able to assist you with workarounds relevant to your specific setups. |
@amogh09 , a simple API flag at the service / cluster / region / account level would solve the problem. That's what we're trying to get across, because this disturbs your customer base - not everyone pays for support, and the old behaviour, as you can see, is relied on by many of them. |
I'll chime in that we were negatively impacted by this change as well, and I don't think it helps anything for most scenarios. Before, customers effectively had a choice: they could either enforce software version consistency by using immutable tags (https://docs.aws.amazon.com/AmazonECR/latest/userguide/image-tag-mutability.html), or if they wanted to allow for a rolling release (most useful for daemon services as @sjmisterm alluded to) they could achieve that as well by using a mutable tag. Now, this option is gone with nothing to replace it, and very poor notification that it was going to happen to boot. |
I'm very disappointed with AWS on two counts:
|
I know that the circumstances around how we all got notified about this change aren't ideal, but is there anywhere we can be proactive and follow along for similar updates that may affect us in the future? Did folks get a mention from their AWS technical account managers or similar? I lurk around the containers roadmap fairly often, but don't see an issue/mention there, or in any other publicly-facing AWS github project, about this feature release. |
@scott-vh the problem is that this is an internal API change; the ECS backend behaves differently now. This has nothing to do with ecs-agent itself; regardless of version you will get the same behaviour. No one could see it coming |
@dg-nvm Yep I got that 👍 I was just curious if there was any breadcrumb anywhere else for which we could've seen this coming (sounds like no, but wanted to see if anyone who interfaces with TAMs or ECS engineers through higher tiers of support got some notice) |
@scott-vh our TAM was informed about the problem, but I don't know if there was any proposal. Given that I see ideas for workarounds accumulating, I would say no :D Luckily our CD was not impacted by this. I can think of scenarios where daemon deployments are easier using mutable tags, especially since ECS does not play nicely when replacing daemons. Sometimes they get stuck because they were removed from the host and something else was put in their place in the meantime :) |
Hi all, I have transferred this issue into the containers-roadmap repo. As far as I understand it, people are experiencing issues with this feature as a whole, rather than an issue with the ECS agent behavior specifically. For reference, see what's new post: https://aws.amazon.com/about-aws/whats-new/2024/07/amazon-ecs-software-version-consistency-containerized-applications/ Please feel free to continue adding your +1 and providing feedback :) |
This issue is affecting us as well. We utilize an initialization container that runs before the app container. This init container sets up monitoring integrations and settings that are not critical to the app itself, but with a limited team we rely on mutable tags to handle the "rolling" update as tasks are restarted. Forcing an application deployment of every application my team manages for these small config updates would be an impossible task. Is there any way at all to prevent this "consistency" feature for a single container, or to disable it entirely at the task level? It seems like this problem was already solved by tag immutability, which gave us the option to use mutable tags when we actually needed that behavior. |
This regression caused a minor production outage for us because AWS' monitoring tools like X-Ray recommend using mutable tags, which means that if any of those has a release outside of your deployment cycle, you are now set up to have all future tasks fail to deploy because you followed the AWS documentation:
I think this feature was a mistake and should be reverted – there are better ways to accomplish that goal which do not break previously-stable systems, and immutable tags are not suitable for every situation, as evidenced by the way the AWS teams above are using them – but if the goal is to get rid of mutable tags, it should follow a responsible deprecation cycle with customer notification, warnings when creating new task definitions, a period where new clusters cannot be created with the option to use mutable tags in tasks, etc., since this is a disruptive change which breaks systems that have been stable for years and there is no justification for breaking compatibility so rapidly. |
We are also having an issue with this. Our development environment is set up with all the services on a certain tag, which keeps us from having to redeploy: developers can simply stop the service and it comes back up with the most current image for that tag. Now they have to update the service, which is more steps than needed. This also seems to be a problem with our Lambdas that spin up Fargate tasks; those tasks are no longer the most current version of the tag. Updating the service is not an option for these, so we are still trying to work that out. |
The strangest thing is that this capability was already available for those who wanted it: you can specify the container image by digest, which pins the image explicitly, with no changes required to ECS. floating potentially inconsistent -> |
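The contrast being drawn there, with placeholder registry, repository, and digest values:

```python
# Floating reference: a mutable tag that may point to different images over time.
IMAGE_BY_TAG = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest"

# Pinned reference: a digest identifies exactly one image manifest, so the task
# definition itself guarantees consistency with no ECS-side pinning required.
# ("ab" * 32 is just a placeholder 64-character hex digest.)
IMAGE_BY_DIGEST = (
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app@sha256:" + "ab" * 32
)
```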
Also had an issue with one of our sites which I believe is related to this: a container pulling from an ECR repository with a lifecycle policy, an EC2 instance restart - and ECS wants to pull the non-existent old image, as there hasn't been a fresh deploy of the container for weeks. Version consistency is a fantastic feature, but there are situations where I want the tag to be used rather than the image digest resolved at the last deploy. |
Sorry for the late response on this thread - we're aware of the impact this change has had and apologize for the churn this rollout has created. We've been actively working through the set of issues that have been highlighted on this thread and have 2 updates to share: 1/ For customers who've been impacted by the lack of ability to see image tag information, we're working on a change that will bring back image tag information in the describe-tasks response, in the same format as was available prior to the release of version consistency (i.e
I can concur that this "software version consistency" change to ECS renders the concept of services totally useless for us. We may have to fall back to manually deployed tasks (without services), but then we'll lose the watchdog aspects, which we would have to re-implement ourselves. In short, we need to guarantee a few properties for our services running background jobs:
These, combined with the new constraint that all of the tasks within a service need to have the same image digest, mean that we cannot roll out any update to our containers without breaking at least one property. Tbh, this feels like we may want to switch to a plain k8s solution where we can set up and manage our workloads with some degree of flexibility. Hopefully an opt-out will be available soon, as mentioned above, but we are stuck with our deployments atm and need a solution asap. |
The forced addition of this feature also caused a significant production incident for us. We deliberately used mutable tags as part of our deployments, and an ECR lifecycle policy to remove the old untagged images after a period. This should absolutely have been an opt-in feature, or opt-out but disabled for existing services. I'm glad to see that's now been identified and raised, but should this feature not be reverted until that option is available, to prevent everyone affected from having to redesign workflows or implement workarounds? As others have pointed out, those who want consistency by container digest can already achieve it through either tag immutability or referring to the digest explicitly in the task definition. |
A quick update on @vibhav-ag's post. We have now completed the first action in his comment. Amazon ECS no longer modifies the |
This almost knocked down our production environment; it did knock down stage, because we had been treating our ECR images as Turns out to be not hard for us to switch to only But wow, this breaking change hit us out of left field, and it should probably have been listed as We got hit during a |
This has impacted us as well. We use the equivalent of a mutable 'latest' tag and perform rolling service upgrades when we move the 'latest' tag. This lets us slowly do blue/green deployments (as our service can be told to recycle itself over time). Instead, we weren't actually progressing our blue/green deployment, as AWS kept deploying the old revision of the service rather than the one pointed to by our mutable tag. Even replacing the EC2 instance didn't fix it; only re-running the service deployment did the trick. This is a massive behavior change and should never have been released without opting into the change. |
Fargate normally does that health-based deployment, but that won't help you if the old containers can't continue running due to a failure in the container or host. That's one of the reasons this mistake was so dangerous: unless you monitor the ECS service events, you have a service which works normally until a previously-recoverable error occurs, and then you learn that the ECS team broke your deployment in July when something is completely down. What I ended up with is an EventBridge rule which listens to the ECR Image Action event for our source repositories and a Lambda listener which creates a new ECS service deployment, ensuring there's never a situation where our ECR tags are updated but ECS is still looking for the old version (we use environment-tracking branches & tags, so the latest version is something like “testing” or “staging”). That isn't enough to avoid problems with Amazon's own containers, however, so our deployment pipeline now does a lookup for CloudWatch and X-Ray to get the current versioned tag which |
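A minimal sketch of the Lambda side of that setup, assuming an EventBridge rule on ECR "ECR Image Action" events and a hypothetical mapping from repositories to the services that consume their mutable tags:

```python
import boto3

ecs = boto3.client("ecs")

# Hypothetical mapping from ECR repository name to the ECS services that
# consume its mutable (environment-tracking) tag.
REPO_TO_SERVICES = {
    "my-app": [("my-cluster", "my-app-service")],
}

def handler(event, context):
    """Handle an EventBridge 'ECR Image Action' event.

    On a successful push, force a new deployment so the service revision
    re-resolves the tag to the freshly pushed digest rather than the stale
    one it recorded at the previous deployment.
    """
    detail = event.get("detail", {})
    if detail.get("action-type") != "PUSH" or detail.get("result") != "SUCCESS":
        return
    for cluster, service in REPO_TO_SERVICES.get(detail.get("repository-name"), []):
        ecs.update_service(cluster=cluster, service=service, forceNewDeployment=True)
```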
Just to add another use-case... I have an app with very long-running background processes. This app is not deployed with Instead, all instances are marked DRAINING so that new instances are created with the updated container image. Because the service revision is never updated with the new sha, the new instances pull down an old container image. Oddly, there's no way to update the service revision with the new sha without triggering an actual deploy. This is the missing piece for me. I need a way to update the sha stored in the service revision without triggering a deployment. Something like
|
If all containers in the task are opted out, will this remove the latency impact associated with this feature as well? I have a service that's updated with great frequency and whose deployments are latency sensitive. Two of the three containers in this service are already deployed with digests, but the third uses a tag because its image is built/published by CDK, and it's not possible to get access to the digest of such images to use in task definitions. So I believe we are stuck with the latency impact of this feature for mostly no reason: the image tags produced by CDK already approximate the behavior of digests, in that the image should not change if the tag is not changing. I'm specifically referring to this line from the documentation:
|
Update 2: you now have the ability to disable consistency for specific containers in your task by configuring the new versionConsistency field for each container in the task definition. Any changes to this property are applied after a deployment. Once again, we regret the churn this change has caused you all. |
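A rough sketch of what that per-container opt-out might look like when registering a task definition with boto3; the family, container names, and images are placeholders, and the lowercase "disabled" value is an assumption about the field's accepted values, so check the current RegisterTaskDefinition docs:

```python
import boto3

ecs = boto3.client("ecs")

# Hypothetical task definition: the app container keeps the default behavior,
# while the sidecar opts out of digest pinning so its mutable tag is
# re-resolved whenever a new task starts.
ecs.register_task_definition(
    family="my-app",
    requiresCompatibilities=["EC2"],
    containerDefinitions=[
        {
            "name": "app",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.2.3",
            "memory": 512,
            "essential": True,
        },
        {
            "name": "firelens-sidecar",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/firelens:latest",
            "memory": 128,
            "essential": False,
            "versionConsistency": "disabled",
        },
    ],
)
```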
Yes, if you opt out every container, you will see no impact to deployment latency because of digest resolution. |
EDIT: this is related to the "software version consistency" feature launch, see What's New post: https://aws.amazon.com/about-aws/whats-new/2024/07/amazon-ecs-software-version-consistency-containerized-applications/
Summary
Since our EC2 instances upgraded to ecs-agent v1.83.0, the images used for containers are referenced by SHA digest and not by image tag.
Description
We started getting a different image value for the '{{.Config.Image}}' property when using docker inspect on our ECS EC2 instances.
We are getting the SHA digest as the .Config.Image instead of the image tag.
The task definition contains the correct image tag (not the digest).
We need the image tag, since we rely on that custom tag to understand what was deployed. What can be done?
Expected Behavior
We expect to see the image tag used for the container.
Observed Behavior
We see the image digest used for the container.
Environment Details
{"Cluster":"xxxxr","ContainerInstanceArn":"xxx","Version":"Amazon ECS Agent - v1.83.0 (*xxx"}