ECS Milestone 2 Feedback #5
Thanks for that feedback @PeteGoo. You have raised a number of good points. The process you have outlined sounds similar to what we are proposing. I’ll add comments to each point to highlight where we differ. I’d love to know if the points of difference would prevent you from using the new step if you (or others that have already implemented a workflow like yours) wanted to migrate:
Although it wasn’t called out in the RFC, we do foresee that anyone using tools like Terraform would use ignore_changes to gracefully handle external modification of the task definition and service. I expect we’ll have to document this more explicitly though.
This is one point where the proposed step differs. Because we don’t know how the task definition was originally created (Terraform, manually through the AWS console, etc.), the proposed step will copy the latest task definition and create a new revision with the image versions updated via SDK calls.
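To illustrate that copy-and-register approach, here is a minimal sketch using the AWS SDK for JavaScript v3. The family, container, and image names are placeholders, and this is only an approximation of the idea rather than the step's actual implementation:

```typescript
import {
  ECSClient,
  DescribeTaskDefinitionCommand,
  RegisterTaskDefinitionCommand,
} from "@aws-sdk/client-ecs";

const ecs = new ECSClient({ region: "us-east-1" });

// Fetch the latest revision of the existing task definition family.
const { taskDefinition } = await ecs.send(
  new DescribeTaskDefinitionCommand({ taskDefinition: "my-app" })
);

// Copy the container definitions, replacing only the image for the container being deployed.
const containerDefinitions = (taskDefinition?.containerDefinitions ?? []).map((container) =>
  container.name === "web"
    ? { ...container, image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.2.3" }
    : container
);

// Register the copy as a new revision. Read-only fields returned by DescribeTaskDefinition
// (ARN, revision, status, etc.) are deliberately not passed back in.
const { taskDefinition: newRevision } = await ecs.send(
  new RegisterTaskDefinitionCommand({
    family: taskDefinition!.family!,
    containerDefinitions,
    cpu: taskDefinition?.cpu,
    memory: taskDefinition?.memory,
    networkMode: taskDefinition?.networkMode,
    requiresCompatibilities: taskDefinition?.requiresCompatibilities,
    executionRoleArn: taskDefinition?.executionRoleArn,
    taskRoleArn: taskDefinition?.taskRoleArn,
    volumes: taskDefinition?.volumes,
  })
);

console.log(`Registered ${newRevision?.family}:${newRevision?.revision}`);
```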
This will be the same for the proposed step.
We’ll differ here by copying the latest task revision and updating the image with an SDK call.
This is a good point. Existing AWS steps in Octopus allow roles to be assumed, and we’ll need to expose this functionality in the new ECS targets.
The proposed step will also create a new revision of the task definition, but via SDK calls.
Like above, the proposed step will do this via SDK calls.
This is a good point. We haven’t explicitly provided any options to wait for the deployment to complete. Apart from assuming roles and waiting for the deployment to complete, it sounds like the proposed step will differ from the process you outlined by using SDK calls to copy a task revision, update it with new image versions, and then update the service, instead of using Terraform. If the original Terraform template was set to ignore changes to task definition image versions and service task definition revisions, would the proposed step allow you to create your infrastructure with Terraform and update it with Octopus?
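To make the "update the service and wait" part concrete, a similarly hedged sketch with the AWS SDK for JavaScript v3 might look like the following (placeholder names again, not the step's actual code):

```typescript
import { ECSClient, UpdateServiceCommand, waitUntilServicesStable } from "@aws-sdk/client-ecs";

const ecs = new ECSClient({ region: "us-east-1" });

// Point the service at the newly registered task definition revision.
await ecs.send(
  new UpdateServiceCommand({
    cluster: "my-cluster",
    service: "my-service",
    taskDefinition: "my-app:42", // family:revision returned by RegisterTaskDefinition
  })
);

// Optionally block until the rolling deployment has settled (or time out).
await waitUntilServicesStable(
  { client: ecs, maxWaitTime: 600 }, // seconds
  { cluster: "my-cluster", services: ["my-service"] }
);
```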
We have put some thought into how these advanced deployments might work (although that functionality has been called out as out of scope for milestone 2). Our current thinking is to use an external deployment controller and task sets (we have a blog post describing this process at https://octopus.com/blog/ecs-canary-deployments). I’d love to know the details of how you implemented rolling deployments, if you’d be willing to share them?
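For anyone unfamiliar with that model, a minimal sketch of the task-set flow with the AWS SDK for JavaScript v3 is shown below. Names are placeholders, the service must use the EXTERNAL deployment controller, and the blog post linked above describes the full process:

```typescript
import {
  ECSClient,
  CreateTaskSetCommand,
  UpdateServicePrimaryTaskSetCommand,
} from "@aws-sdk/client-ecs";

const ecs = new ECSClient({});

// Start a canary: run the new task definition as a task set at 10% of the service's desired count.
// (A real Fargate service would also need launchType and networkConfiguration here.)
const { taskSet } = await ecs.send(
  new CreateTaskSetCommand({
    cluster: "my-cluster",
    service: "my-service",
    taskDefinition: "my-app:42",
    scale: { unit: "PERCENT", value: 10 },
  })
);

// Once the canary looks healthy, promote it to be the primary task set.
await ecs.send(
  new UpdateServicePrimaryTaskSetCommand({
    cluster: "my-cluster",
    service: "my-service",
    primaryTaskSet: taskSet!.taskSetArn!,
  })
);
```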
This is a good point. Updating image versions was an obvious choice to automate as part of a CI/CD pipeline, but it sounds like we should consider updating environment variables as well.
To be honest, swapping load balancers wasn’t a use case we had considered. Are you able to share a situation where a new load balancer was selected during a deployment? EDIT: I realized this might have been a miscommunication. The ability to define load balancers was going to be added to the step being delivered as part of milestone 1 (where we take full ownership of the CloudFormation template). We won't be updating load balancers with the new steps that target updating the task definition and service. For us, the points to consider are:
Hi @pete-may-bam,
We expect the process will be:
Does that help explain how the deployment process would work?
Yes it does. It sounds great!
Reading one of the other posts did make me think of something: getting environment variables into the containers. That would have to be built into the step somehow. I'm picturing a similar mechanism to the one used for the image version. We currently solve that by iterating over a set of variables that follow a naming convention. We have a script step that gathers those environment variable key/value pairs into an environment block, which is injected into the container definition via an output variable and variable substitution into the Terraform script.
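As a rough illustration of that gathering step: the actual naming convention wasn't shown above, so the `EnvironmentVariable:` prefix below is purely hypothetical, and the real implementation is a script step feeding a Terraform substitution rather than this TypeScript.

```typescript
// Hypothetical deployment variables whose names share an agreed prefix.
const variables: Record<string, string> = {
  "EnvironmentVariable:ASPNETCORE_ENVIRONMENT": "Production",
  "EnvironmentVariable:ConnectionStrings__Main": "Server=...;Database=...",
  "SomethingUnrelated": "ignored",
};

const prefix = "EnvironmentVariable:";

// Build the `environment` block of an ECS container definition from the prefixed variables.
const environment = Object.entries(variables)
  .filter(([name]) => name.startsWith(prefix))
  .map(([name, value]) => ({ name: name.slice(prefix.length), value }));

// The resulting JSON can then be injected into the container definition, e.g. via an
// output variable substituted into the task definition template.
console.log(JSON.stringify(environment, null, 2));
```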
I agree, we overlooked the fact that both an image version and environment variables are likely to be updated with any deployment. The first steps being delivered as part of milestone 1 will support setting environment variables, and we'll likely also include the ability to define env vars with the step developed for milestone 2.
Not quite. From what I can tell, it doesn't allow me to make non-image related changes to the task definition with Terraform without things getting out of sync. Terraform won't be aware of the new revisions Octopus is creating, so if I update CPU resources in the Task Definition within Terraform, for example, it could revert me back to a previous image as the new revisions Octopus has made aren't accounted for by Terraform. Currently, to solve for this, our Terraform configuration maintains a "template" task definition that is never directly used by the ECS Service. Our Octopus deploy script step then copies the latest version of the template task definition as a new revision on the actual task definition that is used by the service. This still isn't ideal, as we don't get the changes made in Terraform until Octopus next deploys the app, but it prevents Octopus and Terraform from stepping on each other's toes. Most of this is just a result of the strange design of ECS task definitions, and it is hard to work around.
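As a sketch of that "template" pattern (hypothetical family names, not the actual script), the deploy step reads the latest revision of the Terraform-managed template family and registers it under the family the service actually uses, swapping in the new image along the way:

```typescript
import {
  ECSClient,
  DescribeTaskDefinitionCommand,
  RegisterTaskDefinitionCommand,
} from "@aws-sdk/client-ecs";

const ecs = new ECSClient({});

// The "template" family is fully managed by Terraform and never used by the ECS service.
const { taskDefinition: template } = await ecs.send(
  new DescribeTaskDefinitionCommand({ taskDefinition: "my-app-template" })
);

// Register a copy under the live family the service points at, with the image for this release.
await ecs.send(
  new RegisterTaskDefinitionCommand({
    family: "my-app", // the family actually referenced by the ECS service
    containerDefinitions: (template?.containerDefinitions ?? []).map((c) => ({
      ...c,
      image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.2.3",
    })),
    cpu: template?.cpu,
    memory: template?.memory,
    networkMode: template?.networkMode,
    requiresCompatibilities: template?.requiresCompatibilities,
    executionRoleArn: template?.executionRoleArn,
    taskRoleArn: template?.taskRoleArn,
    volumes: template?.volumes,
  })
);
```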
That is a good point. Digging into the Terraform provider, it looks like this is a known limitation. It has been discussed in a long-running thread (which I see you have contributed to). One of the more recent posts implements a similar solution to yours with a "template" task definition. Overall though, the consensus appears to be that the ECS provider doesn't quite expose the required flexibility. As a middle ground (and this is me "thinking out loud", as I'm sure you have considered this), one option is an `aws_ecs_task_definition` resource whose `lifecycle` block sets `ignore_changes = [container_definitions]`, so Terraform creates the initial container settings and then ignores subsequent container updates. This allows Terraform to be in control of creating the initial task definition, and of updating any settings outside of `container_definitions`.
Of all the solutions proposed in that thread (the approach above, modifying state directly, using external data, and using a template), the template approach seems to be the most reliable. This is likely easy to incorporate as an option into milestone 2. We could display an option to create a new task definition revision based on the latest one, or create a new task definition revision based on a second task definition. The logic behind the scenes is going to be much the same. @egorpavlikhin Do you have any thoughts on adding this feature to milestone 2?
This has been discussed in the team recently, and we proposed adding the following options to the ECS deployment:
I imagine a rollback scenario would use option 3, where the step would fail after a certain time. We wouldn't be prescriptive about the solution in this scenario, but some examples could be a subsequent step in Octopus executed on failure to revert the service to the previous task definition, or an alert sent to Slack or email for a manual resolution. If the step could time out and Octopus could run subsequent steps to remediate the deployment, would that allow you to implement the kind of rollback you had in mind?
Unfortunately the problem is at a higher level than just the container_definitions attribute. Because the container_definitions are ultimately just a subset of the task definition revision, and because task definition revisions are immutable, an ignore_changes rule on container_definitions alone doesn't solve the problem.
I think this would suit our needs. As far as I can see, this template-copying approach is the only way to fully manage the task definition with Terraform but also update the image (and ideally env variables) on each deployment. If the proposed step supports this, I believe we would be able to adopt it.
Yeah, this would be an improvement over our current solution's behavior. Thanks!
Thanks all for the replies so far. Here is a summary:
Sorry, I got distracted. Here are the follow-ups to my original post about our solution.
Pretty close. I think the env vars are important, as is being able to add new sidecars etc.
Rolling the ECS cluster was relatively straightforward. We created a PowerShell script to do the following:
We have a minimum healthy percentage of 100% and a maximum percentage of e.g. 200% for replica services. Along with waiting for ECS deployments to fully complete, you can then set an expectation that your containers verify as much as possible on start-up, e.g. config. That way you fail early and deployments are reliable.
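The PowerShell itself wasn't shared here, but for readers wanting a starting point, one common way to roll EC2-backed container instances is to mark the old instances as DRAINING so ECS reschedules their tasks onto the new ones (the 100%/200% healthy percentages above keep capacity up while that happens), then wait for services to stabilise before terminating the old instances. A rough sketch of that idea, with placeholder names and not @PeteGoo's actual script:

```typescript
import {
  ECSClient,
  ListContainerInstancesCommand,
  UpdateContainerInstancesStateCommand,
  waitUntilServicesStable,
} from "@aws-sdk/client-ecs";

const ecs = new ECSClient({});
const cluster = "my-cluster";

// Assumes replacement instances have already joined the cluster (e.g. after scaling the ASG);
// in a real script you would filter this list down to just the old instances.
const { containerInstanceArns = [] } = await ecs.send(
  new ListContainerInstancesCommand({ cluster })
);
const oldInstanceArns = containerInstanceArns;

// Draining tells ECS to move tasks off these instances without killing them abruptly.
// (UpdateContainerInstancesState accepts at most 10 instances per call, so batch if needed.)
await ecs.send(
  new UpdateContainerInstancesStateCommand({
    cluster,
    containerInstances: oldInstanceArns,
    status: "DRAINING",
  })
);

// Wait until the services have rescheduled their tasks before terminating the old instances.
await waitUntilServicesStable(
  { client: ecs, maxWaitTime: 900 },
  { cluster, services: ["my-service"] }
);
```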
I see a lot of responses regarding Terraform, but what about those of us using CDK? CDK has no ability to ignore changes to a task definition. In our case we're using the ApplicationLoadBalancedFargateService pattern, which is what most new AWS users will be recommended if your business involves an AWS Solutions Architect. Looking at my options, it seems like passing in the image version as an environment variable and doing a force deployment might be the best option for us, ignoring this milestone entirely.
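For context on the "force deployment" option mentioned there, the ECS API exposes this directly on UpdateService. A minimal sketch with placeholder names (only one of several ways to trigger it):

```typescript
import { ECSClient, UpdateServiceCommand } from "@aws-sdk/client-ecs";

const ecs = new ECSClient({});

// Force a new deployment of the service without changing the task definition,
// e.g. after pushing a new image to the tag the task definition already references.
await ecs.send(
  new UpdateServiceCommand({
    cluster: "my-cluster",
    service: "my-service",
    forceNewDeployment: true,
  })
);
```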
Indeed. The underlying CloudFormation template generated by the CDK will interfere when we make out-of-band changes to the underlying service (to point to the new task definition). We're currently investigating to see if there might be a way around this, but I'm not optimistic. My impression is that the template will need to be run as part of the deployment process by Octopus (and all executions of the template must be run this way). This would allow Octopus the opportunity to inspect the template and make changes as required. Alternatively, Octopus could pass the required parameters to the generator of the template. I'd love to hear more about your use of CDK.
The "Update Amazon ECS Service" step is now available in Octopus from version 2021.3.9061. It's available for download for self-hosted customers or on Octopus Cloud. The key aspects of this milestone delivered are:
We don't yet have support for using EC2 instance roles or assuming IAM roles when working with our ECS cluster target, but we are planning to add this support in the coming months.
Documentation for the step can be found here: https://octopus.com/docs/deployments/aws/ecs-update-service
If you'd like to chat about the step or how you use ECS, I have a Calendly page where you can book some time in my calendar: https://calendly.com/rhysparry
Alternatively, feel free to comment on this issue. I will close it out next week.
Just to provide an update if any CDK users stumble upon this thread: we ended up implementing our CDK deployments in Octo back in December, which went well. Our GitHub Actions pipeline builds/versions the image and pushes it to GitHub Packages, then we package up CDK and push it to Octopus. Once that's complete, an Octopus release is created with the relevant version number, and CDK is extracted into our worker image. Within CDK we added a contextual variable that allows the image version to be passed in via the command line:

```typescript
const containerVersion = this.node.tryGetContext('containerVersion') as string | undefined
if (!containerVersion) {
  throw Error("Container version must be declared to deploy this stack. Set it with --context containerVersion=x.x.x")
}
```

CDK kicks off and handles the complete task definition update + deployment of the image.
Thanks for the info @Hawxy. I'm glad to hear you found a solution. I've recently been learning about the CDK and used environment variables to achieve the desired result. My script step essentially looked like:

```bash
export TENTACLE_REPOSITORY=$(get_octopusvariable "Octopus.Action.Package[tentacle].PackageId")
export TENTACLE_VERSION=$(get_octopusvariable "Octopus.Action.Package[tentacle].PackageVersion")
cdk deploy --require-approval never --progress events --no-color 2>&1
```

I used a package reference to specify the package, making the version (tag) configurable when creating the release.
Please add a comment to this issue with your feedback regarding the proposed ECS integration described here.
This milestone builds on the work from milestone 1.
We have a few general areas that we are actively seeking feedback on, listed below. But any feedback is welcome.