Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MANIFEST_PULLED internal container and task states #4137

Merged
merged 3 commits into from
Apr 16, 2024

Conversation

amogh09
Copy link
Contributor

@amogh09 amogh09 commented Apr 8, 2024

Summary

This PR adds a new ContainerState named ContainerManifestPulled and a new TaskState called TaskManifestPulled. ContainerManifestPulled is added between ContainerStatusNone and ContainerPulled states and similarly TaskManifestPulled is added between TaskStatusNone and TaskPulled states.

A new state transition function pullContainerManifest is added to DockerTaskEngine for handling transition of a container to ContainerManifestPulled state. The transition function is a no-op currently but will be updated in a future change to pull the manifest of the container image. This is a part of a project to expedite the reporting of container image manifest digests to ECS backend.

Containers that want to reach steady state are allowed to transition to ContainerManifestPulled state under either of the following conditions -

  • If the container image pull depends on ASM (AWS Secrets Manager) auth for private registry then ASM auth resource has been created.
  • If the container image pull depends on task execution role then task execution role credentials have been resolved.
  • All other cases.

Container ordering dependencies are disregarded for transition to ContainerManifestPulled state. This is so that image manifest digests of all images of a task can be reported as soon as possible.

ContainerManifestPulled state maps to TaskManifestPulled state when computing the known status of a task.

None of the new states generate new state change events as of this PR, so container and tasks transitioning to these new states will not result in any new STSC calls.

Implementation details

  • Add ContainerManifestPulled ContainerStatus constant to containerstatus package and update string mappings of ContainerStatus so that ContainerManifestPulled is mapped to "MANIFEST_PULLED" string. Update tests accordingly.
  • Add TaskManifestPulled TaskStatus constant to taskstatus package and update string mappings of TaskStatus so that TaskManifestPulled is mapped to "MANIFEST_PULLED" string. Update tests accordingly.
  • Add a new pullContainerManifest method to DockerTaskEngine type. This method is the transition function for transitioning a container to ContainerManifestPulled state. The method is currently a no-op. Update DockerTaskEngine.containerStatusToTransitionFunction map with a new apicontainerstatus.ContainerManifestPulled: engine.pullContainerManifest entry accordingly.
  • Update managedTask.handleEventError method to handle errors generated in the new state transition function. The error handling logic is the same as that for ContainerPulled state transition errors. This is because we want to treat manifest pulls as a part of image pulls and hence we want to have the same error handling for both.
  • Update Task.initializeASMAuthResource method so that a container's transition to ContainerManifestPulled state depends on the creation of the task's ASM Auth resource that is used to resolve private registry credentials. This is because private registry credentials will be needed for manifest pulls.

Testing

Updated existing unit tests and added new unit tests.

Added a new integration test that checks -

  • Containers reach MANIFEST_PULLED state.
  • Transition to MANIFEST_PULLED state does not depend on container ordering.

Following manual tests were also performed.

Ran a simple task with a single container and observed the following in the logs.

  • The container transitioned to MANIFEST_PULLED state before transitioning to PULLED state as expected.
level=debug time=2024-04-10T16:54:57Z msg="Transitioned container" task="a73d8211344746749c4dd07d5b15ad26" container="sleep" runtimeID="" nextState="MANIFEST_PULLED" error=<nil>
  • Container's state change to MANIFEST_PULLED did not qualify for a state change event as expected.
level=debug time=2024-04-10T16:54:57Z msg="should not send events for internal tasks or containers: create container state change event api: status not recognized by ECS: MANIFEST_PULLED"
  • Task transitioned to MANIFEST_PULLED state after the container transitioned to MANIFEST_PULLED state as expected.
level=info time=2024-04-10T16:54:57Z msg="Container change also resulted in task change" task="a73d8211344746749c4dd07d5b15ad26" container="sleep" runtimeID="" desiredStatus="RUNNING" knownStatus="MANIFEST_PULLED"
  • Task's state transition to MANIFEST_PULLED state did not qualify for a state change event as expected.
level=debug time=2024-04-10T16:54:57Z msg="Skipping event emission for task" knownStatus="MANIFEST_PULLED" task="a73d8211344746749c4dd07d5b15ad26" error="status not recognized by ECS"
  • Task completed its lifecycle as usual. The stop code for the task was EssentialContainerExited and the container stopped with successful exit code 0.

Ran a task with two containers with one container depending on other container's completion. Both containers passed MANIFEST_PULLED state with the dependent container waiting in MANIFEST_PULLED state until the dependency container completed.

Ran a task with one container whose image is in a private Dockerhub registry. The container did not transition to MANIFEST_PULLED state until the task's ASM Auth resource for private registry credentials reached CREATED state which is expected.

New tests cover the changes: yes

Description for the changelog

Does this PR include breaking model changes? If so, Have you added transformation functions?

Enhancement: Add a new MANIFEST_PULLED container and task state. The transition function for this state is currently a no-op.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@amogh09 amogh09 force-pushed the manifest-pulled-state branch 2 times, most recently from 159d1dc to 4a5ca7c Compare April 9, 2024 19:16
@amogh09 amogh09 force-pushed the manifest-pulled-state branch from c742eb3 to 8f2d145 Compare April 9, 2024 21:14
@amogh09 amogh09 marked this pull request as ready for review April 9, 2024 23:10
@amogh09 amogh09 requested a review from a team as a code owner April 9, 2024 23:10
@amogh09 amogh09 changed the title [WIP] Manifest pulled state Add MANIFEST_PULLED internal container and task states Apr 9, 2024
@@ -3682,7 +3685,8 @@ func (task *Task) ToHostResources() map[string]*ecs.Resource {
func (task *Task) HasActiveContainers() bool {
for _, container := range task.Containers {
containerStatus := container.GetKnownStatus()
if containerStatus >= apicontainerstatus.ContainerPulled && containerStatus <= apicontainerstatus.ContainerResourcesProvisioned {
if containerStatus >= apicontainerstatus.ContainerManifestPulled &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this if-statement seems like it would make more sense to be containerStatus > ContainerStatusNone && containerStatus < ContainerStatusStopped

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah with current states both are equivalent but I get your point. It's a bit out of scope of this PR so I'd rather not change that logic here.

@amogh09 amogh09 merged commit 35e10fd into aws:dev Apr 16, 2024
40 checks passed
@harishxr harishxr mentioned this pull request Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants