
feat: more robust Reusable containers experience #2768

Draft: wants to merge 22 commits into main
Conversation

@mdelapenya (Member) commented Sep 4, 2024

What does this PR do?

This PR modifies the way the Reuse mode is implemented in the project, taking tc-java as the reference implementation. For that:

  • instead of passing Reuse plus the container name, we no longer pass a container name at all: only the Reuse field is needed. This is a breaking change in terms of semantics.
  • we calculate a hash of the container request just before the container is created.
  • we use github.com/mitchellh/hashstructure/v2 to hash the container request (see the sketch after this list), with the following exceptions:
    • functions in the struct are ignored (e.g. config modifiers). Please search for hash:"ignore" to identify them all.
    • if the Files field contains a reference to a directory on the host, it does not participate in the hash calculation, as we don't want to traverse the directory tree to compute its hash.
    • for the real files in the Files field, we read their bytes to compute a hash, which is added to the struct hash.
  • for containers marked for reuse, two labels are added:
    • org.testcontainers.hash: the hash of the container request.
    • org.testcontainers.copied_files.hash: the hash of the files copied to the container using the Files field in the container request.
  • those two labels are used to check whether a container already exists in the Docker daemon: the code performs a backoff search for up to 5 seconds (not configurable; maybe desired?), and if no container is found, a new one is created; otherwise, the existing container is reused. This backoff search is needed because we want to support the use case where multiple, different test processes start the same container, so they can "reuse" it.
  • the SessionID label is removed from the container request before the container is created, so that Ryuk is not told to terminate this container alongside the test session. Therefore, the container persists across test sessions.
  • the ReuseOrCreateContainer method on the provider struct has been deprecated and will be removed in an upcoming release.
  • the docs for Reuse have been updated to make the expectations around Reuse clearer.
  • the tests have been updated too, to reflect the new behaviour of Reuse.
  • modules receive a functional option for marking module containers as reusable.
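
For illustration, here is a minimal sketch of how the request hash and the label fit together. It uses a simplified stand-in for the real ContainerRequest; the label key is the one listed above, everything else (type and field names) is illustrative:

package main

import (
	"fmt"

	"github.com/mitchellh/hashstructure/v2"
)

// simplifiedRequest stands in for the real ContainerRequest. Function
// fields, such as config modifiers, carry the hash:"ignore" tag so they
// do not participate in the hash.
type simplifiedRequest struct {
	Image     string
	Env       map[string]string
	Modifiers []func() `hash:"ignore"`
}

func main() {
	req := simplifiedRequest{Image: "redis:7", Env: map[string]string{"KEY": "value"}}

	// FormatV2 is the hash format exposed by hashstructure/v2.
	h, err := hashstructure.Hash(req, hashstructure.FormatV2, nil)
	if err != nil {
		panic(err)
	}

	// The hash is attached as a label, so an existing container can later
	// be located by filtering on that label.
	labels := map[string]string{
		"org.testcontainers.hash": fmt.Sprintf("%d", h),
	}
	fmt.Println(labels)
}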

Why is it important?

Create a more consistent experience around reuse, based on the container state rather than forcing the user to set a container name, which could be reused by mistake in a totally different container request.

We expect this change to help the community, but at the same time we warn about its usage in parallel executions: two concurrent test sessions could reach the container creation at the same time, which could lead to two containers being created from the same request.

Related issues

Docs

Render URLs:

Follow-ups

Please try out this branch, and read the docs so that they can be improved if needed 🙏

@mdelapenya mdelapenya requested a review from a team as a code owner September 4, 2024 16:20
@mdelapenya mdelapenya added the "breaking change" label Sep 4, 2024
@mdelapenya mdelapenya self-assigned this Sep 4, 2024
netlify bot commented Sep 4, 2024

Deploy Preview for testcontainers-go ready!

🔨 Latest commit: 799ee15
🔍 Latest deploy log: https://app.netlify.com/sites/testcontainers-go/deploys/66d9710ef06f0c0007ef30bf
😎 Deploy Preview: https://deploy-preview-2768--testcontainers-go.netlify.app
@mdelapenya (Member Author):

@stevenh as always, your wise 👀 are more than welcome!

// can proceed with the creation of the container.
// This is needed because we need to synchronize the creation of the container across
// different test programs.
c, err := p.waitContainerCreationInTimeout(ctx, hash, 5*time.Second)
mdelapenya (Member Author):

I see this as a potential configuration:

  • a property: testcontainers.reuse.search.timeout, and
  • an env var: TESTCONTAINERS_REUSE_SEARCH_TIMEOUT
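
A rough sketch of how that could be wired up; none of this exists in the codebase yet, the env var name is the one proposed above, and reuseSearchTimeout is a hypothetical helper:

import (
	"os"
	"time"
)

// reuseSearchTimeout reads the proposed env var, falling back to the
// current hard-coded 5 seconds when it is unset or invalid.
func reuseSearchTimeout() time.Duration {
	if v := os.Getenv("TESTCONTAINERS_REUSE_SEARCH_TIMEOUT"); v != "" {
		if d, err := time.ParseDuration(v); err == nil {
			return d
		}
	}
	return 5 * time.Second
}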

Member:

Since the PR description says:

We expect this change to help the community, but at the same time we warn about its usage in parallel executions: two concurrent test sessions could reach the container creation at the same time, which could lead to two containers being created from the same request.

wouldn't we want to accept this limitation and allow for the race condition with the current implementation?

mdelapenya (Member Author):

Indeed. Added that comment as an idea for a potential follow-up


aaomidi commented Sep 5, 2024

I wonder if this solves this bug: #2749

I'll do a small code review here as well. Thank you for this!

@stevenh stevenh (Collaborator) left a comment

Done an initial pass

err = req.creatingHook(ctx)
if err != nil {
	return nil, err
}

if req.Reuse {
Collaborator:

question: Not sure about this, I would expect reused containers to still only persist while in use, is that the intent?

The edge case of shutdown while still in use is addressed by the reaper rework PRs, so it should be safe to remove this if the desired behaviour is to share between test runs that fall within a reasonable window.

Comment on lines +1135 to +1136
reuseContainerMx.Lock()
defer reuseContainerMx.Unlock()
Collaborator:

question: This will lock for the duration of the rest of the call, is that the intention?
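
For context, this is the difference the question is getting at; findOrCreateReusedContainer is a hypothetical stand-in for the lookup-and-create step, not the PR's code:

// Current shape: defer holds the mutex until the surrounding function returns.
reuseContainerMx.Lock()
defer reuseContainerMx.Unlock()

// Narrower alternative: release the lock as soon as the critical section ends.
reuseContainerMx.Lock()
c, err := findOrCreateReusedContainer(ctx, hash)
reuseContainerMx.Unlock()
if err != nil {
	return nil, err
}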

Comment on lines +1140 to +1141
req.Labels[core.LabelContainerHash] = fmt.Sprintf("%d", hash.Hash)
req.Labels[core.LabelCopiedFilesHash] = fmt.Sprintf("%d", hash.FilesHash)
Collaborator:

question: do we need multiple hashes?

mdelapenya (Member Author):

This is aligned with the Java implementation. We need these two hashes in order to keep track of the files added before the container is started (the Files attribute in the container request).

c, err := p.waitContainerCreationInTimeout(ctx, hash, 5*time.Second)
if err != nil && !errdefs.IsNotFound(err) {
	// another error occurred different from not found, so we return the error
	return nil, err
Collaborator:

suggestion: Can we wrap the error so we know where it came from?

Comment on lines +1129 to +1132
}

var resp container.CreateResponse
if req.Reuse {
Collaborator:

suggestion: combine the two ifs if the above isn't removed, per the previous question.

if !ok {
	// Calculate the hash of the file content only if it is a file.
	fileContent, err = os.ReadFile(f.HostFilePath)
	if err != nil {
Collaborator:

bug: this will lead to unexplained issues.

// the hash will be zero to avoid breaking the hash calculation.
if f.Reader != nil {
	fileContent, err = io.ReadAll(f.Reader)
	if err != nil {
Collaborator:

bug: this will lead to unexplained issues.

	}
} else {
	ok, err := isDir(f.HostFilePath)
	if err != nil {
Collaborator:

bug: this will lead to unexplained issues.

)

// containerHash represents the hash of the container configuration
type containerHash struct {
Collaborator:

question: are two independent hashes needed?
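
For reference, based on how these values are used earlier in the diff (hash.Hash and hash.FilesHash), the struct presumably carries the two uint64 values produced by hashstructure, along these lines (field types are an assumption):

type containerHash struct {
	Hash      uint64 // hash of the container request, minus ignored fields
	FilesHash uint64 // hash of the contents of the files copied via Files
}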

// Read the file content to calculate the hash; if there is an error reading the file,
// the hash will be zero to avoid breaking the hash calculation.
if f.Reader != nil {
	fileContent, err = io.ReadAll(f.Reader)
Collaborator:

suggest: Use streaming hash instead of reading files into memory.
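
A sketch of what that suggestion could look like, streaming the file through the hash so it never has to be fully buffered; hashFile and the choice of SHA-256 are illustrative, not the PR's code:

import (
	"crypto/sha256"
	"io"
	"os"
)

// hashFile streams the file content through the hash instead of reading
// the whole file into memory with os.ReadFile or io.ReadAll.
func hashFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return nil, err
	}
	return h.Sum(nil), nil
}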

@mdelapenya mdelapenya marked this pull request as draft September 10, 2024 10:05
@mdelapenya (Member Author):

@kiview I'm converting this to a draft until we discuss internally the implications of switching from the container-name-based approach to a hash-based approach. I'll share the discussion here once it takes place.

if req.Reuse {
	// Remove the SessionID label from the request, as we don't want Ryuk to control
	// the container lifecycle in the case of reusing containers.
	delete(req.Labels, core.LabelSessionID)
mdelapenya (Member Author):

We discussed not removing the sessionId label, but instead using a specific label when Ryuk is enabled. TBD
