-
-
Notifications
You must be signed in to change notification settings - Fork 509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Ryuk might shutdown reused Container while connected to it #2445
Comments
As I know reaper is grabbing containers by labels, where session id is already stored. Could you please attach labels with your case? |
Yes, that is correct. Reaper will kill the container based on the matching Containers created on first test run:
Now I'm running tests for a second time, see the output, where the ClickHouse container with ID
At some point during the test, the previous Reaper kills the existing ClickHouse, as you can see with the mapped port returning EOF. And a new container is created:
A new container is created, but the killed container already causes some tests to fail. |
I'm not sure if it is related, but I ran into an issue when I moved tests into different packages (so different TestMain to setup testcontainers k3s). After one package finished, the next dies unless I disable ryuk
|
I think that could indeed be related. Two packages run with two different So either Reaper should be fully re-used as well, with the same semantics (e.g. SessionID not part of the container name), or the 'reused' container needs to be scoped to the session / package as well. Now I could see both making sense from a user perspective, so it has to be a config option somehow. |
I think I'm being hit by the same problem - I have two separate packages using the same
As a workaround, I'm setting |
This should be fixed by testcontainers/moby-ryuk#121, which needs a release and an update to testcontainters-go to use the new image. Try cloning the moby-ryuk repo and running the following in it to replace the image that testcontainers-go uses to see if it does fix: docker build -f linux/Dockerfile -t testcontainers/ryuk:0.7.0 . |
Hi @stevenh ,
The problem is that the 'old' Ryuk instance doesn't get any connection from testcontainers on the second run, and brings down any reused containers with it. (for completeness I did try your guidelines above but running integration tests shortly in succession still makes them fail second time) |
Hi all, sorry for not going back to this issue, for some reason I was looking at other ones. My fault.
This is not correct. The SessionID is obtained by the parent process ID for the test process, which for each package and subpackage is the original I'm also confirming that this issue is hit when reusing containers, is that correct?
I think we should mark the |
@mdelapenya no rush, and thanks for any input you can give. Re:
Yes, as far as I'm able to reproduce and understand the issue it's only hit when reusing containers.
I would very much dislike that, because it makes a major difference in run time for our integration tests. And not knowing when a more comprehensive reuse mode would land (or if this even will remain a 'free' feature??), I would very much like to keep using this feature. And as I stated in my original post I think the solution could be fairly simple:
For which I'm even willing to work on a fix, given there is some guidance on a preferred solution. |
@mdonkers thanks for your feedback. I'm sensitive with your use case and appreciate your input. If the feature is used, we maintainers must take care of the use case. As a last resort, and I'm not advancing anything, just giving alternatives, we could work with you to refine the test suite to avoid the reusability of the container looking for alternatives, but in any case, we must support you with your current setup, as promised by a public API. For the reuse use case, because the container name is fixed by each user, adding the session ID to the container name, and offering a way to retrieve it from the container (there are public methods in the container interface), or using Inspect, would be enough, right? That would be a very simple, and narrowed to the reuse use case, solution. If you want to work on that, I'll be more than happy to review it asap. |
Unfortunately, we have created a rather custom testing setup resembling some BDD structure, that would not allow us to restructure the tests in such a way that the container can otherwise be reused.
Yes, that would work. I'll be happy to look into this. Hopefully can manage this week but in any case I'll give a ping when I have a working PR. |
just curious, what technology is reused in that container? a database? postgres? |
@mdonkers I've thought about this more in depth: if we append the sessionID to the container name, then two different test sessions (two terminals) will see different container names. Is that what you expect? |
This should be fixed by a combination of testcontainers/moby-ryuk#141 and the reaper changes in #2664 |
I think #2768 will solve this. Would appreciate your review 🙏 |
Hi @mdelapenya , Sorry I couldn't get to this earlier. Vacation and deadlines got in the way... My only issue is that now Ryuk doesn't shut down the container at all anymore. I'd like to have the option to enable that. |
You mean the reusable container? But that's expected, right? If we want to reuse the container, ryuk will not take control of its termination. You can leverage the Terminate method of the container and use it when you need it (e.g. in a TestMain function). Is my assumption correct, or you mean anything else? Thanks for trying it out in any case 🙏 |
No, I think its actually useful the reusable container gets torn down by Ryuk. That's happening currently as well. Otherwise I could just have disabled Ryuk.
The problem of this is that we have integration tests in separate packages. So a The problem before was Ryuk tearing down a container that was in-use, by starting test runs too close after each other. In that sense the hashing might not be sufficient to solve that? |
@mdonkers thanks for this, it's great feedback 🙇
But then, how Ryuk would know when to kill the container? With this design, it's possible to reuse the containers across test sessions (another terminal running the tests), so Ryuk, which is unique per test session, doesn't need to add the reusable containers in the death note.
The it would apply to all your containers, not that one that is reusable.
So you have a PostStarts lifecycle hook, or just executing c.Exec after the container starts, right? In this case, the hashing mechanism is not considered, as functions are not hashed (lifecycle hooks) and in-test executions cannot be considered. For the case you mentioned, where you start a DB and populate it with client calls, why not committing that image and starting from it? At the moment, the public APIs don't expose that feature, but you can get access to the underlying client and do it yourself. PTAL at the Ollama module, we do that to commit the image after loading a model. |
Currently it kills it after some timeout (after all connections to Ryuk are gone). Because the whole suite of integration tests runs fast enough (and largely in parallel), there's always some test case with a connection to Ryuk keeping it alive. We don't need reuse across test sessions, only for a single (full) run. But if we don't mark the container as reusable, every test will create its own instance.
Yes, also that :-) Though less of an issue because we're mostly using only 1 DB container.
In the 'framework' we have, every set of tests has But indeed, this all has no effect on the hashing, so that's also why it will create only a single container.
See also above; our tests are now pretty flexible and I simply wouldn't want the overhead of building separate containers somewhere in our CI pipeline and maintaining all of that. It would also mean if we do table migrations, we first need to build a new container before being able to create tests and verify the migrations + code using them works. I'm realizing that it might be hard to grasp our setup and why or how things work. Unfortunately because the test-related code isn't completely separated from business logic, I cannot share it publicly and cleaning up would take too much time. If you want/ need I'm happy to find some other way of giving more insight though. |
I agree with @mdonkers the reaper should still clear down reused containers. Thinking about this, I'm not sure hashing is the right approach, it seems better on the face as it ensures that another container isn't mistakenly used but on the flip side its taking control away from the user. One example that comes to mind is a container that alters the files injected by the container request, who are we to say that the user doesn't want to reuse that container just because the files got updated? Reusing by name puts the user 100% in control, sure they can shoot themselves in the foot, but that's up to them. Overall for this issue, I don't think we need anything more than fixed reaper as per #2728 and testcontainers/moby-ryuk#141 |
Testcontainers version
0.29.1
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host arch
x86
Go version
1.22
Docker version
Docker info
What happened?
As I was improving some integration tests in our own project, I sometimes noticed failures after we switched to reusing containers. Since I was improving the run-time of the tests, I was executing ITs several times after each other to make sure code compilation was not included in the time (
while go clean -testcache && make integration-test; do :; done
. I had a suspicion that me quickly running tests after each other was related to the failures so I did some further investigation.From testcontainers output I saw several times that more than 1 container was created (while
Reuse: true
):As you see, in the same test-run a new ClickHouse container gets created while we never call
Terminate
. Tests start failing as they can no longer connect to the old container (to which they hold a connection based on the mapped port).My suspicion was that Ryuk was for some reason terminating the 'old' ClickHouse container, still live from a previous run.
Looking at the code, this appears indeed what is causing it:
testcontainers-go/docker.go
Line 1177 in c83b93c
SessionID
:testcontainers-go/docker.go
Line 1201 in c83b93c
To fix this, my proposal would be to somehow add the
SessionID
also to the 'reusable' container name. Either implicitly or by some flag or by exposing theSessionID
so the user can add it (currently it's part of theinternal
package so not reachable).I'm happy to work on a fix, if I can get any suggestion for a preferred approach.
Relevant log output
docker ps -a
output after the failure, showing two reapers and one CH container:Additional information
No response
The text was updated successfully, but these errors were encountered: