garden running in bpm does not find/delete existing containers on startup #120
Comments
We have created an issue in Pivotal Tracker to manage this: https://www.pivotaltracker.com/story/show/163269392. The labels on this GitHub issue will be updated when the story is started.
We observe that when garden is stopped via …. We also observe that the depot contains no trace of our container. We can see from BPM code and logs that BPM attempts the following on gdn: …
Outside of bpm in CF, garden is configured to destroy containers on startup. When garden dies, its containers continue to function as normal, and garden kills them the next time it starts up. Within BPM, because of the above, garden containers die when the garden bpm container dies. Within a pid namespace, it is documented in man7 that when you kill the init process (pid 1), the kernel sends SIGKILL to all other processes in the namespace. Experimentally, we've observed that this applies to all nested child pid namespaces as well. The conclusion is that, if we wish to continue to use BPM, then it is expected behaviour for containers to die when garden dies.

TODOs
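The pid-1 behaviour described above can be checked from the host on any Linux box: the NSpid line of /proc/<pid>/status lists a process's pid at each pid-namespace nesting level, so a namespace's init process shows 1 in the last column. A rough sketch (the garden-init pid shown is illustrative, not from a real deployment):

```shell
# Every Linux process reports its pid at each nested pid-namespace level
# on the NSpid line of /proc/<pid>/status (one column per level).
grep NSpid /proc/self/status
# A plain host process shows a single column. For a stranded garden-init
# you would instead see a trailing column of 1, e.g.:
#   NSpid:  23456   1
# i.e. pid 1 inside its own namespace, so killing it makes the kernel
# SIGKILL everything else in that namespace (and, as observed above,
# in nested child namespaces too).
```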
This appears fixed since 1.18.1; are we okay to close?

@BooleanCat ^^
Description
When running garden in bpm, any orphaned containers are not cleaned up on restarts of the garden job. See reproduction steps below.
This affected us in our CI environment, as these orphaned containers left state behind in the form of container network interfaces and network namespace files on the host VM. Subsequent container creations failed during network configuration because of this polluted state, causing Diego cells in CF to become unhealthy periodically.
Links
Environment
1.17.2
Steps to reproduce
1. bosh stop diego-cell (or monit stop garden from the cell)
2. Observe a garden-init process still running from the stranded application instance
3. bosh start diego-cell (or monit start garden from the cell)
4. Observe two garden-init processes: the original stranded application (now pid 1) and another for the rescheduled app
5. Observe that there is no clean-up-container log line corresponding to the stranded container
6. Run ifconfig and see there are two container network interfaces

Note: stopping garden repeatedly seems to only orphan 1 extra garden-init process, but you can see from the stranded container network interfaces that garden failed to fully delete many containers.

Cause
Resolution