Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

up: fix race condition on network connect #10756

Merged
merged 1 commit into from
Jun 29, 2023
Merged

Conversation

milas
Copy link
Contributor

@milas milas commented Jun 29, 2023

What I did
Fix race conditions with network setup during up.

Engine API only allows at most one network to be connected as part of the ContainerCreate API request. Compose will pick the highest priority network.

Afterwards, the remaining networks (if any) are connected before the container is actually started.

The big change here is that, previously, the highest-priority network was connected in the create, and then disconnected and immediately reconnected along with all the others. This was racy because evidently connecting the container to the network as part of the create isn't synchronous, so sometimes when Compose tried to disconnect it, the API would return an error like:

container <id> is not connected to the network <network>

To avoid needing to disconnect and immediately reconnect, the network config logic has been refactored to ensure that it sets up the network config correctly the first time.

Related issue

https://docker.atlassian.net/browse/ENV-241

(not mandatory) A picture of a cute animal, if possible in relation to what you did
a spotted piglet or mini-pig

@milas milas requested a review from a team June 29, 2023 19:54
@milas milas self-assigned this Jun 29, 2023
@milas milas requested review from nicksieger, ndeloof, StefanScherer, ulyssessouza, glours and laurazard and removed request for a team June 29, 2023 19:54
Engine API only allows at most one network to be connected as
part of the ContainerCreate API request. Compose will pick the
highest priority network.

Afterwards, the remaining networks (if any) are connected before
the container is actually started.

The big change here is that, previously, the highest-priority
network was connected in the create, and then disconnected and
immediately reconnected along with all the others. This was
racy because evidently connecting the container to the network
as part of the create isn't synchronous, so sometimes when Compose
tried to disconnect it, the API would return an error like:
```
container <id> is not connected to the network <network>
```

To avoid needing to disconnect and immediately reconnect, the
network config logic has been refactored to ensure that it sets
up the network config correctly the first time.

Signed-off-by: Milas Bowman <milas.bowman@docker.com>
Copy link
Member

@nicksieger nicksieger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix and cleanup!

@codecov
Copy link

codecov bot commented Jun 29, 2023

Codecov Report

Patch coverage: 87.61% and project coverage change: +0.06 🎉

Comparison is base (3906a7a) 58.91% compared to head (10b290e) 58.98%.

Additional details and impacted files
@@            Coverage Diff             @@
##               v2   #10756      +/-   ##
==========================================
+ Coverage   58.91%   58.98%   +0.06%     
==========================================
  Files         113      113              
  Lines        9795     9774      -21     
==========================================
- Hits         5771     5765       -6     
+ Misses       3433     3423      -10     
+ Partials      591      586       -5     
Impacted Files Coverage Δ
pkg/compose/create.go 59.45% <82.35%> (+0.85%) ⬆️
pkg/compose/convergence.go 68.77% <96.55%> (+0.26%) ⬆️
pkg/compose/run.go 55.85% <100.00%> (+2.52%) ⬆️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@ndeloof
Copy link
Contributor

ndeloof commented Jun 29, 2023

Nice! 🎉

@glours glours merged commit 0228437 into docker:v2 Jun 29, 2023
@ToshY
Copy link

ToshY commented Jul 6, 2023

@milas There are multiple reports in #10668 that this fix (released in 2.19.1) solves the not connected to network problem, but subsequently causes other issues in return.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants