Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay Connection#onRemoved while pending #101910

Conversation

DaveCTurner
Copy link
Contributor

Today we call Transport.Connection#onRemoved, notifying any
removed-listeners, when the connection is closed and removed from the
connectedNodes map. However, it's possible for the connection to be
closed while we're still adding it to the map and setting up the
listeners, so this now-dead connection will still be found in the
pendingConnections and may be returned to a future call to
connectToNode even if this call was made after all the
removed-listeners have been called.

With this commit we delay calling the removed-listeners until the
connection is closed and removed from both the connectedNodes and
pendingConnections maps.

Backport of #92546 to 7.17
Relates #100493

Today we call `Transport.Connection#onRemoved`, notifying any
removed-listeners, when the connection is closed and removed from the
`connectedNodes` map. However, it's possible for the connection to be
closed while we're still adding it to the map and setting up the
listeners, so this now-dead connection will still be found in the
`pendingConnections` and may be returned to a future call to
`connectToNode` even if this call was made after all the
removed-listeners have been called.

With this commit we delay calling the removed-listeners until the
connection is closed and removed from both the `connectedNodes` and
`pendingConnections` maps.

Backport of elastic#92546 to 7.17
Relates elastic#100493
@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Network Http and internode communication implementations v7.17.15 labels Nov 8, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team (obsolete) label Nov 8, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner DaveCTurner added backport auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Nov 9, 2023
@DaveCTurner DaveCTurner removed the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Nov 9, 2023
Copy link
Contributor

@Tim-Brooks Tim-Brooks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DaveCTurner DaveCTurner merged commit 7d975ab into elastic:7.17 Nov 9, 2023
11 checks passed
@DaveCTurner DaveCTurner deleted the 2023/11/08/testConcurrentConnectsAndCloses-7.17 branch November 9, 2023 19:53
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 9, 2023
A call to `ConnectionTarget#connect` which happens strictly after all
calls that close connections should leave us connected to the target.
However concurrent calls to `ConnectionTarget#connect` can overlap, and
today this means that a connection returned from an earlier call may
overwrite one from a later call. The trouble is that the earlier
connection attempt may yield a closed connection (it was concurrent with
the disconnections) so we must not let it supersede the newer one.

With this commit we prevent concurrent connection attempts, which avoids
earlier attempts from overwriting the connections resulting from later
attempts.

Backport of elastic#92558
When combined with elastic#101910, closes elastic#100493
elasticsearchmachine pushed a commit that referenced this pull request Nov 9, 2023
A call to `ConnectionTarget#connect` which happens strictly after all
calls that close connections should leave us connected to the target.
However concurrent calls to `ConnectionTarget#connect` can overlap, and
today this means that a connection returned from an earlier call may
overwrite one from a later call. The trouble is that the earlier
connection attempt may yield a closed connection (it was concurrent with
the disconnections) so we must not let it supersede the newer one.

With this commit we prevent concurrent connection attempts, which avoids
earlier attempts from overwriting the connections resulting from later
attempts.

Backport of #92558
When combined with #101910, closes #100493
@DaveCTurner DaveCTurner restored the 2023/11/08/testConcurrentConnectsAndCloses-7.17 branch June 17, 2024 06:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport >bug :Distributed Coordination/Network Http and internode communication implementations Team:Distributed Meta label for distributed team (obsolete) v7.17.15
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants