Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect NodeConnectionsService from stale conns #92558

Conversation

DaveCTurner
Copy link
Contributor

A call to ConnectionTarget#connect which happens strictly after all calls that close connections should leave us connected to the target. However concurrent calls to ConnectionTarget#connect can overlap, and today this means that a connection returned from an earlier call may overwrite one from a later call. The trouble is that the earlier connection attempt may yield a closed connection (it was concurrent with the disconnections) so we must not let it supersede the newer one.

With this commit we prevent concurrent connection attempts, which avoids earlier attempts from overwriting the connections resulting from later attempts.

When combined with #92546, closes #92029

A call to `ConnectionTarget#connect` which happens strictly after all
calls that close connections should leave us connected to the target.
However concurrent calls to `ConnectionTarget#connect` can overlap, and
today this means that a connection returned from an earlier call may
overwrite one from a later call. The trouble is that the earlier
connection attempt may yield a closed connection (it was concurrent with
the disconnections) so we must not let it supersede the newer one.

With this commit we prevent concurrent connection attempts, which avoids
earlier attempts from overwriting the connections resulting from later
attempts.

When combined with elastic#92546, closes elastic#92029
@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Network Http and internode communication implementations v8.7.0 labels Dec 23, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team (obsolete) label Dec 23, 2022
@DaveCTurner
Copy link
Contributor Author

A different (possibly simpler?) approach from #92555. We should choose one.

Copy link
Contributor

@kingherc kingherc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I like using synchronized better than #92555 's approach of using atomic values! Because it makes the code/logic simpler to understand.

Just have one small question you could answer before approval.

Copy link
Contributor

@kingherc kingherc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DaveCTurner DaveCTurner added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jan 3, 2023
@elasticsearchmachine elasticsearchmachine merged commit 1a650ec into elastic:main Jan 3, 2023
@DaveCTurner DaveCTurner deleted the 2022-12-23-fix-92029-alternative branch January 3, 2023 11:56
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Nov 9, 2023
A call to `ConnectionTarget#connect` which happens strictly after all
calls that close connections should leave us connected to the target.
However concurrent calls to `ConnectionTarget#connect` can overlap, and
today this means that a connection returned from an earlier call may
overwrite one from a later call. The trouble is that the earlier
connection attempt may yield a closed connection (it was concurrent with
the disconnections) so we must not let it supersede the newer one.

With this commit we prevent concurrent connection attempts, which avoids
earlier attempts from overwriting the connections resulting from later
attempts.

Backport of elastic#92558
When combined with elastic#101910, closes elastic#100493
elasticsearchmachine pushed a commit that referenced this pull request Nov 9, 2023
A call to `ConnectionTarget#connect` which happens strictly after all
calls that close connections should leave us connected to the target.
However concurrent calls to `ConnectionTarget#connect` can overlap, and
today this means that a connection returned from an earlier call may
overwrite one from a later call. The trouble is that the earlier
connection attempt may yield a closed connection (it was concurrent with
the disconnections) so we must not let it supersede the newer one.

With this commit we prevent concurrent connection attempts, which avoids
earlier attempts from overwriting the connections resulting from later
attempts.

Backport of #92558
When combined with #101910, closes #100493
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug :Distributed Coordination/Network Http and internode communication implementations Team:Distributed Meta label for distributed team (obsolete) v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] NodeConnectionsServiceTests testEventuallyConnectsOnlyToAppliedNodes failing
3 participants