[FEA] Use clientId to reject connection requests for peers with existing UCX endpoints #3106

Open
abellina opened this issue Jul 30, 2021 · 0 comments
Labels
feature request New feature or request P2 Not required for release shuffle things that impact the shuffle plugin

Comments

@abellina
Collaborator

This is a tracking issue for work that is going to go into UCX 1.12, so it won't be done anytime soon. That said, I'd like to use this to track our progress testing it.

The issue is that a connection to a peer executor can be established in two ways: we connect to the peer's UcpListener, or the peer connects to our UcpListener. When we handle a connection from a peer, we know nothing about the remote peer (we just get a "connection request" object from UCX, and it doesn't carry any id). Because of this, we have to create a UCX endpoint and exchange handshake data before we can tell who the peer is, so when both sides connect at about the same time we can lose the race and end up with extra UCX endpoints. This is not a functional bug, but a resource waste we'd like to fix.

In UCX 1.12, executor A should be able to send its executorId with the connection request to a peer (B), and B may reject the request if it has already initiated a request to executor A.
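
To make the intended behavior concrete, here is a minimal sketch of the accept/reject decision on executor B's side. It is plain Java and does not call JUCX at all; `PeerConnectionRegistry`, `markOutbound`, `onConnectionRequest` and the use of a `long` for the client id are all hypothetical names and assumptions for illustration, not the plugin's or UCX's actual API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: not part of JUCX or the shuffle plugin.
final class PeerConnectionRegistry {
    enum Decision { ACCEPT, REJECT }

    // Ids of peers we already initiated a connection to, or accepted one from.
    // Assumption: the client id carried by the connection request fits in a long.
    private final Set<Long> knownPeers = ConcurrentHashMap.newKeySet();

    /** Record an outbound connection attempt; returns false if the peer was already known. */
    boolean markOutbound(long peerExecutorId) {
        return knownPeers.add(peerExecutorId);
    }

    /**
     * Decide what to do with an inbound connection request that carries the
     * sender's id. If we already track this peer, reject the request instead
     * of creating a second endpoint for it.
     */
    Decision onConnectionRequest(long clientId) {
        return knownPeers.add(clientId) ? Decision.ACCEPT : Decision.REJECT;
    }

    /** Number of peers tracked, i.e. the number of endpoints we would hold. */
    int trackedPeers() {
        return knownPeers.size();
    }
}
```

With this rule, a request from a peer we have already connected to is rejected and no duplicate endpoint is created. Note that if A and B both initiate at the same time, each side would reject the other's request, so a real implementation would also need a tie-break; this issue doesn't describe one, and the sketch leaves it out.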

This is blocked by: openucx/ucx#7136, and the JUCX jar + UCX native libraries for 1.12 being available.

@abellina abellina added feature request New feature or request ? - Needs Triage Need team to review and classify shuffle things that impact the shuffle plugin P2 Not required for release labels Jul 30, 2021
@sameerz sameerz removed the ? - Needs Triage Need team to review and classify label Aug 3, 2021