
Fix for UCP Listener created spark.port.maxRetries times #2476

Merged

abellina
Collaborator

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

Closes #2474

This bug appears when the UCP listener connection method is enabled (an internal config that is disabled by default). The code kept creating a listener for every retry attempt, up to spark.port.maxRetries times, instead of exiting as soon as the listener bound successfully.
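
For readers unfamiliar with the pattern, here is a minimal sketch of a bind-retry loop that exits on the first successful bind. It is not the plugin's actual code: `start_listener`, `try_bind`, and `fake_bind` are hypothetical names, and `try_bind` merely stands in for whatever creates the UCP listener. The value 16 for the retry limit is taken from the linked issue title.

```python
from typing import Callable, Optional


def start_listener(try_bind: Callable[[int], bool],
                   start_port: int,
                   max_retries: int) -> Optional[int]:
    """Try successive ports and return the first one that binds.

    `try_bind` is a hypothetical stand-in for creating the listener;
    it returns True when the bind succeeds.
    """
    for attempt in range(max_retries):
        port = start_port + attempt
        if try_bind(port):
            # Stop here. Without this early return the loop would keep
            # creating listeners for every remaining attempt.
            return port
    return None  # every attempt failed


# Usage: ports 5000 and 5001 are "taken", so only three binds are attempted.
attempted = []


def fake_bind(port: int) -> bool:
    attempted.append(port)
    return port >= 5002


bound = start_listener(fake_bind, start_port=5000, max_retries=16)
print(bound, attempted)  # 5002 [5000, 5001, 5002]
```

The early `return` is the whole point of the sketch: the number of bind attempts depends on how many ports are busy, not on the retry limit.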

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>
@abellina added the bug (Something isn't working) and shuffle (things that impact the shuffle plugin) labels May 21, 2021
@abellina
Collaborator Author

@petro-rudenko fyi

jlowe
jlowe previously approved these changes May 21, 2021
Member

@jlowe jlowe left a comment

Minor nit but otherwise lgtm.

@jlowe
Member

jlowe commented May 21, 2021

build

Member

@petro-rudenko petro-rudenko left a comment

Thanks

@abellina abellina merged commit 49b0086 into NVIDIA:branch-21.06 May 21, 2021
@abellina abellina deleted the shuffle/exit_early_ucp_listener branch May 21, 2021 18:59
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Fix for UCP Listener created spark.port.maxRetries times

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

* Update attempt earlier to remove need fix in log
Labels
bug Something isn't working shuffle things that impact the shuffle plugin
Development

Successfully merging this pull request may close these issues.

[BUG] when ucp listener enabled we bind 16 times always
3 participants