Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package][dask] handle failures parsing worker host names #4852

Merged
merged 5 commits into from
Dec 6, 2021

Conversation

jameslamb
Copy link
Collaborator

Contributes to #3867.

This PR fixes mypy errors in dask.py

python-package/lightgbm/dask.py:99: error: Invalid index type "Optional[str]" for "Dict[str, _HostWorkers]"; expected type "str"
python-package/lightgbm/dask.py:383: error: Invalid index type "Optional[str]" for "defaultdict[str, Set[int]]"; expected type "str"

This is mypy correctly pointing out the urlparse(something).hostname can be None, but lightgbm.dask treats as if it is guaranteed to be non-None.

from urllib.parse import urlparse

print(urlparse("https://127.0.0.1:123").hostname)
# 127.0.0.1

print(urlparse("127.0.0.1:123").hostname)
# None

The changes in this PR add explicit handling for that case, so that if it happens lightgbm will raise a more specific and helpful error.

Confirmed that this suppresses the error by running the following.

mypy \
    --exclude='python-package/compile/|python-package/build' \
    --ignore-missing-imports \
    python-package/

Copy link
Collaborator

@jmoralez jmoralez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I think right now the _group_workers_by_host function can't get invalid addresses since it gets them from dask, however it's nice to have that function tested in case it ever needs to be used somewhere else.

@jameslamb
Copy link
Collaborator Author

can't get invalid addresses since it gets them from dask

You're right, I don't know any specific case right now where a worker address could make it to this method that doesn't have a hostname in the parsing result from urlparse().

But I think mypy is right here that it is POSSIBLE for urlparse(address).hostname to be None, and I think we'll be glad to have this specific error message there to make debugging faster.

@jameslamb jameslamb merged commit 630f2e7 into master Dec 6, 2021
@jameslamb jameslamb deleted the mypy-dask-urlparse branch December 6, 2021 18:57
@StrikerRUS StrikerRUS mentioned this pull request Jan 6, 2022
13 tasks
@jameslamb jameslamb mentioned this pull request Oct 7, 2022
40 tasks
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants