Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dask] Add scheduler address to dask config. #7581

Merged
merged 11 commits into from
Jan 21, 2022

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Jan 19, 2022

See added doc for use case.

  • Add user configuration.
  • Bring back to the logic of using scheduler address from dask. This was removed when we were trying to support GKE, now we bring it back and let xgboost try it if direct guess or host IP from user config failed.

Copy link
Member

@RAMitchell RAMitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't review this :) Can a dask person take a look?

Copy link

@kevingerman kevingerman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except for the comment about how the RabitTracker constructor fallback is handled. Just don't want calls in an except block that can cause another exception.

python-package/xgboost/dask.py Outdated Show resolved Hide resolved
python-package/xgboost/dask.py Show resolved Hide resolved
Copy link

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Dask things in here generally seem fine to me. A couple of comments about how the config system is being used.

# 2. Guessed by xgboost `get_host_ip` function
# 3. From dask scheduler
# We try 1 and 3 if 1 is available, otherwise 2 and 3.
valid_config = ["scheduler_address"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When extending the Dask config system we typically create a new YAML with default values in and extend the Dask config with it.

See how we do it in dask-ctl.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for sharing, let me look into that.

for k in dconfig:
if k not in valid_config:
raise ValueError(f"Unknown configuration: {k}")
host_ip: Optional[str] = dconfig.get("scheduler_address", None)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this dconfig object is being passed around instead of accessing dask.config.get("xgboost.scheduler_address") directly.

Copy link
Member Author

@trivialfis trivialfis Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't access it on the worker side for some reason (key not found). I read the doc from https://docs.dask.org/en/stable/configuration.html , it's not quite clear to me how to make sure everything is synced.

Copy link
Member Author

@trivialfis trivialfis Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also read this section https://docs.dask.org/en/stable/configuration.html#downstream-libraries . In XGBoost dask is loaded lazily so having dask in init might be less desirable due to the size of dependency. So I went on and read this:

However, downstream libraries may choose alternative solutions, such as isolating their configuration within their library, rather than using the global dask.config system. All functions in the dask.config module also work with parameters, and do not need to mutate global state.

So far using this parameter passing approach seems to be the easiest way to get 1 parameter in.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Generally the Dask config should be available on the scheduler/workers but that can be deployment dependent.

@trivialfis trivialfis merged commit ef4dae4 into dmlc:master Jan 21, 2022
@trivialfis trivialfis deleted the scheduler-addr branch January 21, 2022 17:56
trivialfis added a commit to trivialfis/xgboost that referenced this pull request Jan 25, 2022
- Add user configuration.
- Bring back to the logic of using scheduler address from dask.  This was removed when we were trying to support GKE, now we bring it back and let xgboost try it if direct guess or host IP from user config failed.
ajschmidt8 added a commit to rapidsai/xgboost that referenced this pull request Jan 26, 2022
[backport] [dask] Add scheduler address to dask config. (dmlc#7581)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants