Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate parameter names by default #159

Merged
merged 3 commits into from
Jan 13, 2023
Merged

Conversation

ExpandingMan
Copy link
Collaborator

I don't really like changing the default parameters will-he nill-he but multiple of us have gotten pretty annoyed by this so this PR turns on warnings for invalid parameter names by default.

It suffers from the usual issue of having its own stderr output so we can't run it through the Julia logger.

@ExpandingMan
Copy link
Collaborator Author

Of course this will also be blocked now thanks to windows. Sigh.

@trivialfis
Copy link
Member

Hi, xgb has a logging callback that can be used to redirect logging to bindings

@trivialfis
Copy link
Member

https://github.com/dmlc/xgboost/blob/72ec0c54841c03f9c7a81b72137234f2519273ab/python-package/xgboost/core.py#L234

@trivialfis
Copy link
Member

trivialfis commented Jan 12, 2023

A little bit context on the default.

The default is set to false not because there's potential danger in the validation implementation. Some bindings copy the internal default values (old scala binding) for training parameters like max leave and pass them into libxgboost even if it's not used. The default is just to avoid breaking those bindings. Once we are sure that all existing bindings out there can pass only those parameters that are needed, we can remove the validate_parameters altogether and just carry out the validation.

Of course this will also be blocked now thanks to windows. Sigh.

Been there. Nothing can stop the fun of knowing Windows CI have just failed.

@ExpandingMan
Copy link
Collaborator Author

Thanks! Can't believe I never noticed XGBRegisterLogCallback. This PR now also calls that on __init__ so logging output from libxgboost now gets redirected through the Julia logger.

@devmotion
Copy link
Contributor

This PR broke our tests: https://github.com/TuringLang/MCMCDiagnosticTools.jl/actions/runs/3914362811/jobs/6691330173#step:4:510 We are not doing anything fancy with XGBoost (in particular not passing any parameters that should be validated), but since we use XGBoost through MLJ I assume that the problem is the list of parameters here: https://github.com/JuliaAI/MLJXGBoostInterface.jl/blob/ab061e4b80069cd7884ee23e4d7bbe4834bdd7e9/src/MLJXGBoostInterface.jl#L54 It seems the different XGBoost models (classifier, regressor, ...) should use different sets of parameters and not every parameter is needed for every model.

@ExpandingMan
Copy link
Collaborator Author

ExpandingMan commented Jan 13, 2023

Not sure why your tests are failing on a warning, however, the way MLJXGBoostInterface.jl works is not ideal. In retrospect the choice to always have every model hold every possible parameter was probably not a great one and is just asking for these kinds of problems.

For the time being, I have made a PR to disable parameter validation in the MLJ interface by default.

@devmotion
Copy link
Contributor

It's an example in the docstrings, and different output (e.g., due to logging) causes doctests to fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants