Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added SKL-like random forest Python API. #4148

Merged
merged 11 commits into from
Mar 12, 2019

Conversation

canonizer
Copy link
Contributor

@canonizer canonizer commented Feb 15, 2019

  • added XGBRFClassifier and XGBRFRegressor classes to Scikit-Learn-like xgboost API
  • added documentation describing how to use xgboost for random forests,
    as well as existing caveats

canonizer and others added 6 commits February 15, 2019 21:05
- added XGBRFClassifier and XGBRFRegressor classes to SKL-like xgboost API
- also added n_gpus and gpu_id parameters to SKL classes
- added documentation describing how to use xgboost for random forests,
  as well as existing caveats
- introduced new parameters to XGBRanker
- also passing **kwargs further in XGBRanker.__init__
Copy link
Member

@RAMitchell RAMitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because many of our internal parameters can be subject to change I do not like to bake them all into the Python API. Parameters such as 'gpu_id' can be passed as kwargs. I don't think this PR should require any changes to the actual function arguments.

As mentioned before, I do not like the use of self.mode to change behaviour, I think this should be handled via polymorphism.

- removed the n_gpus and gpu_id parameters
- removed the mode parameter, using subclass methods instead
- small fixes
@canonizer
Copy link
Contributor Author

I've removed n_gpus and gpu_id. However, I've left tree_method and colsample_bynode. Please indicate if you want me to remove those two (note that other colsample_by* parameters are already part of SKL-like classes).

I've also removed self.mode, and now handle it using subclass methods.

@RAMitchell
Copy link
Member

@trivialfis can I get your review.

@canonizer xgboost's default tree_method is currently 'exact'. This PR would change the sklearn interface to use 'hist' as the default. This would be a huge change for our user base. Arguably we should in fact make this change, but that is a big decision and that we need some internal discussion on.

@trivialfis
Copy link
Member

@RAMitchell Will review shortly.

doc/rf.rst Show resolved Hide resolved
doc/rf.rst Outdated Show resolved Hide resolved
python-package/xgboost/sklearn.py Outdated Show resolved Hide resolved
python-package/xgboost/sklearn.py Outdated Show resolved Hide resolved
python-package/xgboost/sklearn.py Outdated Show resolved Hide resolved
python-package/xgboost/sklearn.py Outdated Show resolved Hide resolved
@trivialfis
Copy link
Member

@RAMitchell @hcho3 WDYT?

Copy link
Member

@RAMitchell RAMitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can't remove the 'silent' parameter yet. Given that this is a public API we should continue to support the parameter alongside 'verbosity' and issue a deprecation warning for at least a few subsequent releases.

Have you manually checked the python documentation @canonizer? It would be good to check if this formats correctly on your local machine.

I am also wondering if we can somehow mark the RF extension to the API as experimental to give ourselves some room to modify it in the near future. @hcho3 WDYT?

Copy link
Collaborator

@hcho3 hcho3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Yes, let’s mark them experimental so that we can fix API design as necessary.

@trivialfis
Copy link
Member

I will make a separated PR for marking this experimental later. Let's merge it now. @canonizer @RAMitchell @hcho3 .

@trivialfis trivialfis merged commit a36c3ed into dmlc:master Mar 12, 2019
@canonizer
Copy link
Contributor Author

#4255 brings back the silent parameter and marks it deprecated.

@hcho3 hcho3 mentioned this pull request Apr 21, 2019
18 tasks
@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2019
@mtjrider mtjrider deleted the fea-ext-rf-api branch August 19, 2019 17:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants