-
Notifications
You must be signed in to change notification settings - Fork 527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RF: python api behaviour refactor #4207
RF: python api behaviour refactor #4207
Conversation
rerun tests |
@venkywonka I just reproduced the issue of CI in plain branch-21.10 locally, so on Monday we'll work on unblocking CI |
that's great @dantegd, thank you 🙏 |
rerun tests |
The latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-approving, just had one comment, though I could deal with in in #4196 after merging this
"the number of samples used for training. " | ||
"Changing `n_bins` to number of training samples." | ||
in str(w[-1].message)) | ||
print(str(w[-1].message)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print(str(w[-1].message)) |
I don't think it is necessary to print the message, maybe only if it is wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yea, that's on me will get rid of it, dante
@gpucibot merge |
rerun tests |
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #4207 +/- ##
===============================================
Coverage ? 86.07%
===============================================
Files ? 231
Lines ? 18633
Branches ? 0
===============================================
Hits ? 16039
Misses ? 2594
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
This PR ⬇️ * fixes rapidsai#4193 and fixes rapidsai#4194 that relates to API incompatibility with dask-ml GridSearchCV * changes the behaviour of cuml RF in the following cases: * In the not-so-uncommon case when `n_bins` > number of rows in training sample, instead of throwing error and exiting, the estimator is made to print a warning and use the `n_bins` as the number of training samples. * When `.predict()` is called using `float64` data, instead of throwing an error asking user to explicitly specify `predict_model="CPU"` and rerun, a warning is displayed and implicity defaults to CPU-based prediction from the default GPU-based prediction. * Corresponding tests to capture the warnings from above added * the estimators now accept both numbers and strings as input for `split_criterion` parameter thus in parity with sklearn's API that takes in strings as criterion. * `split_algo` and `use_experimental_backend` parameters of the estimator class have now been completely removed from both documentation and warnings after deprecation in previous releases (from both single-gpu and dask RF). * `num_classes` parameter of predict and score methods have also been similarly removed Authors: - Venkat (https://github.com/venkywonka) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Rory Mitchell (https://github.com/RAMitchell) URL: rapidsai#4207
This PR ⬇️
n_bins
> number of rows in training sample, instead of throwing error and exiting, the estimator is made to print a warning and use then_bins
as the number of training samples..predict()
is called usingfloat64
data, instead of throwing an error asking user to explicitly specifypredict_model="CPU"
and rerun, a warning is displayed and implicity defaults to CPU-based prediction from the default GPU-based prediction.split_criterion
parameter thus in parity with sklearn's API that takes in strings as criterion.split_algo
anduse_experimental_backend
parameters of the estimator class have now been completely removed from both documentation and warnings after deprecation in previous releases (from both single-gpu and dask RF).num_classes
parameter of predict and score methods have also been similarly removed