Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SLEP010 n_features_in_ attribute #22
SLEP010 n_features_in_ attribute #22
Changes from 10 commits
354a6a0
df083c4
ecff33d
08630ed
5a247e7
732dc34
f26bc32
78a0d8e
8d4ccb6
593e92c
9cee1c9
2f37147
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we say "private" do we mean that we do not authorise third party libraries to rely on this API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. I added a note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(We should probably support the
:issue:
role in this repo)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are categorically different, since estimators implemented according to our developers' guide would continue to work after these checks were added. The current proposal does not do that, without an update to the developers' guide, hence making the new requirement force the version of scikit-learn with which an estimator is compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about when
check_classifiers_classes
was introduced (can't find the original PR)?That check fails on
HistGradientBoostingClassifier
with its default parameters (n_samples_leaves=20 is too high for this dataset with 30 samples).And
HistGradientBoostingClassifier
is definitely implemented according to our developers guide.I'm sure we have tons of instances like that where our tests are so specific that they will break existing well-behaved estimators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StackingClassifier and StackingRegressor required a change in the common tests (I think sample weight tests), so adding those sample weight tests also would have broken other estimators that are implemented according to the developers guide.
In some sense the requirement added here is qualitatively different in that it requires a new attribute. But I'm not sure if that's a difference in practice. We add a test and the third-party developer has his test break, and needs to change the code to make the test work.
I'm not sure if it makes a difference to the third-party developer whether the breakage was due to an implicit detail of the tests or an explicit change of the API. I would argue the second one might actually be less annoying.
What is unfortunate about the change is that it makes it hard for a third-party developer to be compatible with several versions of scikit-learn. However, I would suggest that they keep using
check_array
and implementn_features_in_
themselves.If/when we do feature name consistency, this will might be a bit trickier because it might require a bit more code to implement, but I don't think it's that bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm asking is that these aspects be considered and noted, so that the vote on the SLEP takes this into account rather than allowing the reviewers to miss this compatibility issue.