Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Meta OrdinalClassifier estimator #611

Merged
merged 5 commits into from
Jan 29, 2024

Conversation

FBruzzesi
Copy link
Collaborator

@FBruzzesi FBruzzesi commented Jan 22, 2024

Description

Fix #607 by introducing OrdinalClassifier (meta) estimator.

Checklist

  • Add docstrings
  • Add documentation section
  • Add unit tests

* unit tests

* docs

* change calibration api
Examples of this kind of problem are: predicting customer satisfaction on a scale from 1 to 5, predicting the severity of a disease, predicting the quality of a product, etc.

The [`OrdinalClassifier`][ordinal-classifier-api] is a meta-model that can be used to transform any classifier into an ordinal classifier by fitting N-1 binary classifiers, each handling a specific class boundary, namely: $P(y <= 1), P(y <= 2), ..., P(y <= N-1)$.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a nit, but might it make sense to add this diagram ...

CleanShot 2024-01-26 at 15 57 39

... somewhere over here? ...

CleanShot 2024-01-26 at 15 56 56

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes 110% sense! Collapsed admonition should do the trick

Comment on lines 213 to 228
def score(self, X, y):
"""Returns the accuracy score on the given test data and labels.

Parameters
----------
X : array-like of shape (n_samples, n_features )
The training data.
y : array-like of shape (n_samples,)
The target values.

Returns
-------
score : float
Accuracy score of self.predict(X) wrt. y.
"""
return accuracy_score(y, self.predict(X))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the ClassifierMixin provide this already? If we want to do something custom here, perhaps it's better to re-use the score() method of the underlying classifier?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you are right! I am removing it for now 😊
Not sure what something custom and default would be.

I am personally not paying much attention to accuracy in the ordinal problem I am dealing with, but don't want to put anything very opinionated, so let's go with the hierarchical implementation from ClassifierMixin

Copy link
Owner

@koaning koaning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! It feels like this is about 90% done, but I did find some things to address. Relatively minor things though!

self.classes_ = np.sort(np.unique(y))
self.n_features_in_ = X.shape[1]

if self.n_classes_ < 2:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder ... don't we need at least three classes for ordinal regression to make sense? Any binary classification with two classes is automatically ordinal, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also very true! I cheated for the scikit-learn tests 😂
Let me figure out how to deal with those

Comment on lines +23 to +27
"check_dict_unchanged",
"check_fit2d_1feature",
"check_classifier_data_not_an_array",
"check_classifiers_classes",
"check_classifiers_train",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fair amount of skipping, yet:

  • coverage is still same
  • the internal estimator will most likely deal directly with these cases (e.g. data not array)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems fair.

@FBruzzesi FBruzzesi changed the title WIP: Meta OrdinalClassifier [Feature]: Meta OrdinalClassifier estimator Jan 27, 2024
@FBruzzesi FBruzzesi marked this pull request as ready for review January 27, 2024 10:26
@FBruzzesi FBruzzesi requested a review from koaning January 29, 2024 13:59
@koaning koaning merged commit e65701a into koaning:main Jan 29, 2024
14 checks passed
@FBruzzesi FBruzzesi deleted the ordinal-classification branch January 29, 2024 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Meta Ordinal Classification
2 participants