
[fix] fix test_estimators[LogisticRegression()-check_estimators_unfitted] conformance for gpu support #2109

Merged
merged 2 commits into from
Oct 15, 2024

Conversation

icfaust
Contributor

@icfaust icfaust commented Oct 15, 2024

Description

Fixes GPU conformance failures on private CI. None of the methods of LogisticRegression in scikit-learn return or store sparse arrays, so the sparsity check in _onedal_gpu_predict_supported is unnecessary. Once the check requiring the fitted attributes coef_ or intercept_ is removed, the underlying check_is_fitted calls do what is necessary to pass this sklearn conformance test. Here is an example showing that fitting scikit-learn's LogisticRegression with sparse data still yields a dense numpy array:

from sklearn.linear_model import LogisticRegression
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix(np.eye(10))  # sparse input
y = np.arange(10) % 2
est = LogisticRegression()
est.fit(X, y)
print(type(est.coef_))  # fitted coefficients are dense even for sparse X

Will yield:
<class 'numpy.ndarray'>
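To illustrate why dropping the explicit coef_/intercept_ existence check is safe, here is a minimal sketch (the function name and placement are illustrative, not the actual sklearnex internals): check_is_fitted already raises NotFittedError on an unfitted estimator, which is exactly the behavior the check_estimators_unfitted conformance test expects.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.utils.validation import check_is_fitted


def gpu_predict_supported(est):
    # Illustrative guard: validate the fitted state first; any
    # device-specific conditions would follow this call.
    check_is_fitted(est)
    return True


est = LogisticRegression()
try:
    gpu_predict_supported(est)
    unfitted_rejected = False
except NotFittedError:
    unfitted_rejected = True
print(unfitted_rejected)  # True: the unfitted estimator is rejected

est.fit(np.eye(4), [0, 1, 0, 1])
print(gpu_predict_supported(est))  # True once fitted
```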


Checklist to comply with before moving PR from draft:

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added the respective label(s) to the PR if I have permission to do so.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least summary table with measured data, if performance change is expected.
  • I have provided justification why performance has changed or why changes are not expected.
  • I have provided justification why quality metrics have changed or why changes are not expected.
  • I have extended benchmarking suite and provided corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@icfaust icfaust changed the title ]fox [fix] add check_is_fitted before results check in LogisticRegression._onedal_predict_supported Oct 15, 2024
@icfaust icfaust changed the title [fix] add check_is_fitted before results check in LogisticRegression._onedal_predict_supported [fix] add check_is_fitted before results check in LogisticRegression._onedal_gpu_predict_supported Oct 15, 2024
@icfaust icfaust changed the title [fix] add check_is_fitted before results check in LogisticRegression._onedal_gpu_predict_supported [fix] fix test_estimators[LogisticRegression()-check_estimators_unfitted] conformance for gpu support Oct 15, 2024
@icfaust
Contributor Author

icfaust commented Oct 15, 2024

/intelci: run

@icfaust icfaust marked this pull request as ready for review October 15, 2024 07:09
@icfaust icfaust added the bug Something isn't working label Oct 15, 2024
@icfaust icfaust requested a review from avolkov-intel October 15, 2024 07:36
Contributor

@ahuber21 ahuber21 left a comment


A preferred alternative may be to use issparse(getattr(self, "coef_")) and issparse(getattr(self, "intercept_")) instead.
I realize that right now there's no difference because there are no normal scenarios in which those attributes will be sparse. But we may want to come back to it later.

A second thought is that the order of operations

  • check if GPU is supported
  • check if model is fitted

could be swapped altogether. That would also avoid the AttributeError.

I'm approving, asking you to merge if all CIs are green. Let's keep the comment above in mind for future refactorings.
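The attribute-based guard suggested above could look like this minimal sketch (placement is illustrative; the fitted attribute is coef_, and issparse returns False for the dense ndarrays LogisticRegression actually produces):

```python
import numpy as np
from scipy.sparse import issparse
from sklearn.linear_model import LogisticRegression

est = LogisticRegression().fit(np.eye(4), [0, 1, 0, 1])

# Guard only against genuinely sparse fitted attributes; for
# LogisticRegression's dense ndarrays this is always False.
has_sparse_params = issparse(getattr(est, "coef_")) or issparse(
    getattr(est, "intercept_")
)
print(has_sparse_params)  # False
```

Note that getattr without a default still raises AttributeError on an unfitted estimator, which is why the review also suggests checking fitted-ness before any GPU-support check.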

@icfaust
Contributor Author

icfaust commented Oct 15, 2024

A preferred alternative may be to use issparse(getattr(self, "coef_")) and issparse(getattr(self, "intercept_")) instead. I realize that right now there's no difference because there are no normal scenarios in which those attributes will be sparse. But we may want to come back to it later.

A second thought is that the order of operations

  • check if GPU is supported
  • check if model is fitted

could be swapped altogether. That would also avoid the AttributeError.

I'm approving, asking you to merge if all CIs are green. Let's keep the comment above in mind for future refactorings.

Definitely agree. Checking that the model is fitted before any other step should be a rule.

@icfaust icfaust merged commit 64d71cf into uxlfoundation:main Oct 15, 2024
35 of 48 checks passed