Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes for SKL1.2 #661

Merged
merged 9 commits into from
Dec 9, 2022
Merged

Changes for SKL1.2 #661

merged 9 commits into from
Dec 9, 2022

Conversation

ksaur
Copy link
Collaborator

@ksaur ksaur commented Dec 8, 2022

For #660

@ksaur
Copy link
Collaborator Author

ksaur commented Dec 8, 2022

Now we also have 2 other SKL1.2 bugs

=========================== short test summary info ============================
FAILED tests/test_sklearn_decomposition.py::TestSklearnMatrixDecomposition::test_pca_converter_mle_whiten - sklearn.utils._param_validation.InvalidParameterError: The 'whiten' parameter of PCA must be an instance of 'bool', an instance of 'numpy.bool_' or an instance of 'int'. Got 'arbitrary-variance' instead.
FAILED tests/test_sklearn_normalizer_converter.py::TestSklearnNormalizer::test_normalizer_converter_raises_wrong_type - sklearn.utils._param_validation.InvalidParameterError: The 'norm' parameter of Normalizer must be a str among {'l1', 'max', 'l2'}. Got 'invalid' instead.
===== 2 failed, 356 passed, 272 skipped, 71 warnings in 184.05s (0:03:04) ======

@ksaur ksaur changed the title s/boston/cali/g Changes for SKL1.2 Dec 8, 2022
@ksaur
Copy link
Collaborator Author

ksaur commented Dec 8, 2022

Fixed those two.

And now:

=========================== short test summary info ===========================
FAILED tests/test_backends.py::TestBackends::test_onnx_no_test_data_double - AttributeError: 'DecisionTreeRegressor' object has no attribute 'n_features_'
FAILED tests/test_backends.py::TestBackends::test_onnx_no_test_data_float - AttributeError: 'DecisionTreeRegressor' object has no attribute 'n_features_'
FAILED tests/test_extra_conf.py::TestExtraConf::test_onnx_deafault_n_threads - AttributeError: 'DecisionTreeRegressor' object has no attribute 'n_features_'
FAILED tests/test_extra_conf.py::TestExtraConf::test_pandas_batch_onnxml - AttributeError: 'DecisionTreeRegressor' object has no attribute 'n_features_'
===== 4 failed, 522 passed, 104 skipped, 94 warnings in 287.73s (0:04:47) =====

@ksaur
Copy link
Collaborator Author

ksaur commented Dec 8, 2022

@interesaaat this last one seems to be a conflict with sklearn-onnx and SKL1.2. Double check and we'll open an issue?

The error 'DecisionTreeRegressor' object has no attribute 'n_features_' is in sklearn-onnx convert_sklearn_gradient_boosting_classifier

And from skl:
n_features_ is deprecated in 1.0 and will be removed in 1.2.

@ksaur
Copy link
Collaborator Author

ksaur commented Dec 8, 2022

@interesaaat
Copy link
Collaborator

@interesaaat this last one seems to be a conflict with sklearn-onnx and SKL1.2. Double check and we'll open an issue?

The error 'DecisionTreeRegressor' object has no attribute 'n_features_' is in sklearn-onnx convert_sklearn_gradient_boosting_classifier

And from skl: n_features_ is deprecated in 1.0 and will be removed in 1.2.

Oh I thought that we already fixed this. Let me check.

@interesaaat
Copy link
Collaborator

Yea we fixed in our code but probably sklern-onnx has not upgraded yet. Good job finding this!

@ksaur
Copy link
Collaborator Author

ksaur commented Dec 8, 2022

Yes thanks to @mshr-h in #610 we had n_features_ already deprecated on our side!

@mshr-h
Copy link
Collaborator

mshr-h commented Dec 9, 2022

I've checked load_bostom() API docs and it says there're two alternatives for that, fetch_california_housing() and fetch_openml(name="house_prices", as_frame=True) which is called Ames housing dataset.
The former has 20,640 samples in total. But the latter has 506 which is identical to the Boston dataset.
I guess the Ames housing dataset is a better alternative for us because we want to make sure the output of the HB model is close enough to the output of the SKL model.

@ksaur
Copy link
Collaborator Author

ksaur commented Dec 9, 2022

@mshr-h I tried the Ames dataset but it has various types, objects:

>>> X_train.dtypes
Id             float64
MSSubClass     float64
MSZoning        object
LotFrontage    float64
                ...   
MiscFeature     object

and i want to keep it simple. rather than encoding, etc. Boston was all float64 and so is Cali.

In the meantime, my onnx/sklearn-onnx#952 was merged (but not released yet).
@interesaaat should we try to change the pipeline to pull from head to get the change?

@ksaur
Copy link
Collaborator Author

ksaur commented Dec 9, 2022

I thought these all passed fine (works on my machine!) but it seems with the new skl2onnx we have one more:

TestONNXSVC.test_logistic_regression_onnxml_multi_torch

        for name in names:
>           new_names.append(map[name])
E           KeyError: 'probabilities'`

@interesaaat interesaaat merged commit dc24aa3 into main Dec 9, 2022
@ksaur ksaur deleted the kasaur/datasetfix branch December 9, 2022 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants