Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for passthrough connection to stacking estimator converter #930

Merged
merged 3 commits into from
Nov 22, 2022

Conversation

OKUA1
Copy link
Contributor

@OKUA1 OKUA1 commented Oct 30, 2022

As described in sklearn documentation, both StackingClassifier and StackingRegressor accept a passthrough argument, which (when set to True) allows to pass the original input data to the final estimator creating something similar to a skip connection.

This functionality is not realized in current converter which makes a converted model invalid in cases when passthrough was set to True during training.

@OKUA1 OKUA1 force-pushed the stacking_est_passthrough_conn branch from 17036f6 to cc95d9c Compare October 30, 2022 09:05
Signed-off-by: Oleg Kostromin <kostromin97@gmail.com>
Signed-off-by: Oleg Kostromin <kostromin97@gmail.com>
@OKUA1 OKUA1 force-pushed the stacking_est_passthrough_conn branch from 3cc27d4 to a2b25c1 Compare October 30, 2022 09:34
@xadupre
Copy link
Collaborator

xadupre commented Nov 2, 2022

Is it possible to add a unit test for it?

@OKUA1
Copy link
Contributor Author

OKUA1 commented Nov 2, 2022

@xadupre Yes, working on it right now. Quick question: when testing the regressor, some of the predictions are above the threshold causing the test to fail (the unit test for classification passes without issues). Could it be caused simply by float64->float32 conversion or should I look for another cause? And if so, do you have some advice how to properly debug it?

The test output:

--expected--output--

Arrays are not almost equal to 5 decimals

Mismatched elements: 11 / 125 (8.8%)
Max absolute difference: 2.32062132e-05
Max relative difference: 4.41075741e-06
onnx==1.12.0 onnxruntime==1.12.1 TARGET_OPSET=17
onnx==1.12.0 onnxruntime==1.12.1 TARGET_OPSET=17
onnx==1.12.0 onnxruntime==1.12.1 TARGET_OPSET=17

----------------------------------------------------------------------

Signed-off-by: Oleg Kostromin <kostromin97@gmail.com>
@OKUA1
Copy link
Contributor Author

OKUA1 commented Nov 3, 2022

I added several unit tests which match current ones, but with passthrough = True.

As mentioned earlier, the regression was/is a bit problematic one with deviation in the predictions of onnx/sklearn being slightly above the desired threshold for a small subset of tested samples. After further investigation I did not get any ideas what the cause could be except the originally assumed discrepancy due to the usage of float32 in onnx. For now, I used factor=0.1 in the respective unit test which prevents it from failing but probably not the best thing to do.

@ignore_warnings(category=FutureWarning)
def test_concat_stacking_passthrough(self):

class CustomTransformer:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why not inheriting from Estimator, TransformerMixin instead of implementing a new parser?

@xadupre xadupre merged commit 568375d into onnx:main Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants