CatBoost converter #392
Thanks for the contribution, @monkey0head!
doc_string='test binary classification')
self.assertTrue(catboost_onnx is not None)
# onnx runtime returns zeros as class labels
# dump_data_and_model(X.astype(numpy.float32), catboost_model, catboost_onnx, basename="CatBoostBinClass")
Should this line be uncommented?
Well, this part has a problem :(
The comparison works properly for probabilities but not for labels: the converted model returns only zeros as labels. I consulted the CatBoost team, and they consider it an onnxruntime bug.
This must be fixed and it is probably an error somewhere in the onnx graph.
Thank you for the information. I passed your reply on to the CatBoost team members, and I will update my PR after they fix the bug.
It works now with the new onnxruntime version.
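To illustrate the symptom discussed in this thread, here is a minimal sketch in plain Python with no CatBoost or onnxruntime calls. The helper names (`labels_from_probabilities`, `compare_outputs`) and the tolerance are assumptions made for illustration; they are not part of this PR or of `dump_data_and_model`. The sketch checks probabilities and labels separately, so a run where the probabilities agree but the labels come back as all zeros shows up as a label-only mismatch.

```python
import math


def labels_from_probabilities(probabilities):
    """Pick, for each row, the index of the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


def compare_outputs(expected_probs, actual_probs, actual_labels, tol=1e-5):
    """Return (probabilities_match, labels_match) as two separate flags.

    expected_probs: per-row class probabilities from the original model.
    actual_probs / actual_labels: outputs of the converted model.
    """
    probs_match = len(expected_probs) == len(actual_probs) and all(
        len(e) == len(a)
        and all(math.isclose(x, y, abs_tol=tol) for x, y in zip(e, a))
        for e, a in zip(expected_probs, actual_probs)
    )
    # Labels are derived from the reference probabilities (hypothetical choice).
    labels_match = list(actual_labels) == labels_from_probabilities(expected_probs)
    return probs_match, labels_match
```

With matching probabilities and labels that are all zeros, the first flag is true and the second is false, which is the pattern described above.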
onnxmltools/convert/main.py
Outdated
'Please install/upgrade CatBoost to use this feature.')

if custom_conversion_functions:
warnings.warn('custom_conversion_functions is not supported. Please set it to None.')
Why include these converter arguments if they are not supported? It might be better to remove the arguments entirely. In the code above for the keras converter, these arguments were deprecated, which is why the warning messages were necessary.
I thought all converters had pretty much the same interface, so I added the args. I have discussed the matter with a member of the CatBoost team. I will create a PR to change the CatBoost converter interface so that those args are passed to CatBoost's side. The CatBoost team may implement the functionality in the future. If that is OK, I will update my PR when the change is released.
The signature in onnxmltools is not always the same; it is only consistent in sklearn-onnx. So I would either remove the parameter or raise an exception if the parameter is not None.
Thanks, I removed the arguments that are not supported.
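For reference, the second option raised in this thread (raising an exception when an unsupported parameter is not None) could look roughly like the sketch below. The function `convert_model_sketch` and its return value are hypothetical and are not the onnxmltools API; the PR itself ended up removing the arguments instead.

```python
def convert_model_sketch(model, name=None, custom_conversion_functions=None,
                         custom_shape_calculators=None):
    """Hypothetical converter entry point that rejects unsupported arguments."""
    unsupported = {
        "custom_conversion_functions": custom_conversion_functions,
        "custom_shape_calculators": custom_shape_calculators,
    }
    for arg_name, value in unsupported.items():
        # `is not None` also rejects empty containers, unlike a truthiness check.
        if value is not None:
            raise ValueError(
                "{0} is not supported; leave it as None.".format(arg_name))
    # Placeholder for the real conversion step.
    return {"model": model, "name": name}
```

Compared with a warning, an exception stops the conversion instead of silently ignoring what the caller passed.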
onnxmltools/convert/main.py
Outdated
try:
from catboost.utils import convert_to_onnx_object
except ImportError:
raise RuntimeError('CatBoost is not installed or need to be updated. '
nit: "needs to be updated."
onnxmltools/convert/main.py
Outdated
warnings.warn('custom_shape_calculators is not supported. Please set it to None.')

export_parameters = {
'onnx_domain': 'ai.catboost',
Why not use one of the existing domains?
Thanks! You are right, I will change it to ai.onnx.
Removed.
onnxmltools/convert/main.py
Outdated
@@ -44,6 +44,29 @@ def convert_libsvm(model, name=None, initial_types=None, doc_string='', target_o
custom_conversion_functions, custom_shape_calculators)


def convert_catboost(model, name=None, initial_types=None, doc_string='', target_opset=None,
targeted_onnx=onnx.__version__, custom_conversion_functions=None, custom_shape_calculators=None):
The other converters keep arguments like "targeted_onnx=onnx.__version__, custom_conversion_functions=None, custom_shape_calculators=None" for backward compatibility. Since this is a brand new converter, these arguments can be dropped.
Thanks, done.
onnxmltools/utils/tests_helper.py
Outdated
@@ -212,6 +212,9 @@ def convert_model(model, name, input_types):
model, prefix = convert_lightgbm(model, name, input_types), "LightGbm"
else:
raise RuntimeError("Unable to convert model of type '{0}'.".format(type(model)))
elif model.__class__.__name__.startswith("Cat"):
Is there a more reliable fingerprint for identifying the original model?
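One possible answer, sketched here as an assumption rather than as what the PR adopted: instead of matching on a class-name prefix such as "Cat", check which package defines the model's class. The helper name `looks_like_catboost` is hypothetical, and the sketch assumes that CatBoost model classes live under the `catboost` package. It does not import catboost, so it also works when the package is not installed.

```python
def looks_like_catboost(model):
    """Return True if the model's class, or one of its bases, is defined in catboost."""
    for cls in type(model).__mro__:
        module = getattr(cls, "__module__", "") or ""
        # Match the package itself or any of its submodules, but not
        # unrelated packages whose names merely start with "catboost".
        if module == "catboost" or module.startswith("catboost."):
            return True
    return False
```

Walking the MRO means user-defined subclasses of a CatBoost model are still recognised, while an unrelated class whose name happens to start with "Cat" is not.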
onnxmltools/convert/main.py
Outdated
'onnx_graph_name': name
}

return convert_to_onnx_object(model, export_parameters=export_parameters)
You need to take care of the target_opset argument, which specifies the opset version used in the generated ONNX model.
If you plan to support only one target_opset for now, you need to check target_opset and report an error if the user's target_opset is not the same as the one supported in CatBoost.
Thanks, I now pass target_opset to CatBoost and check it there.
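A check of the kind requested above might look like the following sketch. It is only an illustration: the function name `resolve_target_opset` and the constant `SUPPORTED_OPSET` are hypothetical, the value 9 is a placeholder rather than the opset CatBoost actually supports, and the real check in this PR is done on CatBoost's side.

```python
SUPPORTED_OPSET = 9  # placeholder value, assumed for illustration only


def resolve_target_opset(target_opset, supported=SUPPORTED_OPSET):
    """Return the opset to export with, or fail if the request cannot be met."""
    if target_opset is None:
        # No explicit request: fall back to the single supported version.
        return supported
    if target_opset != supported:
        raise ValueError(
            "target_opset={0} is not supported; only opset {1} is available."
            .format(target_opset, supported))
    return target_opset
```

Failing early keeps a user from receiving a model whose opset differs from the one they asked for.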
There was a problem with binary classification: #392 (comment). The converter for binary classification now works with the new version of onnxruntime (tested locally, and the rt130 tests pass here) but fails with older versions.
Hi! Could you please have a look at the updated pull request? Regarding the binary classification issue: the converter works well with the new version of onnxruntime, but the checks fail on older versions.
I saw some failures in the CI pipeline that block merging your PR. Can you fix them before the PR is merged?
@wenbingl, sure. Fixed!
About adding catboost, see https://github.com/onnx/onnx/blob/master/community/logo_request.md.
Hi! I want to add conversion functionality for CatBoost models.
CatBoost has its own converter to ONNX-ML, and I added an interface for converting CatBoost models with onnxmltools, since it is more convenient for users to convert CatBoost models in the same way as models created with other popular ML toolkits.