Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RandomForestClassifier: ONNX output converts all BRANCH_LEQ to BRANCH_EQ #688

Closed
tiago-rib-goncalves opened this issue Mar 26, 2024 · 1 comment

Comments

@tiago-rib-goncalves
Copy link
Contributor

tiago-rib-goncalves commented Mar 26, 2024

Hey,

After conversion of pyspark.ml RandomForestClassifier to ONNX, when infering with InferenceSession from onnxruntime, all the output probabilities (and, thus, predictions) were the same.

While comparing the output from the ONNX model with the sparkml model, it was noticed that the ONNX model assigned all BRANCH_LEQ nodes to BRANCH_EQ.

Tracing back this issue to onnxmltools/convert/sparkml/operator_converters/tree_helper.py, function rewrite_ids_and_process, it is possible to see that this is happening since the node values for these nodes are contained in an array (L282).

Going further back, this is due to the function sparkml_tree_dataset_to_sklearn in onnxmltools/convert/sparkml/operator_converters/tree_ensemble_common.py. In here, issue appears to be in L31:
threshold.append(item["leftCategoriesOrThreshold"])
which is retrieving an array.

Is there any reason to not using:
threshold.append(item["leftCategoriesOrThreshold"][0] if len(item["leftCategoriesOrThreshold"]) >= 1 else -1.0)
just like it is done in L37 for the tuple case:
threshold.append(item[1][0] if len(item[1]) >= 1 else -1.0)
?

Thank you :)
Best regards,
Tiago

@tiago-rib-goncalves tiago-rib-goncalves changed the title RandomForestClassifier to ONNX: conversion succeeded but retrieves same probability on inference for all observations RandomForestClassifier: ONNX output converts all BRANCH_LEQ to BRANCH_EQ Apr 9, 2024
tiago-rib-goncalves added a commit to tiago-rib-goncalves/onnxmltools that referenced this issue Apr 11, 2024
See issue onnx#688

Signed-off-by: tiago-rib-goncalves <159172975+tiago-rib-goncalves@users.noreply.github.com>
xadupre pushed a commit that referenced this issue May 17, 2024
See issue #688

Signed-off-by: tiago-rib-goncalves <159172975+tiago-rib-goncalves@users.noreply.github.com>
@xadupre
Copy link
Collaborator

xadupre commented May 20, 2024

The PR was merged, I'll close this issue.

@xadupre xadupre closed this as completed May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants