Adds ONNX export support to Tribuo's LinearSGDModels #154

Merged: 15 commits, Aug 18, 2021

@Craigacp (Member) commented Jul 27, 2021

Description

This adds support for exporting LinearSGDModels (classification, regression and multi-label) to ONNX format. The ONNX support will grow over the next few PRs as we add support for other Tribuo models, though it's likely that XGBoost and TensorFlow models will require the use of the Python converter packages to emit ONNX format.

This involves several changes:

  • tribuo-multilabel-core no longer depends on tribuo-classification-sgd at test time; instead, its tests use the dummy classifiers from tribuo-classification-core. This breaks a circular dependency between tribuo-onnx, tribuo-multilabel-core and tribuo-classification-sgd.
  • tribuo-onnx now has a MultiLabelTransformer to allow the loading of multi-label ONNX models.
  • tribuo-core now depends on protobuf and contains the OnnxMl Java file, compiled from ONNX v1.9.0's protobuf definition, in ai.onnx.proto.
  • The ONNX export support lives in org.tribuo.onnx in tribuo-core, and is built around ONNXOperators, which allows the construction of ONNX NodeProtos for the specified operations.
  • In tribuo-common-sgd, some helpers have been added to AbstractLinearSGDModel. These might move to AbstractSGDModel when we add ONNX support to factorization machines.
  • Finally, each LinearSGDModel now implements ONNXExportable, which adds methods to create a ModelProto that encapsulates the whole model and a GraphProto that represents the model computation, plus a save method that writes out an ONNX file.
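
To make the shape of the last point concrete, here is a minimal sketch of an exportable-model interface with the three capabilities described: build a graph, wrap it in a model, and save it to a file. Everything in it is hypothetical and for illustration only. ExportSketch, GraphSketch, ModelSketch, Exportable and ToyLinearModel are stand-in names, not Tribuo's ONNXExportable or the ONNX protobuf classes, and the method signatures are assumptions rather than the ones in this PR. The "Gemm" and "Softmax" strings are real ONNX operator names, but using that pair for a linear classifier is also an assumption here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: stand-in types, not Tribuo's or ONNX's real classes. */
public class ExportSketch {

    /** Hypothetical stand-in for a graph: the model computation as named operations. */
    public static final class GraphSketch {
        public final String inputName;
        public final String outputName;
        public final String[] ops;

        public GraphSketch(String inputName, String outputName, String[] ops) {
            this.inputName = inputName;
            this.outputName = outputName;
            this.ops = ops.clone();
        }
    }

    /** Hypothetical stand-in for a whole model: a graph plus model-level metadata. */
    public static final class ModelSketch {
        public final String domain;
        public final long version;
        public final GraphSketch graph;

        public ModelSketch(String domain, long version, GraphSketch graph) {
            this.domain = domain;
            this.version = version;
            this.graph = graph;
        }

        /** Toy serialisation; a real exporter would write protobuf bytes instead. */
        public byte[] toBytes() {
            String text = domain + ":" + version + ":" + String.join(">", graph.ops);
            return text.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical interface mirroring the three capabilities described above. */
    public interface Exportable {
        GraphSketch exportGraph();

        ModelSketch exportModel(String domain, long version);

        /** Saving is defined once, in terms of the whole-model export. */
        default void save(String domain, long version, Path out) throws IOException {
            Files.write(out, exportModel(domain, version).toBytes());
        }
    }

    /** Toy linear classifier: assumed to map to a linear layer followed by a softmax. */
    public static final class ToyLinearModel implements Exportable {
        @Override
        public GraphSketch exportGraph() {
            return new GraphSketch("input", "output", new String[] {"Gemm", "Softmax"});
        }

        @Override
        public ModelSketch exportModel(String domain, long version) {
            return new ModelSketch(domain, version, exportGraph());
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("toy-model", ".txt");
        new ToyLinearModel().save("org.example", 1L, out);
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

In this sketch the save method is a default method built on the whole-model export, so a model class only has to describe its graph and its wrapping. Whether the real interface is structured that way is not stated in this description.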

As we expand the coverage to ensemble methods, the ONNXExportable interface might change slightly, and the specific naming of the input and output nodes is likely to change (which might induce some method signature changes). I'd like to land this chunk first, though, as a single PR covering the whole of Tribuo would be far too large to review.

One further thing I'm considering is storing the provenance in a machine-readable format inside the ONNX model, and having ONNXExternalModel expose that provenance. However, I don't want everything to depend on both Jackson and protobuf, so the current JSON-based provenance string is not ideal.

Motivation

We'd like to export Tribuo models for use in other environments. ONNX is a popular interchange format with a friendly license, and Tribuo already supports importing models from it.

@Craigacp added the "Oracle employee" label (This PR is from an Oracle employee) Jul 27, 2021

@JackSullivan (Member) left a comment

Looks good to me.

@Craigacp merged commit 3942c87 into main Aug 18, 2021
@Craigacp deleted the onnx-export branch August 21, 2021 01:39
This was referenced Aug 30, 2021