E2E HuggingFace Bert using LTC Backend #912

antoniojkim · 2022-06-07T16:43:28Z

We finally have the full training graph for HuggingFace's implementation of Bert working using the LTC Backend!

The torch_mlir_ltc_backend staging branch still won't be ready to merge into main after this PR, though. We are waiting on a couple of things in upstream PyTorch to be merged in:

Plus, there are a few more minor changes I'd like to make before we merge it in, which will make it easier to maintain.

CC: @ke1337 @henrytwo

- Add empty_strided and as_strided
- Restore zeros_like to op blacklist (without this, tensors will be unintentionally created with a CPU device rather than lazy)
- Check for composite implicit ops and add device data IR
- Also fix codegen for functionalization

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode. Without this, the tensor type created may be mismatched, as ImportOptions may cause vtensor to be used instead of tensor.

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized. Previously, fewer than 3 shapes would be returned if either mean or var didn't exist. Instead, we now initialize them with a vector matching the number of channels.
- Implemented compute_shape_mul
- Fixed bug in reshape shape inference error message
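As a rough illustration of the two shape-inference changes above, here is a minimal Python sketch. The real functions live in the C++ LTC backend; the names `batch_norm_shapes` and `broadcast_shapes`, their signatures, and the assumption that channels sit in dimension 1 are all hypothetical and only meant to show the idea: always return three shapes for batch norm, and use right-aligned broadcasting for an elementwise mul.

```python
from typing import List, Optional, Sequence, Tuple

Shape = Tuple[int, ...]


def batch_norm_shapes(
    input_shape: Sequence[int],
    running_mean_shape: Optional[Sequence[int]] = None,
    running_var_shape: Optional[Sequence[int]] = None,
) -> List[Shape]:
    """Return [output, mean, var] shapes, even if mean/var are missing.

    Hypothetical helper, not the backend's actual API.
    """
    if len(input_shape) < 2:
        raise ValueError("expected an input with a channel dimension")
    # Assumption: (N, C, ...) layout, so channels are dimension 1.
    channel_shape: Shape = (input_shape[1],)
    # Fall back to a per-channel vector when a statistic is uninitialized.
    mean = tuple(running_mean_shape) if running_mean_shape is not None else channel_shape
    var = tuple(running_var_shape) if running_var_shape is not None else channel_shape
    return [tuple(input_shape), mean, var]


def broadcast_shapes(lhs: Sequence[int], rhs: Sequence[int]) -> Shape:
    """Right-aligned broadcasting of two shapes, as an elementwise mul needs."""
    result = []
    for i in range(1, max(len(lhs), len(rhs)) + 1):
        a = lhs[-i] if i <= len(lhs) else 1
        b = rhs[-i] if i <= len(rhs) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(f"shapes {tuple(lhs)} and {tuple(rhs)} do not broadcast")
        # A size-1 dimension stretches to match the other operand.
        result.append(b if a == 1 else a)
    return tuple(reversed(result))
```

With this sketch, `batch_norm_shapes((8, 3, 32, 32))` yields three shapes even though no running statistics were passed, which is the property the fix restores.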

- Remove LazyNativeFunctions::_unsafe_view from autogen
- Blacklist ops to make JIT graph more like output of TS backend
- Print graph when SSA value has mismatch of types and results
- Remove normalize_index from LazyShapeInference
- Fix seeds for LTC example models

- Prune shape inference functions
- Add shape inference function for GenerateSlice
- Add shape inference function for GenerateCopy

henrytwo · 2022-06-07T16:57:23Z

Note that for this PR, we are executing on the JIT graph (which we confirmed matches exec on CPU) and not the emitted MLIR. We're operating under the assumption that torch-mlir is doing its job in emitting MLIR that accurately represents the JIT graph.

In a future PR, we will lower the MLIR to linalg and execute on a reference backend to validate that the MLIR itself is able to produce the correct numerics.
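For readers who want a picture of what "matches exec on CPU" means in practice, below is a minimal, standard-library-only sketch of a numeric comparison between two flattened output lists. The helper name `outputs_match` and the tolerance values are assumptions for illustration; this is not the check used in this PR.

```python
import math
from typing import Sequence


def outputs_match(
    reference: Sequence[float],
    candidate: Sequence[float],
    rel_tol: float = 1e-5,
    abs_tol: float = 1e-8,
) -> bool:
    """Compare flattened outputs element by element within a tolerance.

    Hypothetical helper: `reference` stands for values from a CPU run and
    `candidate` for values from the backend under test.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )
```

The same comparison would apply to a future reference-backend run of the lowered MLIR, with the CPU result kept as the reference.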

silvasean · 2022-06-07T18:16:45Z

Awesome!!! 🥳

Is there anything in particular you would like me to look at in this big PR? Otherwise I trust you folks and LGTM.

antoniojkim · 2022-06-07T18:38:46Z

> Is there anything in particular you would like me to look at in this big PR? Otherwise I trust you folks and LGTM.

Nothing in particular. The changes made are quite isolated to the LTC backend and won't affect any other part of Torch-MLIR.

Thanks for the vote of confidence!

* Update native function definitions
* Add ops to support bert lowering
  - Add empty_strided and as_strided
  - Restore zeros_like to op blacklist (without this, tensors will be unintentionally created with a CPU device rather than lazy)
  - Check for composite implicit ops and add device data IR
  - Also fix codegen for functionalization
* Add autogen to CMakeList
* Remove PyTorch submodule
* Reduced BERT model size
* Print Mark Step status in Torch MLIR LTC debug string
* Apply fixes to work with latest upstream/main
  - Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode. Without this, the tensor type created may be mismatched, as ImportOptions may cause vtensor to be used instead of tensor.
* Update shape inference functions
  - Fixed compute_shape_native_batch_norm when mean and var are uninitialized. Previously, fewer than 3 shapes would be returned if either mean or var didn't exist. Instead, we now initialize them with a vector matching the number of channels.
  - Implemented compute_shape_mul
  - Fixed bug in reshape shape inference error message
* Get MLIR backend more consistent with TS backend
  - Remove LazyNativeFunctions::_unsafe_view from autogen
  - Blacklist ops to make JIT graph more like output of TS backend
  - Print graph when SSA value has mismatch of types and results
  - Remove normalize_index from LazyShapeInference
  - Fix seeds for LTC example models
* Update and clean up shape inference functions
  - Prune shape inference functions
  - Add shape inference function for GenerateSlice
  - Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>

* Add links to onnxai

  Signed-off-by: Tung D. Le <tung@jp.ibm.com>
* Reduce spaces on the left and right sides

  Signed-off-by: Tung D. Le <tung@jp.ibm.com>

Co-authored-by: Alexandre Eichenberger <alexe@us.ibm.com>

antoniojkim and others added 10 commits June 6, 2022 12:10

Update native function definitions

80f61a1

Add autogen to CMakeList

31b2d33

Remove PyTorch submodule

a601b9e

Reduced BERT model size

950e0ec

Print Mark Step status in Torch MLIR LTC debug string

0187a26

Apply fixes to work with latest upstream/main

dac0269

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode. Without this, the tensor type created may be mismatched, as ImportOptions may cause vtensor to be used instead of tensor.

Update and clean up shape inference functions

ec3c9a9

- Prune shape inference functions
- Add shape inference function for GenerateSlice
- Add shape inference function for GenerateCopy

antoniojkim requested review from powderluv, ramiro050 and silvasean June 7, 2022 16:43

antoniojkim self-assigned this Jun 7, 2022

henrytwo self-assigned this Jun 7, 2022

antoniojkim merged commit 80a8f8a into llvm:torch_mlir_ltc_backend Jun 7, 2022

tanyokwok mentioned this pull request Sep 21, 2022

features/bladedisc rebase 20220830 pai-disc/torch-mlir#20

Closed
