Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2E HuggingFace Bert using LTC Backend #912

Merged

Conversation

antoniojkim
Copy link
Collaborator

We finally have the full training graph for HuggingFace's implementation of Bert working using the LTC Backend!

The torch_mlir_ltc_backend staging branch still won't be ready to merge into main after this PR though. We are still waiting on a couple things in upstream PyTorch to be merged in:

Plus, there are a few more minor changes I'd like to make before we merge it in that will help make it easier to maintain.

CC: @ke1337 @henrytwo

antoniojkim and others added 10 commits June 6, 2022 12:10
- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization
- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor
- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message
- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models
- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy
@antoniojkim antoniojkim self-assigned this Jun 7, 2022
@henrytwo henrytwo self-assigned this Jun 7, 2022
@henrytwo
Copy link
Member

henrytwo commented Jun 7, 2022

Note that for this PR, we are executing on the JIT graph (which we confirmed matches exec on CPU) and not the emitted MLIR. We're operating under the assumption that torch-mlir is doing its job in emitting MLIR that accurately represents the JIT graph.

In a future PR, we will lower the MLIR to linalg and execute on a reference backend to validate that the MLIR itself is able to produce the correct numerics.

@silvasean
Copy link
Contributor

silvasean commented Jun 7, 2022

Awesome!!! 🥳

Is there anything in particular you would like me to look at in this big PR? Otherwise I trust you folks and LGTM.

@antoniojkim
Copy link
Collaborator Author

Is there anything in particular you would like me to look at in this big PR? Otherwise I trust you folks and LGTM.

Nothing in particular. The changes made are quite isolated to the LTC backend and won't affect any other part of Torch-MLIR.

Thanks for the vote of confidence!

@antoniojkim antoniojkim merged commit 80a8f8a into llvm:torch_mlir_ltc_backend Jun 7, 2022
antoniojkim added a commit that referenced this pull request Jun 30, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
antoniojkim added a commit that referenced this pull request Jun 30, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
antoniojkim added a commit that referenced this pull request Jul 5, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
antoniojkim added a commit that referenced this pull request Jul 7, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
henrytwo added a commit that referenced this pull request Jul 8, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
henrytwo added a commit that referenced this pull request Jul 8, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
henrytwo added a commit that referenced this pull request Jul 12, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
antoniojkim added a commit that referenced this pull request Jul 15, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
antoniojkim added a commit that referenced this pull request Jul 19, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
antoniojkim added a commit that referenced this pull request Jul 22, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
henrytwo added a commit that referenced this pull request Jul 29, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
henrytwo added a commit that referenced this pull request Jul 29, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
henrytwo added a commit that referenced this pull request Jul 30, 2022
* Update native function definitions

* Add ops to support bert lowering

- Add empty_strided and as_strided

- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)

- Check for composite implicit ops and add device data IR

- Also fix codegen for functionalization

* Add autogen to CMakeList

* Remove PyTorch submodule

* Reduced BERT model size

* Print Mark Step status in Torch MLIR LTC debug string

* Apply fixes to work with latest upstream/main

- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode

  Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor

* Update shape inference functions

- Fixed compute_shape_native_batch_norm when mean and var are uninitialized

  Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.

- Implemented compute_shape_mul

- Fixed bug in reshape shape inference error message

* Get MLIR backend more consistent with TS backend

- Remove LazyNativeFunctions::_unsafe_view from autogen

- Blacklist ops to make JIT graph more like output of TS backend

- Print graph when SSA value has mismatch of types and results

- Remove normalize_index from LazyShapeInference

- Fix seeds for LTC example models

* Update and clean up shape inference functions

- Prune shape inference functions

- Add shape inference function for GenerateSlice

- Add shape inference function for GenerateCopy

Co-authored-by: Henry Tu <henry.tu@cerebras.net>
qedawkins pushed a commit to nod-ai/torch-mlir that referenced this pull request Oct 3, 2022
* Add links to onnxai

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

* Reduce spaces on the left and righ sides

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

Co-authored-by: Alexandre Eichenberger <alexe@us.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants