
C# Sample Error with Phi 3.5 GPU/DML model - Non-zero status code returned while running SkipSimplifiedLayerNormalization node #1074

Closed
AshD opened this issue Nov 19, 2024 · 4 comments
AshD commented Nov 19, 2024

Describe the bug
Running the https://github.com/microsoft/onnxruntime-genai/tree/main/examples/csharp/HelloPhi3V sample
with https://huggingface.co/microsoft/Phi-3.5-vision-instruct-onnx/tree/main/gpu/gpu-int4-rtn-block-32

throws Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Non-zero status code returned while running SkipSimplifiedLayerNormalization node. Name:'/model/layers.0/post_attention_layernorm/SkipLayerNorm' Status Message: D:\a_work\1\s\include\onnxruntime\core/framework/op_kernel_context.h:42 onnxruntime::OpKernelContext::Input Missing Input: model.layers.0.post_attention_layernorm.weight'
on generator.ComputeLogits();

Is the above model not compatible with Microsoft.ML.OnnxRuntimeGenAI?

The CPU model seems to work fine.
https://huggingface.co/microsoft/Phi-3.5-vision-instruct-onnx/tree/main/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4

I am looking to use the DML model for Phi-3.5 vision. I assumed this was it. https://huggingface.co/microsoft/Phi-3.5-vision-instruct-onnx/tree/main/gpu/gpu-int4-rtn-block-32

To Reproduce
Steps to reproduce the behavior:

  1. Run the HelloPhi3V sample
  2. Throws above exception
  3. Replace the package with the Microsoft.ML.OnnxRuntimeGenAI.DirectML NuGet package
  4. Still throws the above exception
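For context, the inference loop in the HelloPhi3V sample looks roughly like the following condensed sketch (paths, the prompt, and `max_length` are placeholders, and the exact API shape may differ slightly between GenAI versions); the exception above surfaces on the `ComputeLogits()` call:

```csharp
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder paths for illustration only.
using var model = new Model(@"path\to\gpu-int4-rtn-block-32");
using var processor = new MultiModalProcessor(model);
using var tokenizerStream = processor.CreateStream();

var images = Images.Load(@"path\to\image.png");
var prompt = "<|user|>\n<|image_1|>\nDescribe this image.<|end|>\n<|assistant|>\n";
using var inputs = processor.ProcessImages(prompt, images);

using var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 3072);
generatorParams.SetInputs(inputs);

using var generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();      // <-- exception is thrown here
    generator.GenerateNextToken();
    Console.Write(tokenizerStream.Decode(generator.GetSequence(0)[^1]));
}
```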

Desktop (please complete the following information):

  • OS: Windows 11, .NET 9, Visual Studio 2022 (latest)
  • Microsoft.ML.OnnxRuntimeGenAI.DirectML v0.5.1 or Microsoft.ML.OnnxRuntimeGenAI v0.5.1 packages
kunal-vaishnavi (Contributor) commented:

> throws Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Non-zero status code returned while running SkipSimplifiedLayerNormalization node. Name:'/model/layers.0/post_attention_layernorm/SkipLayerNorm' Status Message: D:\a_work\1\s\include\onnxruntime\core/framework/op_kernel_context.h:42 onnxruntime::OpKernelContext::Input Missing Input: model.layers.0.post_attention_layernorm.weight

There's a known ONNX Runtime regression in v1.20.0 affecting SkipSimplifiedLayerNormalization. You can downgrade to an older ONNX Runtime version or use a nightly ONNX Runtime build until a patch is released after MS Ignite.

> I am looking to use the DML model for Phi-3.5 vision. I assumed this was it. https://huggingface.co/microsoft/Phi-3.5-vision-instruct-onnx/tree/main/gpu/gpu-int4-rtn-block-32

In case you run into DML issues with ONNX Runtime GenAI v0.5.1, there's also a known ONNX Runtime GenAI regression specific to DML. The fix has been merged here. You can downgrade to ONNX Runtime GenAI v0.5.0, build from source, or wait until a patch is released after MS Ignite.
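The downgrade to v0.5.0 can be done from the project directory with the .NET CLI (package name taken from the thread; swap in the non-DirectML package name if that is the one your project references):

```shell
# Replace the affected v0.5.1 package with the last known-good release, v0.5.0.
dotnet remove package Microsoft.ML.OnnxRuntimeGenAI.DirectML
dotnet add package Microsoft.ML.OnnxRuntimeGenAI.DirectML --version 0.5.0
```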


AshD commented Nov 19, 2024

Thanks @kunal-vaishnavi, version 0.5.0 fixes this issue.

AshD closed this as completed Nov 19, 2024

AshD commented Nov 25, 2024

@kunal-vaishnavi It is working now with the GPU model and the updated packages, but the CPU (13th Gen Core i9) runs at 70% while inferencing with an image. The model is loaded, and I am using the HelloPhi3V sample to test. Is this expected?

kunal-vaishnavi (Contributor) commented:

You can tune performance using ONNX Runtime's SessionOptions.
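For ONNX Runtime GenAI models, session options are typically set in the `genai_config.json` file shipped alongside the model. A minimal sketch, assuming the standard GenAI config layout (the thread counts here are illustrative, not recommendations; limiting `intra_op_num_threads` is one way to cap CPU usage):

```json
{
  "model": {
    "decoder": {
      "session_options": {
        "intra_op_num_threads": 4,
        "inter_op_num_threads": 1
      }
    }
  }
}
```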
