[fp16] model generates NaN results on fp16, while it generates correct results on fp32 #11384
Comments
Sorry, I was using the wrong permission on the shared file. It has now been changed to the right permission. Please retry if you hit a permission issue viewing the shared file. Thanks.
@yetingqiaqia it's somewhat suspicious that the [...]. If neither of those work, please file an issue in https://github.com/microsoft/onnxconverter-common and CC me and xiaowuhu.
Hi @yetingqiaqia, please use this tool https://github.com/microsoft/onnxconverter-common/blob/master/onnxconverter_common/auto_mixed_precision.py to properly convert the fp32 model to an fp16 model.
@BowenBao should we delete onnxmltools.utils.float16_converter?
Indeed, it feels more reasonable to replace the lower-level converter. Created onnx/onnxmltools#543.
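For context, the lower-level onnxmltools converter being discussed does a blanket fp32-to-fp16 cast with no numerical validation. A minimal sketch of how it is typically called (paths here are placeholders, not taken from the thread):

# Sketch of the lower-level onnxmltools float16 converter discussed above.
# It casts fp32 tensors to fp16 without validating outputs, so overflow-prone
# nodes can end up producing NaN/Inf. Paths are placeholders.
import onnx
from onnxmltools.utils.float16_converter import convert_float_to_float16

model_fp32 = onnx.load("./ConvNext-XL/graph.onnx")
model_fp16 = convert_float_to_float16(model_fp32)
onnx.save_model(model_fp16, "./ConvNext-XL-fp16/graph.onnx")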
Thanks @BowenBao.
Example code:

def convert_float32_to_mixed_precision(fp32_model_path, mixed_precision_model_path):
    import onnx
    import numpy as np
    from onnxconverter_common import auto_mixed_precision

    model = onnx.load(fp32_model_path)
    np.random.seed(123)
    test_data = {"input_image": 2 * np.random.rand(8, 3, 384, 384).astype(np.float32) - 1.0}

    # Could also use rtol/atol attributes directly instead of this
    def validate(res1, res2):
        for r1, r2 in zip(res1, res2):
            if not np.allclose(r1, r2, rtol=0.01, atol=0.001):
                return False
        return True

    model_fp16 = auto_mixed_precision.auto_convert_mixed_precision(
        model, test_data, validate, keep_io_types=True)
    onnx.save(model_fp16, mixed_precision_model_path)


fp32_model_path = './ConvNext-XL/graph.onnx'
mixed_precision_model_path = './ConvNext-XL_mixed_precision/graph_mixed_precision.onnx'
print("Convert to mixed precision starts...")
convert_float32_to_mixed_precision(fp32_model_path, mixed_precision_model_path)
print("Conversion finished.")
So, if [...]
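Assuming the keep_io_types=True setting and the "input_image" input name from the example above, the converted model keeps float32 inputs and outputs and can be run roughly like this (a minimal sketch, not the author's actual ConvNextXL_fp16_test.py):

# Minimal sketch: run the mixed-precision model with onnxruntime.
# keep_io_types=True means the graph inputs/outputs stay float32.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "./ConvNext-XL_mixed_precision/graph_mixed_precision.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
x = 2 * np.random.rand(8, 3, 384, 384).astype(np.float32) - 1.0
outputs = sess.run(None, {"input_image": x})
print(any(np.isnan(o).any() for o in outputs))  # expect False if the conversion held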
Thanks @garymm. Using float32 as input is intentional, which shouldn't bring in the NaN issue. As for deleting the float16_converter API: [...]
Let's continue the discussion of what to do about the two APIs in onnx/onnxmltools#543.
Describe the bug
Hi ORT team,
We use fp16 to accelerate model inference. It works fine on many models, but we hit a NaN issue with a new fp16 model, while its fp32 version generates correct results. See below:
Could you help check if there is anything wrong with the fp16 model?
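For reference, a comparison along these lines exposes the problem; it assumes the "input_image" input name and (8, 3, 384, 384) shape used elsewhere in this thread, and that the fp16 model expects float16 inputs (a sketch only, the actual ConvNextXL_fp16_test.py may differ):

# Sketch: run the fp32 and fp16 models on the same input and check for NaNs.
# Input name/shape are assumptions; the fp16 model is assumed to take fp16 inputs.
import numpy as np
import onnxruntime as ort

def run(model_path, feed):
    sess = ort.InferenceSession(
        model_path, providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
    return sess.run(None, feed)

x = 2 * np.random.rand(8, 3, 384, 384).astype(np.float32) - 1.0
out32 = run("./ConvNext-XL/graph.onnx", {"input_image": x})
out16 = run("./ConvNext-XL-fp16/graph.onnx", {"input_image": x.astype(np.float16)})

for o32, o16 in zip(out32, out16):
    print("fp16 NaNs:", np.isnan(o16).any(),
          "| max abs diff:", np.abs(o32 - o16.astype(np.float32)).max())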
Urgency
The ONNX model was converted from a PyTorch model. Its fp32 inference speed is slower than PyTorch. We hope fp16 can accelerate it by about 3x once this NaN issue is resolved. Thanks.
System information
BTW, the issue can be reproduced across different CUDA/cuDNN versions and GPU SKUs, so I don't think they matter.
To Reproduce
https://drive.google.com/file/d/1EykXYJLcL8EeCtTjGM_MVdhkhJXsSagM/view?usp=sharing
It includes:
fp32 model: ConvNext-XL
fp16 model: ConvNext-XL-fp16
test script: ConvNextXL_fp16_test.py