
Extend float16 converter api with auto_mixed_precision #543

Merged
1 commit merged into onnx:main on May 9, 2022

Conversation

BowenBao
Contributor

No description provided.

@lgtm-com

lgtm-com bot commented Apr 28, 2022

This pull request introduces 1 alert when merging 1fe71a6 into 92caa72 - view on LGTM.com

new alerts:

  • 1 for 'import *' may pollute namespace

@garymm

garymm commented Apr 28, 2022

If convert_float_to_float16_model_path is less reliable than auto_convert_mixed_precision, then I think we should delete the former so users don't accidentally use something that doesn't work.
Thoughts?

@yetingqiaqia

Hi @garymm, I am the original author of the convert_float_to_float16_model_path() function. It was built on top of Tom's convert_float_to_float16() function in the same float16.py file. Its original purpose is to support >2GB models, which convert_float_to_float16() cannot handle.

We use this function heavily in our scenarios. We maintain a DL platform called AdsBrain that serves many users' models within Microsoft (mainly Microsoft Ads, with users in MSR, Search, News, etc. as well).

I have two concerns about deleting this function:
(1) Does auto_convert_mixed_precision() also support >2GB models?
(2) Speed: the auto_convert_mixed_precision() API seems to be slow. It appears to scan a range of fp16/fp32 combinations to find the optimal one. In a test on one model in this ticket, it made 52 attempts and took 16 minutes to finish, while convert_float_to_float16_model_path() and convert_float_to_float16() are quick and only took a few seconds. I am not sure whether auto_convert_mixed_precision() can be sped up further. Otherwise, I would ask to keep both APIs.
Thanks.
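
For reference, a minimal sketch of how the lower-level converters are typically invoked. The module path and file names below are assumptions, not taken from this PR; check them against the onnxmltools/onnxconverter-common release you use.

```python
# Minimal sketch of the lower-level fp16 converters.
# Module path and file names are assumptions; verify against your release.
import onnx
from onnxmltools.utils.float16_converter import (
    convert_float_to_float16,
    convert_float_to_float16_model_path,
)

# In-memory conversion: only works for models that fit in a single <2GB protobuf.
fp32_model = onnx.load("model_fp32.onnx")
fp16_model = convert_float_to_float16(fp32_model)
onnx.save(fp16_model, "model_fp16.onnx")

# Path-based variant, added to handle models whose weights exceed 2GB.
fp16_large = convert_float_to_float16_model_path("large_model_fp32.onnx")
onnx.save_model(
    fp16_large,
    "large_model_fp16.onnx",
    save_as_external_data=True,  # keep the serialized proto under the 2GB limit
)
```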

@BowenBao
Contributor Author

My thought is that these are APIs on two different levels. The main fp16 conversion problem is that some (PyTorch) operators don't have an fp16 implementation in ONNX, so directly converting the exported ONNX subgraph to fp16 may cause under/overflow, which leads to issues such as the NaN problem you observed.

auto_convert_mixed_precision is the higher-level, more public-facing API. It uses graph search and validation to ensure the fp16 conversion does not incur under/overflow, at the cost of conversion speed.

convert_float_to_float16() can be considered the lower-level API; it simply updates the dtype. It is the caller's responsibility to ensure the conversion is semantically correct and that the model won't trigger under/overflow.

@yetingqiaqia regarding your concerns, I think auto_convert_mixed_precision can be improved to support >2GB models, if it doesn't already.
As for performance: if it takes a long time to complete, that usually means the model really does have an under/overflow problem. Otherwise it can finish in one pass over the model, and the time cost is very close to a single model execution.
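
To make the distinction concrete, here is a minimal sketch of the higher-level API as documented in onnxconverter-common. The input name, shape, and tolerances below are illustrative, not taken from this PR.

```python
# Sketch of the graph-search based converter from onnxconverter-common.
# Input name, shape, and tolerances are illustrative.
import numpy as np
import onnx
from onnxconverter_common import auto_mixed_precision

fp32_model = onnx.load("model_fp32.onnx")

# Representative inputs used to validate each candidate fp16/fp32 partition.
test_data = {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}

# Searches for the largest set of nodes that can run in fp16 while the model
# outputs stay within the given tolerances of the fp32 baseline.
fp16_model = auto_mixed_precision.auto_convert_mixed_precision(
    fp32_model, test_data, rtol=0.01, atol=0.001, keep_io_types=True
)
onnx.save(fp16_model, "model_fp16.onnx")
```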

@yetingqiaqia

Thanks @BowenBao. I totally agree.

@garymm

garymm commented Apr 29, 2022

Given that, does it make sense to

  • Ensure auto_convert_mixed_precision supports large models and
  • Ensure auto_convert_mixed_precision runs reasonably fast when there are no underflow / overflow issues, then
  • Stop exposing convert_float_to_float16* as part of onnxmltools

?

@BowenBao
Contributor Author

BowenBao commented May 2, 2022

Created issue #544 to track next steps.

@garymm

garymm commented May 4, 2022

@xadupre could you please review?

Signed-off-by: BowenBao <bowbao@microsoft.com>
@lgtm-com

lgtm-com bot commented May 5, 2022

This pull request introduces 1 alert when merging 968a7f1 into 92caa72 - view on LGTM.com

new alerts:

  • 1 for 'import *' may pollute namespace

@BowenBao
Contributor Author

BowenBao commented May 9, 2022

@xadupre could you please merge?

@xadupre xadupre merged commit f87e306 into onnx:main May 9, 2022
@xiaowuhu
Collaborator

@yetingqiaqia Hi Ting, I saw you shared the model on Google Drive (ConvNext-ML_mldelAndScript.zip). Is this model >2GB?

@yetingqiaqia

Thanks @BowenBao, @xiaowuhu and @garymm. It turns out that auto_convert_mixed_precision() indeed doesn't support >2GB models. Our customers reported an urgent case to us that blocks their mainstream plan. During conversion we hit the error ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 8002093824. Could you please have a look and fix it ASAP? Much appreciated!
Bug report: microsoft/onnxconverter-common#215
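
For context (not part of the original report): the 2GB ceiling comes from protobuf's maximum serialized message size, and ONNX normally sidesteps it by storing weights as external data. A hedged sketch of that mechanism, shown for background only and not as a fix for the converter bug; the paths are hypothetical.

```python
# Sketch of ONNX's external-data mechanism, which keeps the ModelProto itself
# under protobuf's 2GB serialization limit. Paths are hypothetical.
import onnx

model = onnx.load("large_model.onnx")  # loads external tensors if the model uses them
onnx.save_model(
    model,
    "large_model_external.onnx",
    save_as_external_data=True,      # move initializer tensors out of the proto
    all_tensors_to_one_file=True,
    location="large_model_external.data",
)
```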
