
Extend float16 converter api with auto_mixed_precision #543

Merged
1 commit merged into onnx:main on May 9, 2022

Conversation

BowenBao
Contributor

No description provided.

@lgtm-com

lgtm-com bot commented Apr 28, 2022

This pull request introduces 1 alert when merging 1fe71a6 into 92caa72 - view on LGTM.com

new alerts:

  • 1 for 'import *' may pollute namespace

@garymm

garymm commented Apr 28, 2022

If convert_float_to_float16_model_path is less reliable than auto_convert_mixed_precision, then I think we should delete the former so users don't accidentally use something that doesn't work.
Thoughts?

@yetingqiaqia

Hi @garymm, I am the original author of the convert_float_to_float16_model_path() function. It was built on top of Tom's convert_float_to_float16() function in the same float16.py file. Its original purpose is to support >2GB models, which convert_float_to_float16() cannot handle.

We use this function heavily in our scenarios. We maintain a DL platform called AdsBrain that serves many users' models within Microsoft (mainly Microsoft Ads, with users in MSR, Search, News, etc. as well).

I have two concerns about deleting this function:
(1) Does auto_convert_mixed_precision() also support >2GB models?
(2) Speed: the auto_convert_mixed_precision() API seems to be slow. It appears to scan a range of fp16/fp32 combinations to find the optimal one. In a test on one model in this ticket, it made 52 attempts and took 16 minutes to finish, while convert_float_to_float16_model_path() and convert_float_to_float16() are quick and only took a few seconds. I am not sure whether auto_convert_mixed_precision() can be sped up further. Otherwise, I would ask to keep both APIs.
Thanks.
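
For reference, a minimal sketch of how the lower-level converters are typically invoked. The module path and file names below are assumptions, not taken from this PR; check them against the onnxmltools/onnxconverter-common release you use.

```python
# Minimal sketch of the lower-level fp16 converters.
# Module path and file names are assumptions; verify against your release.
import onnx
from onnxmltools.utils.float16_converter import (
    convert_float_to_float16,
    convert_float_to_float16_model_path,
)

# In-memory conversion: only works for models that fit in a single <2GB protobuf.
fp32_model = onnx.load("model_fp32.onnx")
fp16_model = convert_float_to_float16(fp32_model)
onnx.save(fp16_model, "model_fp16.onnx")

# Path-based variant, added to handle models whose weights exceed 2GB.
fp16_large = convert_float_to_float16_model_path("large_model_fp32.onnx")
onnx.save_model(
    fp16_large,
    "large_model_fp16.onnx",
    save_as_external_data=True,  # keep the serialized proto under the 2GB limit
)
```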

@BowenBao
Contributor Author

My thought is that these are APIs on two different levels. The main fp16 conversion problem is that some (PyTorch) operators don't have an fp16 implementation in ONNX, so directly converting the exported ONNX subgraph to fp16 may cause under/overflow, which leads to issues such as the NaN problem you observed.

auto_convert_mixed_precision is the higher-level, more public-facing API. It uses graph search and validation to ensure the fp16 conversion does not incur under/overflow, at the cost of conversion speed.

convert_float_to_float16() can be considered the lower-level API; it simply updates the dtype. It is the caller's responsibility to ensure the conversion is semantically correct and that the model won't trigger under/overflow.

@yetingqiaqia regarding your concerns, I think auto_convert_mixed_precision can be improved to support >2GB models, if it doesn't already.
As for performance: if it takes a long time to complete, that usually means the model really does have an under/overflow problem. Otherwise it can finish in one pass over the model, and the time cost is very close to a single model execution.
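
To make the distinction concrete, here is a minimal sketch of the higher-level API as documented in onnxconverter-common. The input name, shape, and tolerances below are illustrative, not taken from this PR.

```python
# Sketch of the graph-search based converter from onnxconverter-common.
# Input name, shape, and tolerances are illustrative.
import numpy as np
import onnx
from onnxconverter_common import auto_mixed_precision

fp32_model = onnx.load("model_fp32.onnx")

# Representative inputs used to validate each candidate fp16/fp32 partition.
test_data = {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}

# Searches for the largest set of nodes that can run in fp16 while the model
# outputs stay within the given tolerances of the fp32 baseline.
fp16_model = auto_mixed_precision.auto_convert_mixed_precision(
    fp32_model, test_data, rtol=0.01, atol=0.001, keep_io_types=True
)
onnx.save(fp16_model, "model_fp16.onnx")
```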

@yetingqiaqia

Thanks @BowenBao. I totally agree.

@garymm

garymm commented Apr 29, 2022

Given that, does it make sense to

  • Ensure auto_convert_mixed_precision supports large models and
  • Ensure auto_convert_mixed_precision runs reasonably fast when there are no underflow / overflow issues, then
  • Stop exposing convert_float_to_float16* as part of onnxmltools

?

@BowenBao
Contributor Author

BowenBao commented May 2, 2022

Created issue #544 to track next steps.

@garymm

garymm commented May 4, 2022

@xadupre could you please review?

Signed-off-by: BowenBao <bowbao@microsoft.com>
@lgtm-com

lgtm-com bot commented May 5, 2022

This pull request introduces 1 alert when merging 968a7f1 into 92caa72 - view on LGTM.com

new alerts:

  • 1 for 'import *' may pollute namespace

@BowenBao
Contributor Author

BowenBao commented May 9, 2022

@xadupre could you please merge?

@xadupre xadupre merged commit f87e306 into onnx:main May 9, 2022
@xiaowuhu
Collaborator

@yetingqiaqia Hi Ting, I saw you shared the model on Google Drive (ConvNext-ML_mldelAndScript.zip). Is this model >2GB?

@yetingqiaqia

Thanks @BowenBao, @xiaowuhu and @garymm. It turns out that auto_convert_mixed_precision() indeed doesn't support >2GB models. Our customers reported an urgent case to us that blocks their mainstream plan. During conversion we hit the error ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 8002093824. Could you please have a look and fix it ASAP? Much appreciated!
Bug report: microsoft/onnxconverter-common#215
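
For context (not part of the original report): the 2GB ceiling comes from protobuf's maximum serialized message size, and ONNX normally sidesteps it by storing weights as external data. A hedged sketch of that mechanism, shown for background only and not as a fix for the converter bug; the paths are hypothetical.

```python
# Sketch of ONNX's external-data mechanism, which keeps the ModelProto itself
# under protobuf's 2GB serialization limit. Paths are hypothetical.
import onnx

model = onnx.load("large_model.onnx")  # loads external tensors if the model uses them
onnx.save_model(
    model,
    "large_model_external.onnx",
    save_as_external_data=True,      # move initializer tensors out of the proto
    all_tensors_to_one_file=True,
    location="large_model_external.data",
)
```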
