OnnxMatMul4Quantizer: Suppress quantizer logs, no tmp_dir #680
Conversation
from onnxruntime.quantization.matmul_4bits_quantizer import MatMul4BitsQuantizer

# MatMul4BitsQuantizer sets basicConfig to INFO and prints verbose logs as INFO, suppress them
# TODO(jambayk): Use the ort logging severity from the workflow if we expose it as environment variable
Actually, the ort logging severity only impacts the logging in the C++ code. When I tried last week, I had to use getLogger().setLevel to change the root logger verbosity to make sure the messages were logged.
So, do we need to change the root logger level to output the ORT Python messages?
Yes, I looked into it; the ort logging severity only works for the C++ code.
I was thinking about whether we would want (in the future) to set the same logging level for ort Python logging by setting the loggers for the ort modules to that level. If we want to do that, we have to store the logging level somewhere, maybe in an environment variable.
We don't need to change the root logger. We only need to change the level of the logger for the given namespace.
I don't think it is recommended for packages to do anything to the root logger or use basicConfig. The ort package should not have done that either.
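For illustration, here is a minimal sketch of the namespace-level approach described above (not the exact code in this PR). It assumes the ORT quantizer module creates its logger with logging.getLogger(__name__), so the child logger name used below is an assumption:

```python
import logging

logging.basicConfig(level=logging.INFO)  # pretend something already configured the root logger

# Set the level on the "onnxruntime" namespace logger rather than the root logger.
logging.getLogger("onnxruntime").setLevel(logging.ERROR)

# A child logger with no explicit level of its own inherits ERROR from its parent,
# so this INFO record is filtered out at the emitting logger.
child = logging.getLogger("onnxruntime.quantization.matmul_4bits_quantizer")
child.info("this is suppressed")
child.error("this is still shown")

# Other namespaces and the root logger are untouched.
logging.getLogger("olive").info("this is still shown")
```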
This week, I had to manually set the root logger level to troubleshoot why quant_pre_process fails silently (the process exits silently). From a troubleshooting point of view, we need to find a way to configure the ORT Python logger level.
This sounds good to me. If the logger is created using logging.getLogger(__name__), it makes things easier for us since we could do logging.getLogger("onnxruntime").setLevel(...) to set it for ort.
Otherwise, I don't know what else we could do without touching the root logger.
Actually, I tried logging.getLogger("onnxruntime").setLevel(...) but the logs were not shown correctly. I haven't spent time investigating the root cause.
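On the TODO about reusing the workflow's ort logging severity: a rough sketch of what that could look like, assuming the severity is exposed through a hypothetical environment variable. The variable name and the severity-to-level mapping below are assumptions for illustration, not an existing Olive or ORT setting. Note that setting a level only controls filtering; for records to actually appear, a handler still has to be attached somewhere (e.g., by the application), which may be why logs did not show up after only calling setLevel.

```python
import logging
import os

# Hypothetical environment variable name -- not an existing Olive/ORT setting.
ORT_PY_SEVERITY_ENV = "OLIVE_ORT_LOG_SEVERITY"

# ORT C++ severity values (0=VERBOSE, 1=INFO, 2=WARNING, 3=ERROR, 4=FATAL)
# mapped onto Python logging levels; the mapping is an assumption.
SEVERITY_TO_LEVEL = {
    0: logging.DEBUG,
    1: logging.INFO,
    2: logging.WARNING,
    3: logging.ERROR,
    4: logging.CRITICAL,
}


def configure_ort_python_logging(default_severity: int = 3) -> None:
    """Apply the workflow's severity to the 'onnxruntime' namespace logger."""
    severity = int(os.environ.get(ORT_PY_SEVERITY_ENV, default_severity))
    logging.getLogger("onnxruntime").setLevel(
        SEVERITY_TO_LEVEL.get(severity, logging.ERROR)
    )
```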
Describe your changes

MatMul4BitsQuantizer uses logging.basicConfig to set the level to INFO and prints very verbose logs at the INFO level. Suppress these logs by manually setting the logger levels to ERROR.

Update the ort version requirement to >=1.16.2 since the quantizer will be added in 1.16.2.

Remove the save to tmp_dir -> load -> save steps since they are not needed. We only need to sort the model topologically before saving it to file; a generic sketch of that step follows below. Refer to this discussion for more context: #641 (comment).
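For illustration only (this is not the code in this PR, which relies on existing utilities): a generic Kahn-style topological sort of the top-level ONNX graph nodes before saving. The function and variable names are made up for the sketch.

```python
import onnx


def topological_sort_nodes(graph: onnx.GraphProto) -> None:
    """Reorder graph.node in place so every node appears after its producers."""
    # Tensors available before any node runs: graph inputs and initializers.
    available = {value_info.name for value_info in graph.input}
    available.update(init.name for init in graph.initializer)
    available.add("")  # optional node inputs are recorded as empty strings

    remaining = list(graph.node)
    ordered = []
    while remaining:
        progressed = False
        pending = []
        for node in remaining:
            if all(name in available for name in node.input):
                ordered.append(node)
                available.update(node.output)
                progressed = True
            else:
                pending.append(node)
        if not progressed:
            raise RuntimeError("graph contains a cycle or a dangling node input")
        remaining = pending

    del graph.node[:]
    graph.node.extend(ordered)


# Usage: sort in place, then save directly -- no round trip through a tmp_dir.
model = onnx.load("model.onnx")
topological_sort_nodes(model.graph)
onnx.save(model, "model_sorted.onnx")
```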
llama.py: Correct the description of the --only_config option. Create a new user_script file for the workflow instead of updating the original one.

Checklist before requesting a review
pre-commit run --all-files
(Optional) Issue link