Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OnnxMatMul4Quantizer: Suppress quantizer logs, no tmp_dir #680

Merged
merged 2 commits into from
Nov 2, 2023

Conversation

jambayk
Copy link
Contributor

@jambayk jambayk commented Nov 1, 2023

Describe your changes

MatMul4BitsQuantizer uses logging.basicConfig to set level to INFO and prints very verbose logs at INFO level. Suppress these logs by manually setting the loggers level to ERROR.
Update ort version requirement to >=1.16.2 since the quantizer will be added to 1.16.2.

Remove the save to tmp_dir -> load -> save steps since it is not needed. We only need to sort the model topologically before saving the model to file. Refer to this discussion for more context #641 (comment).

llama.py: Correct description for --only_config option. Create a new user_script file for the workflow instead of updating the original one.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Format your code by running pre-commit run --all-files
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

@jambayk jambayk changed the title OnnxMatMul4Quantizer: Suppress matmul_4bits_quantizer logs and don't use tmp_dir. OnnxMatMul4Quantizer: Suppress matmul_4bits_quantizer logs, no tmp_dir Nov 1, 2023
examples/llama2/llama2.py Fixed Show resolved Hide resolved
examples/llama2/llama2.py Fixed Show resolved Hide resolved
@jambayk jambayk changed the title OnnxMatMul4Quantizer: Suppress matmul_4bits_quantizer logs, no tmp_dir OnnxMatMul4Quantizer: Suppress quantizer logs, no tmp_dir Nov 1, 2023

from onnxruntime.quantization.matmul_4bits_quantizer import MatMul4BitsQuantizer

# MatMul4BitsQuantizer sets basicConfig to INFO and prints verbose logs as INFO, suppress them
# TODO(jambayk): Use the ort logging severity from the workflow if we expose it as environment variable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the ort logging severity only impact the logging in c++ code. when I tried last week, I have to use getLogger().setLevel to change the root logger verbosity to make sure the messages could be logged out.

so, do we need change the root logger level so to output the ORT python messages?

Copy link
Contributor Author

@jambayk jambayk Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I looked at the ort logging severity only works for c++.
I was thinking whether we would want (in the future) to set the same logging level for ort python logging by setting the logger for the ort modules at the same level. If we want to do that, we have to store the logging level somewhere. maybe an environment variable.

We don't need to change the root logger. Just need to change the level for the logger for the given name space.
I don't think it is recommended for packages to do anything to the root logger or use basic config. The ort package should not have done that too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this week, I have to manually set the root logger level to troubleshoot why the quant_pre_process silently fail(the process exit silently). We need find way to config the ORT python logger level from troubleshooting point of view.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good to me. If the logger is created using logging.getLogger(__name__), it makes it easier for us since we could do logging.getLogger("onnxruntime").setLevel(...) to set it for ort.
Or else, I don't know what else we could do without touching the root logger/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I tried to logging.getLogger("onnxruntime").setLevel(...) but the log is not shown correctly. I don't spend time to investigate the root cause.

@jambayk jambayk merged commit 72bab88 into main Nov 2, 2023
31 checks passed
@jambayk jambayk deleted the jambayk/matmul4-update branch November 2, 2023 00:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants