[Inference] Update fakequant #9140

Merged
merged 25 commits into PaddlePaddle:develop on Sep 14, 2024

Conversation

@lixcli (Contributor) commented Sep 13, 2024

PR types

Others

PR changes

Others

Description

  1. Unify the quantization interface: use `quant_type: a8w8_fp8` to configure FP8 W8A8 quantization (a hedged config sketch follows this list)
  2. Update quantization.md
  3. Update the quantization config for the advertisement-generation dataset
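
As a rough illustration of the unified interface, here is a minimal sketch of a PTQ argument file that selects FP8 W8A8 quantization. Only the `quant_type: a8w8_fp8` value is confirmed by this PR; the other field names (`model_name_or_path`, `do_ptq`, `output_dir`) are illustrative assumptions, not PaddleNLP's exact argument schema.

```python
# Minimal sketch: write a PTQ argument file using the unified quant_type key.
# Only "quant_type": "a8w8_fp8" is confirmed by this PR; every other field
# name here is an assumption for illustration.
import json

ptq_args = {
    "model_name_or_path": "path/to/your/model",  # hypothetical model path
    "quant_type": "a8w8_fp8",   # unified switch for FP8 W8A8 (this PR)
    "do_ptq": True,             # assumed flag for post-training quantization
    "output_dir": "./checkpoints/ptq_fp8",  # illustrative output location
}

with open("ptq_argument.json", "w") as f:
    json.dump(ptq_args, f, indent=2)
```

Swapping the value to `a8w8` or `a8w8c8` would select the int8 variants mentioned in the commit list below, which is the point of routing every quantization mode through a single `quant_type` key.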


paddle-bot bot commented Sep 13, 2024

Thanks for your contribution!

@@ -0,0 +1,100 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

TODO: move the observer-related code to PaddleSlim

@qingqing01 qingqing01 merged commit 0832b59 into PaddlePaddle:develop Sep 14, 2024
15 of 22 checks passed
lvdongyi pushed a commit to lvdongyi/PaddleNLP that referenced this pull request on Sep 14, 2024:
* add a8w8(fp8) a8w8c8(int8) quant_type support
* add llama3.1 and qwen2 ptq config
* reformat quantization.md and argument.py
* update prepare data method for ceval ptq
* fix wint4 config bug
* use independent avg/abs_max observer (a minimal observer sketch follows this list)
* rename fp8 quant_type
* update quantization.md
* remove ceval in run_finetune.py
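
For context on the avg/abs_max observer commit above: during PTQ calibration, an observer watches tensors flowing through the model and produces the quantization scale. Below is a minimal sketch of the abs_max flavor in plain NumPy; the class name and methods are illustrative only, not PaddleSlim's actual observer API. An avg observer would typically average per-batch maxima instead of keeping the global maximum, trading outlier sensitivity for stability.

```python
# Illustrative abs_max observer sketch; not PaddleSlim's real API.
import numpy as np


class AbsMaxObserver:
    """Tracks the running max of |x| over calibration batches."""

    def __init__(self, quant_bits: int = 8):
        self.quant_bits = quant_bits
        self.abs_max = 0.0

    def observe(self, tensor: np.ndarray) -> None:
        # Update the running maximum absolute value seen so far.
        self.abs_max = max(self.abs_max, float(np.abs(tensor).max()))

    def scale(self) -> float:
        # Map abs_max onto the symmetric integer range, e.g. [-127, 127] for int8.
        qmax = 2 ** (self.quant_bits - 1) - 1
        return self.abs_max / qmax if self.abs_max > 0 else 1.0
```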