[Inference] Update fakequant #9140

Merged
merged 25 commits into PaddlePaddle:develop on Sep 14, 2024

Conversation

@lixcli (Contributor) commented Sep 13, 2024

PR types

Others

PR changes

Others

Description

  1. Unify the quantization interface: use `quant_type: a8w8_fp8` to configure FP8 W8A8 quantization (a hedged config sketch follows this list)
  2. Update quantization.md
  3. Update the quantization config for the advertisement-generation dataset
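
As a rough illustration of the unified interface, here is a minimal sketch of a PTQ argument file that selects FP8 W8A8 quantization. Only the `quant_type: a8w8_fp8` value is confirmed by this PR; the other field names (`model_name_or_path`, `do_ptq`, `output_dir`) are illustrative assumptions, not PaddleNLP's exact argument schema.

```python
# Minimal sketch: write a PTQ argument file using the unified quant_type key.
# Only "quant_type": "a8w8_fp8" is confirmed by this PR; every other field
# name here is an assumption for illustration.
import json

ptq_args = {
    "model_name_or_path": "path/to/your/model",  # hypothetical model path
    "quant_type": "a8w8_fp8",   # unified switch for FP8 W8A8 (this PR)
    "do_ptq": True,             # assumed flag for post-training quantization
    "output_dir": "./checkpoints/ptq_fp8",  # illustrative output location
}

with open("ptq_argument.json", "w") as f:
    json.dump(ptq_args, f, indent=2)
```

Swapping the value to `a8w8` or `a8w8c8` would select the int8 variants mentioned in the commit list below, which is the point of routing every quantization mode through a single `quant_type` key.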


paddle-bot bot commented Sep 13, 2024

Thanks for your contribution!

@@ -0,0 +1,100 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

TODO: move the observer-related code to PaddleSlim

@qingqing01 qingqing01 merged commit 0832b59 into PaddlePaddle:develop Sep 14, 2024
15 of 22 checks passed
lvdongyi pushed a commit to lvdongyi/PaddleNLP that referenced this pull request on Sep 14, 2024:
* add a8w8(fp8) a8w8c8(int8) quant_type support
* add llama3.1 and qwen2 ptq config
* reformat quantization.md and argument.py
* update prepare data method for ceval ptq
* fix wint4 config bug
* use independent avg/abs_max observer (a minimal observer sketch follows this list)
* rename fp8 quant_type
* update quantization.md
* remove ceval in run_finetune.py
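
For context on the avg/abs_max observer commit above: during PTQ calibration, an observer watches tensors flowing through the model and produces the quantization scale. Below is a minimal sketch of the abs_max flavor in plain NumPy; the class name and methods are illustrative only, not PaddleSlim's actual observer API. An avg observer would typically average per-batch maxima instead of keeping the global maximum, trading outlier sensitivity for stability.

```python
# Illustrative abs_max observer sketch; not PaddleSlim's real API.
import numpy as np


class AbsMaxObserver:
    """Tracks the running max of |x| over calibration batches."""

    def __init__(self, quant_bits: int = 8):
        self.quant_bits = quant_bits
        self.abs_max = 0.0

    def observe(self, tensor: np.ndarray) -> None:
        # Update the running maximum absolute value seen so far.
        self.abs_max = max(self.abs_max, float(np.abs(tensor).max()))

    def scale(self) -> float:
        # Map abs_max onto the symmetric integer range, e.g. [-127, 127] for int8.
        qmax = 2 ** (self.quant_bits - 1) - 1
        return self.abs_max / qmax if self.abs_max > 0 else 1.0
```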