support fp8 cast WOQ #1746

xin3he · 2024-04-24T06:03:33Z

Type of Change

feature

Description

https://jira.devtools.intel.com/browse/ILITV-3505
Support FP8-cast weight-only quantization (WOQ) using official PyTorch >= 2.2.

Expected Behavior & Potential Risk

PyTorch supports four FP8 dtypes, listed below. In INC, each one is selected by a short string that maps to the corresponding torch dtype.

import torch

# Short INC dtype strings -> PyTorch FP8 dtypes (requires PyTorch >= 2.2).
FP8_MAPPING = {
    "fp8_e5m2": torch.float8_e5m2,
    "fp8_e5m2fnuz": torch.float8_e5m2fnuz,
    "fp8_e4m3fn": torch.float8_e4m3fn,
    "fp8_e4m3fnuz": torch.float8_e4m3fnuz,
}

An introduction to the differences between FP8 data types:
Float8E4M3FN, Float8E4M3FNUZ, Float8E5M2, Float8E5M2FNUZ
stablehlo/rfcs/20230321-fp8_fnuz.md at main · openxla/stablehlo (github.com)
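To make the difference between the four formats concrete, the sketch below derives the largest finite value of each one from its bit layout. The `Fp8Format` class and `max_finite` function are hypothetical helpers written for this illustration, not part of INC or PyTorch; the exponent biases and reserved encodings follow the format definitions described in the references above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp8Format:
    exp_bits: int   # number of exponent bits
    man_bits: int   # number of mantissa bits
    bias: int       # exponent bias
    reserved: str   # "ieee", "fn", or "fnuz": which top encodings are special


# Hypothetical table keyed by the same short strings INC uses.
FP8_FORMATS = {
    "fp8_e5m2": Fp8Format(5, 2, 15, "ieee"),
    "fp8_e5m2fnuz": Fp8Format(5, 2, 16, "fnuz"),
    "fp8_e4m3fn": Fp8Format(4, 3, 7, "fn"),
    "fp8_e4m3fnuz": Fp8Format(4, 3, 8, "fnuz"),
}


def max_finite(fmt: Fp8Format) -> float:
    """Largest finite magnitude representable in the given FP8 format."""
    top_exp = (1 << fmt.exp_bits) - 1
    top_man = (1 << fmt.man_bits) - 1
    if fmt.reserved == "ieee":
        top_exp -= 1  # the all-ones exponent encodes inf/NaN
    elif fmt.reserved == "fn":
        top_man -= 1  # only all-ones exponent with all-ones mantissa is NaN
    # "fnuz": NaN uses the negative-zero bit pattern, so no top code is reserved
    return (1 + top_man / (1 << fmt.man_bits)) * 2.0 ** (top_exp - fmt.bias)


if __name__ == "__main__":
    for name, fmt in FP8_FORMATS.items():
        print(f"{name}: max finite = {max_finite(fmt)}")
```

The E5M2 variants trade mantissa precision for range, while the E4M3 variants keep one more mantissa bit and saturate much earlier, which matters when weights are cast without a scale.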

How has this PR been tested?

local test

Signed-off-by: xin3he <xin3.he@intel.com>

github-actions · 2024-04-24T06:03:58Z

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Code Scan Tests workflow

Check ID	Status
Code-Scan	success	✅
Code-Scan (Bandit Code Scan Bandit)	success	✅
Code-Scan (DocStyle Code Scan DocStyle)	success	✅
Code-Scan (Pylint Code Scan Pylint)	success	✅

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/algorithms/weight_only/utility.py, neural_compressor/torch/quantization/config.py.

🟢 Model Tests 3x workflow

Check ID	Status
Model-Test-3x	success	✅
Model-Test-3x (Generate Report GenerateReport)	success	✅
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4)	success	✅
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb)	success	✅
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml)	success	✅

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/algorithms/weight_only/utility.py, neural_compressor/torch/quantization/config.py.

🟢 Unit Tests 3x-PyTorch workflow

Check ID	Status
UT-3x-Torch	success	✅
UT-3x-Torch (Coverage Compare CollectDatafiles)	success	✅
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch)	success	✅
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline)	success	✅

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/rtn.py, neural_compressor/torch/algorithms/weight_only/utility.py, neural_compressor/torch/quantization/config.py, test/3x/torch/quantization/weight_only/test_rtn.py.

Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

Signed-off-by: xin3he <xin3.he@intel.com>

xin3he added 2 commits April 24, 2024 13:35

support fp8 cast WOQ

925b0d9

Signed-off-by: xin3he <xin3.he@intel.com>

fix bug

f09f460

Signed-off-by: xin3he <xin3.he@intel.com>

xin3he marked this pull request as draft April 24, 2024 06:03

add log

701ce46

Signed-off-by: xin3he <xin3.he@intel.com>

xin3he marked this pull request as ready for review April 24, 2024 07:10

xin3he requested review from Kaihui-intel and yiliu30 April 24, 2024 07:41

Merge branch 'master' into xinhe/woq_fp8

c5b3932

yiliu30 approved these changes Apr 25, 2024

no cover for use_qdq=False

b6fc222

Kaihui-intel approved these changes Apr 26, 2024

chensuyue merged commit 57ed613 into master Apr 26, 2024
26 of 27 checks passed

chensuyue deleted the xinhe/woq_fp8 branch April 26, 2024 06:07

zehao-intel pushed a commit that referenced this pull request Apr 26, 2024

support fp8 cast WOQ (#1746)

0f588df

Signed-off-by: xin3he <xin3.he@intel.com>
