Rename matmul_4bits_quantizer.py to matmul_nbits_quantizer.py #24472

tianleiwu · 2025-04-19T00:04:12Z

Description

Rename filename and class name since it supports 4 and 8 bits.
Update HQQWeightOnlyQuantizer to support 8 bits.
Update some comments.

Motivation and Context

#24384 added 8 bits support for the default weight only quantizer.
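For downstream code that has to run against releases from both before and after this rename, one option is an import fallback. The following is a minimal sketch, not part of this PR. It assumes the installed package exposes the file as `onnxruntime.quantization.matmul_nbits_quantizer`, and that older releases expose `MatMul4BitsQuantizer` from `onnxruntime.quantization.matmul_4bits_quantizer`. The helper name `import_first_available` is hypothetical.

```python
import importlib


def import_first_available(candidates):
    """Return the first attribute importable from a list of (module, attribute) pairs.

    Returns None when no candidate can be resolved.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this installation; try the next candidate
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return attr
    return None


# New location first, then the pre-rename location.
# Both module paths and the old class name are assumptions about the installed package.
QUANTIZER_CANDIDATES = [
    ("onnxruntime.quantization.matmul_nbits_quantizer", "MatMulNBitsQuantizer"),
    ("onnxruntime.quantization.matmul_4bits_quantizer", "MatMul4BitsQuantizer"),
]

# None when onnxruntime is not installed or exposes neither name.
quantizer_cls = import_first_available(QUANTIZER_CANDIDATES)
```

Resolving the class once at import time keeps the version check in one place, so call sites do not need to know which name is in use.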

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/python/tools/quantization/matmul_nbits_quantizer.py

onnxruntime/python/tools/transformers/models/llama/convert_to_onnx.py

onnxruntime/python/tools/transformers/models/phi2/convert_to_onnx.py

onnxruntime/python/tools/quantization/matmul_nbits_quantizer.py

jiafatom

Can we rename matmul_4bits_quantizer.py to matmul_quantizer.py? Previously we supported 4 bits only, so we added 4bits to the name. It looks like a plain matmul_quantizer to me.

tianleiwu · 2025-04-19T06:08:56Z

The nbits naming follows the MatMulNBits op.
matmul_quantizer is too general. For example, weights can also be quantized to fp8 or fp4, and we would then need a different quantizer name for that.
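To illustrate what the bit width in an integer "nbits" quantizer controls, here is a generic sketch of uniform asymmetric quantization of one block of weights. It is not the implementation in matmul_nbits_quantizer.py and does not reproduce the MatMulNBits storage layout. The function names are hypothetical, and the 4/8 restriction simply mirrors the bit widths mentioned in this PR.

```python
def quantize_block(values, bits):
    """Quantize a block of floats to unsigned `bits`-bit integers (uniform, asymmetric)."""
    if bits not in (4, 8):
        raise ValueError("this sketch only covers 4 and 8 bits")
    levels = (1 << bits) - 1  # 15 for 4 bits, 255 for 8 bits
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = max(0, min(levels, round(-lo / scale)))
    quantized = [max(0, min(levels, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize_block(quantized, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, 0.0, 0.5, 1.0]
for bits in (4, 8):
    q, scale, zp = quantize_block(weights, bits)
    restored = dequantize_block(q, scale, zp)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(bits, q, round(max_err, 4))
```

Only the number of levels changes between 4 and 8 bits. Floating-point formats such as fp8 or fp4 do not fit this integer scheme, which is the distinction the naming argument above relies on.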

## Describe your changes

onnxruntime uses MatMulNBitsQuantizer since 1.22.0, due to microsoft/onnxruntime#24472

## Checklist before requesting a review

- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
- [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR.

## (Optional) Issue link

…oft#24472)

### Description

* Rename filename and class name since it supports 4 and 8 bits.
* Update HQQWeightOnlyQuantizer to support 8 bits.
* Update some comments.

### Motivation and Context

microsoft#24384 added 8 bits support for the default weight only quantizer.

Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>

tianleiwu requested a review from fajin-corp April 19, 2025 00:04

github-actions bot reviewed Apr 19, 2025

github-advanced-security bot found potential problems Apr 19, 2025

onnxruntime/python/tools/transformers/models/phi2/convert_to_onnx.py Fixed

onnxruntime/python/tools/transformers/models/phi2/convert_to_onnx.py Fixed

Rename matmul_4bits_quantizer.py to matmul_nbits_quantizer.py

538309d

tianleiwu force-pushed the tlwu/rename_nbits_quantizier branch from 459307e to 538309d Compare April 19, 2025 00:11

minor refactoring

55e184a

tianleiwu requested review from jiafatom and kunal-vaishnavi April 19, 2025 00:26

remove debug code

e46f157

kunal-vaishnavi reviewed Apr 19, 2025

onnxruntime/python/tools/quantization/matmul_nbits_quantizer.py (resolved)

kunal-vaishnavi approved these changes Apr 19, 2025

jiafatom reviewed Apr 19, 2025

tianleiwu merged commit 0d26928 into main Apr 19, 2025
86 of 89 checks passed

tianleiwu deleted the tlwu/rename_nbits_quantizier branch April 19, 2025 06:09

jiafatom mentioned this pull request Apr 23, 2025

Adapt to MatMulNBitsQuantizer microsoft/Olive#1787

Merged

6 tasks
