Rename matmul_4bits_quantizer.py to matmul_nbits_quantizer.py #24472


Merged · 3 commits merged into main on Apr 19, 2025

Conversation

@tianleiwu (Contributor) commented Apr 19, 2025

Description

  • Rename filename and class name since it supports 4 and 8 bits.
  • Update HQQWeightOnlyQuantizer to support 8 bits.
  • Update some comments.

Motivation and Context

#24384 added 8 bits support for the default weight only quantizer.
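For context, MatMulNBits-style weight-only quantization stores weights block-wise with one scale per block, which is why the same quantizer can cover both 4 and 8 bits. The sketch below is an illustrative symmetric scheme in NumPy, not onnxruntime's actual implementation; the function names `quantize_blockwise` and `dequantize_blockwise` are hypothetical. It shows why 8-bit quantization preserves more precision than 4-bit under the same block size.

```python
# Illustrative block-wise n-bit weight quantization (hypothetical helpers,
# not onnxruntime's internal API).
import numpy as np

def quantize_blockwise(weights, bits=4, block_size=32):
    """Symmetric per-block quantization to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4 bits, 127 for 8 bits
    flat = weights.reshape(-1, block_size)
    scales = np.abs(flat).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(flat / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

def dequantize_blockwise(q, scales, shape):
    """Reconstruct float weights from quantized blocks and scales."""
    return (q.astype(np.float32) * scales).reshape(shape)

w = np.random.default_rng(0).normal(size=(8, 32)).astype(np.float32)
q4, s4 = quantize_blockwise(w, bits=4)
q8, s8 = quantize_blockwise(w, bits=8)
err4 = np.abs(dequantize_blockwise(q4, s4, w.shape) - w).max()
err8 = np.abs(dequantize_blockwise(q8, s8, w.shape) - w).max()
assert err8 < err4  # 8-bit keeps noticeably more precision than 4-bit
```

With a per-block scale of `max(|w|)/qmax`, the worst-case rounding error shrinks roughly by a factor of 16 when moving from 4 to 8 bits, which is the precision/size trade-off the renamed quantizer exposes.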

@tianleiwu tianleiwu requested a review from fajin-corp April 19, 2025 00:04
@github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@tianleiwu force-pushed the tlwu/rename_nbits_quantizier branch from 459307e to 538309d on April 19, 2025 00:11
@jiafatom (Contributor) left a comment

Can we rename matmul_4bits_quantizer.py to matmul_quantizer.py? Previously we had 4 bits only, so "4bits" was put into the name. Now it seems like a general matmul quantizer to me.

@tianleiwu (Contributor, Author) commented Apr 19, 2025

> Can we rename matmul_4bits_quantizer.py to matmul_quantizer.py? Previously we have 4 bits only, so we add 4bits in between. It seems a matmul_quantizer to me

The nbits naming follows the MatMulNBits op.
matmul_quantizer is too general: weights could also be quantized to fp8 or fp4, for example, and those would need a different quantizer name.
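For downstream users, the practical impact of the rename is the changed import path. A hedged migration helper is sketched below; the module and class names follow this PR (the new name ships in onnxruntime 1.22.0 per the Olive commit referencing it), but whether each import succeeds depends on the onnxruntime version actually installed, and the helper name `get_nbits_quantizer_class` is hypothetical.

```python
# Hedged migration helper: resolve the weight-only quantizer class across
# onnxruntime versions. Returns None if the quantization module is absent.
def get_nbits_quantizer_class():
    """Prefer the new MatMulNBitsQuantizer name (post-rename), fall back
    to the old MatMul4BitsQuantizer, else return None."""
    try:
        # New name, after this PR (onnxruntime >= 1.22.0)
        from onnxruntime.quantization.matmul_nbits_quantizer import MatMulNBitsQuantizer
        return MatMulNBitsQuantizer
    except ImportError:
        pass
    try:
        # Old name, present in earlier releases
        from onnxruntime.quantization.matmul_4bits_quantizer import MatMul4BitsQuantizer
        return MatMul4BitsQuantizer
    except ImportError:
        return None
```

This pattern lets code that targets multiple onnxruntime releases keep working across the rename without pinning a version.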

@tianleiwu tianleiwu merged commit 0d26928 into main Apr 19, 2025
86 of 89 checks passed
@tianleiwu tianleiwu deleted the tlwu/rename_nbits_quantizier branch April 19, 2025 06:09
xiaoyu-work pushed a commit to microsoft/Olive that referenced this pull request Apr 23, 2025

## Describe your changes

onnxruntime uses MatMulNBitsQuantizer since 1.22.0, due to
microsoft/onnxruntime#24472

## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.
- [ ] Is this PR including examples changes? If yes, please remember to
update [example
documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md)
in a follow-up PR.

## (Optional) Issue link
ashrit-ms pushed a commit that referenced this pull request Apr 24, 2025
### Description

* Rename filename and class name since it supports 4 and 8 bits.
* Update HQQWeightOnlyQuantizer to support 8 bits.
* Update some comments.

### Motivation and Context
#24384 added 8 bits support
for the default weight only quantizer.
intbf pushed a commit to intbf/onnxruntime that referenced this pull request Apr 25, 2025
…oft#24472)

### Description

* Rename filename and class name since it supports 4 and 8 bits.
* Update HQQWeightOnlyQuantizer to support 8 bits.
* Update some comments.

### Motivation and Context
microsoft#24384 added 8 bits support
for the default weight only quantizer.

Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>
Labels: none yet · Projects: none yet · 4 participants