[WIP]: Clean auto_gptq and auto_awq #2917
base: main
Conversation
Thanks for working on this. LMK when I should take a look.

@BenjaminBossan CI test status. This PR is ready for prelim review. There is not much to it for
BenjaminBossan left a comment
Very nice reduction in complexity, thanks for working on this. Is the transformers PR a prerequisite for this PR to work? Then we should wait for that PR to land first and update the min transformers version here for completeness.
examples/sft/README.md
Outdated
- ## Multi-GPU SFT with LoRA and FSDP for GPTQModel:
- As in [Multi-GPU SFT with LoRA and FSDP](https://github.com/huggingface/peft/blob/main/examples/sft/README.md#multi-gpu-sft-with-lora-and-fsdp), we also support other quantization methods like GPTQModel. You may need to install [GPTQModel](https://github.com/ModelCloud/GPTQModel) > v3.0.0 or from source. Here is the launch command for reference: [run_peft_fsdp_gptq.sh]. For the `--model_name_or_path` argument, it is important to pass a model that is already quantized with GPTQModel, like `"hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"`.
+ ## Multi-GPU SFT with LoRA and FSDP for GPT-QModel:
+ As in [Multi-GPU SFT with LoRA and FSDP](https://github.com/huggingface/peft/blob/main/examples/sft/README.md#multi-gpu-sft-with-lora-and-fsdp), we also support other quantization methods like GPT-QModel. You may need to install [GPT-QModel](https://github.com/ModelCloud/GPTQModel) > v3.0.0 or from source. Here is the launch command for reference: [run_peft_fsdp_gptq.sh]. For the `--model_name_or_path` argument, it is important to pass a model that is already quantized with GPT-QModel, like `"hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"`.
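For readers of the thread, a minimal sketch of the setup the README excerpt above describes. The PyPI package name and the exact invocation are assumptions; the launcher referenced in the README is run_peft_fsdp_gptq.sh.

```bash
# Minimal sketch, assuming the package is published on PyPI as "gptqmodel"
# (package name inferred from the repo name, not confirmed in this thread).
pip install "gptqmodel>3.0.0"   # or build GPT-QModel from source

# Run the launcher referenced in the README. The key requirement noted above
# is that --model_name_or_path inside the script points at a checkpoint that
# is already GPTQ-quantized, e.g.
# "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4".
bash examples/sft/run_peft_fsdp_gptq.sh
```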
Just inquiring: Is > v3.0.0 still valid or should a higher version be recommended?
> Just inquiring: Is > v3.0.0 still valid or should a higher version be recommended?
Yes. We need to bump the min version to v5.4.2 (releasing today). Waiting for the exllama test to flip from FAIL to PASS for the transformers PR so I can be sure there are no other changes required for GPT-QModel, so the release/update can happen.
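Once that release is out, verifying and bumping the installed version would look something like the sketch below; the PyPI package name "gptqmodel" is an assumption based on the repo name.

```bash
# Show the currently installed GPT-QModel version, if any
# (assumes the pip package is named "gptqmodel")
pip show gptqmodel

# Upgrade to the new minimum mentioned above once it is released
pip install -U "gptqmodel>=5.4.2"
```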