Conversation

@LRL2-ModelCloud

No description provided.

@LRL2-ModelCloud changed the title from "Clean auto_gptq and auto_awq" to "[WIP]: Clean auto_gptq and auto_awq" on Nov 20, 2025
@BenjaminBossan (Member)
Thanks for working on this. LMK when I should take a look.

@Qubitium (Contributor) commented Nov 21, 2025

@BenjaminBossan CI test status: this PEFT PR has a sister transformers PR, huggingface/transformers#41567.

This PR is ready for preliminary review. There is not much to it on the PEFT side, as most of the changes live in the transformers PR.

CI passing status using the GPT-QModel main branch:

transformers/tests/quantization/autoawq/test_awq.py:
test_awq.py::AwqTest::test_quantized_model PASSED
test_awq.py::AwqTest::test_quantized_model_bf16 PASSED
test_awq.py::AwqTest::test_quantized_model_conversion PASSED
test_awq.py::AwqTest::test_quantized_model_exllama FAILED <-- Needs fixing. 
test_awq.py::AwqTest::test_quantized_model_multi_accelerator SKIPPED
test_awq.py::AwqTest::test_quantized_model_no_device_map PASSED
test_awq.py::AwqTest::test_save_pretrained PASSED
test_awq.py::AwqTest::test_raise_if_non_quantized PASSED
test_awq.py::AwqTest::test_quantized_model_no_k_proj_quantized PASSED
test_awq.py::AwqScaleTest::test_load_quantized_model PASSED
test_awq.py::AwqIPEXTest::test_quantized_model_ipex PASSED <-- test needs to be renamed to AwqTorchFused (ipex removed)
 
peft/tests/test_gpu_examples.py:
PeftAwqGPUTests PASSED
PeftGPTQGPUTests PASSED

@BenjaminBossan (Member) left a comment

Very nice reduction in complexity, thanks for working on this. Is the transformers PR a prerequisite for this PR to work? If so, we should wait for that PR to land first and update the minimum transformers version here for completeness.

Before:

## Multi-GPU SFT with LoRA and FSDP for GPTQModel:
As in [Multi-GPU SFT with LoRA and FSDP](https://github.com/huggingface/peft/blob/main/examples/sft/README.md#multi-gpu-sft-with-lora-and-fsdp), we also support other quantization methods like GPTQModel. You may need to install [GPTQModel](https://github.com/ModelCloud/GPTQModel) > v3.0.0 or from source. Here is the launch command for reference: [run_peft_fsdp_gptq.sh]. For the `--model_name_or_path` argument, it is important to pass a model that is already quantized with GPTQModel, like `"hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"`.

After:

## Multi-GPU SFT with LoRA and FSDP for GPT-QModel:
As in [Multi-GPU SFT with LoRA and FSDP](https://github.com/huggingface/peft/blob/main/examples/sft/README.md#multi-gpu-sft-with-lora-and-fsdp), we also support other quantization methods like GPT-QModel. You may need to install [GPT-QModel](https://github.com/ModelCloud/GPTQModel) > v3.0.0 or from source. Here is the launch command for reference: [run_peft_fsdp_gptq.sh]. For the `--model_name_or_path` argument, it is important to pass a model that is already quantized with GPT-QModel, like `"hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"`.
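The referenced `run_peft_fsdp_gptq.sh` script is not reproduced in this thread. As a rough illustration only, a launch of this kind typically follows the layout of the peft sft examples; the config file name, script name, and flag values below are assumptions, not the actual contents of that script:

```shell
# Hypothetical sketch of an FSDP + LoRA launch for a GPT-QModel-quantized
# checkpoint; consult run_peft_fsdp_gptq.sh in the peft repo for real flags.
accelerate launch --config_file "configs/fsdp_config.yaml" train.py \
  --model_name_or_path "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4" \
  --use_peft_lora True \
  --lora_r 8 \
  --lora_alpha 16 \
  --output_dir "llama-sft-gptq-lora-fsdp"
```

The key point from the README stands regardless of the exact flags: `--model_name_or_path` must point at a checkpoint that is already quantized, since the FSDP workflow does not quantize on the fly.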
@BenjaminBossan (Member) commented:

Just inquiring: Is > v3.0.0 still valid or should a higher version be recommended?

@Qubitium (Contributor) replied Nov 22, 2025:

> Just inquiring: Is > v3.0.0 still valid or should a higher version be recommended?

Yes, we need to bump the min version to v5.4.2 (releasing today). I am waiting for the exllama test to flip from FAIL to PASS on the transformers PR so I can be sure there are no other changes required for GPT-QModel before the release/update happens.
