Quick Start • Features • Results • Issues • Citation • Docs
Auto-Alignment is a package focusing on scalable and automated alignment methods. We aim to provide the academic community with a series of classic alignment baselines and ready-to-use automated alignment algorithms. This toolkit is designed to facilitate research in the field of LLM alignment.
The core functionalities of the toolkit include:
- Implementation of common alignment operators (e.g., SFT, RM, DPO, GRPO, etc.)
- Implementation of various automatic model alignment algorithms (e.g., CAI, SPIN, RLCD, etc.)
- Efficient model sampling
- Automated model evaluation
- Post-training intervention methods (e.g., Model Averaging, etc.)
This toolkit can seamlessly integrate with popular resource scheduling systems like Slurm.
[2025.7.25] The Liger Kernel has been integrated for Supervised Fine-Tuning (SFT), making it 20%-25% faster than the previous version.
[2025.6.19] We've integrated comprehensive safety evaluation datasets and released the Megatron implementation for large-scale distributed training.
[2025.5.23] AutoAlign has been accepted for presentation at the ACL 2025 Demo Track!
[2025.1.1] AutoAlign now supports Self-Rewarding and Constitutional AI.
[2024.8.23] We are excited to announce the release of AutoAlign v0.0.1! This first version implements a variety of alignment operators, including SFT, RM, and DPO, along with user-friendly evaluation systems and several auto-alignment algorithms (CAI, PCON, and a variant of RLCD), with more algorithms on the way. Stay tuned for future updates!
Default
conda create -n ata python=3.10
conda activate ata
pip install .[train]
# alternatively, use uv for improved installation speed
# pip install uv
# uv pip install .[train]

Evaluation (Optional)
conda create -n ata_eval --clone ata
conda activate ata_eval
pip install .[eval]
bash ./scripts/post_install.sh

To provide the community with out-of-the-box alignment data, we publicly release a collection of SFT data here. This toolkit currently uses the format in data/dummy_sft.json for supervised fine-tuning and the format in data/dummy_dpo.json for the reinforcement learning process. Please refer to these files for more details.
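As a rough illustration only, a conversation-style SFT record often looks like the sketch below. The field names here (`id`, `conversations`, `from`, `value`) are assumptions modeled on common conversation formats; data/dummy_sft.json is the authoritative schema.

```python
# Hypothetical conversation-style SFT record; consult data/dummy_sft.json
# for the toolkit's actual schema.
record = {
    "id": "example-0",
    "conversations": [
        {"from": "human", "value": "What is alignment?"},
        {"from": "gpt", "value": "Aligning model behavior with human intent."},
    ],
}

def validate(rec: dict) -> bool:
    """Check that a record has an id and a non-empty, well-formed turn list."""
    turns = rec.get("conversations", [])
    return bool(rec.get("id")) and len(turns) > 0 and all(
        "from" in t and "value" in t for t in turns
    )

print(validate(record))  # True
```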
autoalign-cli sft \
--model_name_or_path "Qwen/Qwen2.5-7B" \
--data_path "data/dummy_sft.json" \
--bf16 True \
--output_dir "models/qwen2-7b-sft" \
--model_max_length 4096 \
--conv_template_name chatml \
--deepspeed "configs/zero3.json"You can first generate demo dataset by python algorithms/rm/prepare_demo_rm.py
And then run the following command:
autoalign-cli rm --model_name_or_path "meta-llama/Meta-Llama-3-8B-Instruct" \
--data_path data/ultra_binary.jsonl \
--bf16 True \
--eval_path data/eval \
--conv_template_name llama-3-instruct \
--output_dir models/llama3_rm \
--deepspeed configs/zero3.json

autoalign-cli dpo --model_name_or_path "Qwen/Qwen2.5-7B-Instruct" \
--data_path "data/dummy_dpo.json" \
--bf16 True \
--output_dir "saved_models/qwen2-7b-dpo" \
--conv_template_name chatml \
--deepspeed "configs/zero3.json"For detailed scripts configuration, please refer to the megatron.md.
- Setup the environment:
conda activate ata
pip install -e .
bash scripts/train/megatron/env_install.sh

Setting up the environment may take half an hour as it involves compiling Apex, Flash-Attn, and Transformer Engine. Please be patient.
- Convert weights from Hugging Face to Megatron:
bash scripts/train/megatron/convert/qwen2_5/convert_hf_to_mcore.sh

- Preprocess the data:
# For SFT
bash scripts/train/megatron/preprocess/sft_conv.sh
# For DPO
bash scripts/train/megatron/preprocess/dpo_conv.sh

- Run the training:
# For SFT
bash scripts/train/megatron/train/qwen2_5/sft_conv.sh
# For DPO
bash scripts/train/megatron/train/qwen2_5/dpo_conv.sh

- Convert weights from Megatron back to Hugging Face:
# For SFT
bash scripts/train/megatron/convert/qwen2_5/convert_mcore_to_hf.sh
# For DPO
bash scripts/train/megatron/convert/qwen2_5/convert_mcore_to_hf_dpo.sh

autoalign-cli infer --backend "vllm" \
--model-name "Qwen2-0.5B-Instruct" \
--model-path "Qwen/Qwen2-0.5B-Instruct" \
--test-file "data/dummy_sft.json" \
--template "chatml" \
--source "qwen2_0_5b_instruct_dummy"

autoalign-cli serve --checkpoint-path "Qwen/Qwen2.5-7B-Instruct" \
--mode "browser" \
--template "chatml"

autoalign-cli merge --model_paths "psmathur/orca_mini_v3_13b" "WizardLM/WizardLM-13B-V1.2" "garage-bAInd/Platypus2-13B" \
--merged_model_path "merged_model" \
--merging_method "average"

The introduction and scripts for each automated alignment algorithm are stored in the algorithms folder.
Currently, we have implemented the following automated alignment algorithms:
| Algorithm Name | Description |
|---|---|
| rlcd_sys | Distills guiding principles into the model via context distillation with system prompts. |
| pcon | Treats responses from larger models as positive signals and responses from smaller models as negative signals. |
| self-rewarding | Uses the language model itself to provide its own rewards during training via LLM-as-a-Judge prompting. |
| cai | A method for training LLMs to be harmless by using a set of guiding principles and a process of self-improvement through supervised and reinforcement learning. |
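As a side note on the autoalign-cli merge command above: the "average" merging method is element-wise parameter averaging across checkpoints. A minimal sketch over plain state dicts of flattened weight lists (illustrative only, not the toolkit's code):

```python
def average_state_dicts(state_dicts):
    """Element-wise average of parameters with matching names and shapes."""
    if not state_dicts:
        raise ValueError("need at least one state dict")
    merged = {}
    for key in state_dicts[0]:
        values = [sd[key] for sd in state_dicts]
        merged[key] = [sum(col) / len(col) for col in zip(*values)]
    return merged

# Two toy "models", each with one flattened weight vector.
a = {"w": [1.0, 2.0]}
b = {"w": [3.0, 4.0]}
print(average_state_dicts([a, b]))  # {'w': [2.0, 3.0]}
```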
autoalign-cli eval --config-path configs/eval.yaml

You can configure evaluation options in the file eval.yaml. For objective evaluation, the results will be displayed in outputs/{model_id}/ordered_res.txt at the root directory of the repository. For more information, please read docs/eval.md.
Documentation for this toolkit is stored in ./docs/.
| Model | Template Name |
|---|---|
| Llama-2 | llama-2-chat |
| Llama-3/3.1/3.2 | llama-3-instruct |
| Gemma 2 | gemma |
| Qwen1.5 | chatml |
| Qwen2 | chatml |
| Qwen2.5 | chatml |
| Qwen3 | chatml-with-empty-think |
| Mistral v0.1/0.2/0.3 | mistral-instruct / zephyr |
| Yi1.5 | chatml |
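For reference, the chatml template used by the Qwen family wraps each turn in `<|im_start|>` / `<|im_end|>` markers. A simplified rendering sketch (the toolkit's actual template implementation may add default system prompts or a trailing generation prompt):

```python
def render_chatml(messages):
    """Render a list of {"role", "content"} turns in ChatML format."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
])
print(prompt)
```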
| Model | Dataset / Algorithm | MT-Bench | MATH | GSM-8K | HumanEval | MBPP | HumanEval-CN | MBPP-CN | MMLU | GPQA | CMMLU | C-Eval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama-3-8b | Base | -- | 13.28 | 55.5 | 26.83 | 44.2 | 20.12 | 41.2 | x | 10.61 | 40.49 | x |
| Llama-3-8b | Instruct | 7.95 | 25 | 78.62 | 52.44 | 51.2 | 24.39 | 47.8 | 59.27 | 25.76 | 52.06 | 50.93 |
| Llama-3.1-8b | Ultrachat | 6.47 | 10.88 | 52.08 | 29.27 | 33.6 | 16.46 | 30.2 | 55.37 | 26.77 | 47.19 | 43.81 |
| Llama-3.1-8b | Base | -- | 15.24 | 56.41 | 27.44 | 42.8 | 23.78 | 41.8 | x | 12.63 | 45.43 | x |
| Llama-3.1-8b | Instruct | 7.73 | 31.56 | 81.12 | 54.27 | 53 | 39.02 | 50.2 | 59.88 | 28.79 | 49.64 | 48.3 |
| Llama-3.1-8b | Ultrachat | 6.51 | 13.6 | 57.39 | 25.61 | 33.4 | 20.73 | 31.4 | 55.66 | 25.76 | 48.44 | 32.99 |
| Llama-3.2-1b | Infinity-Instruct | -- | 12.66 | 48.07 | 28.66 | 31.2 | 27.44 | 26.6 | 44.8 | x | 33.86 | 34.04 |
| Qwen-2-7b | Base | 5.03 | 41.3 | 79.76 | 61.59 | 51 | 60.37 | 48.4 | 62.4 | 31.31 | 67.72 | 42.66 |
| Qwen-2-7b | Instruct | 8.15 | 25.38 | 81.35 | 51.22 | 48.6 | 61.59 | 24.2 | 64.1 | 31.82 | 62.24 | 46.04 |
| Qwen-2-7b | Ultrachat | 7.34 | 37.98 | 77.41 | 20.73 | 34.6 | 11.59 | 32.8 | 61.35 | 31.31 | 72.23 | 63.18 |
| Qwen-2-7b | rlcd_sys | 7.29 | 20.76 | 52.31 | 35.98 | 36 | 29.88 | 35.4 | 52.89 | 21.21 | 68.98 | 71.35 |
| Qwen-2-7b | pcon | 6.6 | 35.37 | 47.43 | 42.54 | 79.83 | 41.46 | 50.4 | 57.32 | 46.8 | 63.31 | 28.28 |
| Qwen-2.5-7b | Infinity-Instruct | 6.85 | 39.44 | 84.08 | 71.95 | 58.40 | 64.02 | 55.20 | 74.51 | 37.88 | 78.79 | 80.23 |
| Qwen-3-1.7b | Infinity-Instruct | -- | 33.76 | 71.87 | 57.93 | 40.2 | 47.56 | 38.4 | 63.37 | 25.25 | 61.03 | 64.1 |
"--" indicates data unavailable;
"x" indicates inability to obtain valid performance using OpenCompass default prompt.
- Chat-Arena-Hard
- AlignBench
- Representation Engineering
- Tool use integration
If you would like to contribute to this project, please follow these guidelines:
- Fork the repository.
- Install for development:
  pip install -e .[dev]
  pre-commit install
- Create a new branch.
- Make your changes.
- Submit a pull request.
@inproceedings{lu-etal-2025-autoalign,
title = "{A}uto{A}lign: Get Your {LLM} Aligned with Minimal Annotations",
author = "Lu, Xinyu and
Xu, Dong and
Zhang, Chunkang and
Guan, Xinyan and
Wang, Junxiang and
Zhang, Qingyu and
Wang, Pengbo and
Mao, Yingzhi and
Xiang, Hao and
Wen, Xueru and
Li, Zichao and
Lu, Yaojie and
Lin, Hongyu and
Sun, Le and
Han, Xianpei",
editor = "Mishra, Pushkar and
Muresan, Smaranda and
Yu, Tao",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-demo.19/",
pages = "189--198",
ISBN = "979-8-89176-253-4",
}

This project takes inspiration from several other projects, including Llama-Factory, verl, and 360-Llama-Factory. Many thanks to the developers of these projects for their contributions to the community.
This project is licensed under the Apache-2.0 License.

