| | --- |
| | base_model: |
| | - meta-llama/Llama-3.2-3B-Instruct |
| | datasets: |
| | - songff/GenerAlign |
| | language: |
| | - en |
| | license: apache-2.0 |
| | pipeline_tag: text-generation |
| | library_name: transformers |
| | --- |
| | |
| | Pilot-3B is designed to be a draft model in efficient preference alignment of LLMs for its small size while high performance in general domains. It is trained from [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign). |
| |
|
| | Related links: |
| |
|
| | - Paper: [Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding](https://arxiv.org/abs/2506.07434) |
| |
|
| | - Github: [Weak-to-Strong-Decoding](https://github.com/F2-Song/Weak-to-Strong-Decoding) |
| |
|
| | - Dataset: [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign) |
| |
|
| | # ⚠️Caution |
| |
|
| | Pilot-3B is not guaranteed always to provide safe and correct responses. Please use it at your own risk. |
| |
|
| | # Citation |
| | If you find this work useful, please consider citing: |
| | ```bibtex |
| | @misc{song2025well, |
| | title={Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding}, |
| | author={Song, Feifan and Wei, Shaohang and Luo, Wen and Fan, Yuxuan, Liu, Tianyu and Wang, Guoyin and Wang, Houfeng}, |
| | year={2025}, |
| | eprint={2506.07434}, |
| | archivePrefix={arXiv}, |
| | primaryClass={cs.CL} |
| | } |
| | ``` |