|
|
--- |
|
|
base_model: |
|
|
- meta-llama/Llama-3.2-3B-Instruct |
|
|
datasets: |
|
|
- songff/GenerAlign |
|
|
language: |
|
|
- en |
|
|
license: apache-2.0 |
|
|
pipeline_tag: text-generation |
|
|
library_name: transformers |
|
|
--- |
|
|
|
|
|
Pilot-3B is designed to be a draft model in efficient preference alignment of LLMs for its small size while high performance in general domains. It is trained from [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign). |
|
|
|
|
|
Related links: |
|
|
|
|
|
- Paper: [Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding](https://arxiv.org/abs/2506.07434) |
|
|
|
|
|
- Github: [Weak-to-Strong-Decoding](https://github.com/F2-Song/Weak-to-Strong-Decoding) |
|
|
|
|
|
- Dataset: [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign) |
|
|
|
|
|
# ⚠️Caution |
|
|
|
|
|
Pilot-3B is not guaranteed always to provide safe and correct responses. Please use it at your own risk. |
|
|
|
|
|
# Citation |
|
|
If you find this work useful, please consider citing: |
|
|
```bibtex |
|
|
@misc{song2025well, |
|
|
title={Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding}, |
|
|
author={Song, Feifan and Wei, Shaohang and Luo, Wen and Fan, Yuxuan and Liu, Tianyu and Wang, Guoyin and Wang, Houfeng}, |
|
|
year={2025}, |
|
|
eprint={2506.07434}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CL} |
|
|
} |
|
|
``` |