Update README.md
Browse files
README.md
CHANGED
|
@@ -1,3 +1,37 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
datasets:
|
| 4 |
+
- songff/GenerAlign
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
base_model:
|
| 8 |
+
- meta-llama/Llama-3.2-3B-Instruct
|
| 9 |
+
pipeline_tag: text-generation
|
| 10 |
+
---
|
| 11 |
+
|
| 12 |
+
Pilot-3B is designed to be a draft model in efficient preference alignment of LLMs for its small size while high performance in general domains. It is trained from [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign).
|
| 13 |
+
|
| 14 |
+
Related links:
|
| 15 |
+
|
| 16 |
+
- Paper: [Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding](https://arxiv.org/abs/2506.07434)
|
| 17 |
+
|
| 18 |
+
- Github: [Weak-to-Strong-Decoding](https://github.com/F2-Song/Weak-to-Strong-Decoding)
|
| 19 |
+
|
| 20 |
+
- Dataset: [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign)
|
| 21 |
+
|
| 22 |
+
# ⚠️Caution
|
| 23 |
+
|
| 24 |
+
Pilot-3B is not guaranteed always to provide safe and correct responses. Please use it at your own risk.
|
| 25 |
+
|
| 26 |
+
# Citation
|
| 27 |
+
If you find this work useful, please consider citing:
|
| 28 |
+
```bibtex
|
| 29 |
+
@misc{song2025well,
|
| 30 |
+
title={Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding},
|
| 31 |
+
author={Song, Feifan and Wei, Shaohang and Luo, Wen and Fan, Yuxuan and Liu, Tianyu and Wang, Guoyin and Wang, Houfeng},
|
| 32 |
+
year={2025},
|
| 33 |
+
eprint={2506.07434},
|
| 34 |
+
archivePrefix={arXiv},
|
| 35 |
+
primaryClass={cs.CL}
|
| 36 |
+
}
|
| 37 |
+
```
|