songff commited on
Commit
b301f99
·
verified ·
1 Parent(s): efd7465

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +37 -3
README.md CHANGED
@@ -1,3 +1,37 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - songff/GenerAlign
5
+ language:
6
+ - en
7
+ base_model:
8
+ - meta-llama/Llama-3.2-3B-Instruct
9
+ pipeline_tag: text-generation
10
+ ---
11
+
12
+ Pilot-3B is designed to be a draft model in efficient preference alignment of LLMs for its small size while high performance in general domains. It is trained from [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign).
13
+
14
+ Related links:
15
+
16
+ - Paper: [Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding](https://arxiv.org/abs/2506.07434)
17
+
18
+ - Github: [Weak-to-Strong-Decoding](https://github.com/F2-Song/Weak-to-Strong-Decoding)
19
+
20
+ - Dataset: [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign)
21
+
22
+ # ⚠️Caution
23
+
24
+ Pilot-3B is not guaranteed always to provide safe and correct responses. Please use it at your own risk.
25
+
26
+ # Citation
27
+ If you find this work useful, please consider citing:
28
+ ```bibtex
29
+ @misc{song2025well,
30
+ title={Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding},
31
+ author={Song, Feifan and Wei, Shaohang and Luo, Wen and Fan, Yuxuan and Liu, Tianyu and Wang, Guoyin and Wang, Houfeng},
32
+ year={2025},
33
+ eprint={2506.07434},
34
+ archivePrefix={arXiv},
35
+ primaryClass={cs.CL}
36
+ }
37
+ ```