songff
/

Pilot-3B

Text Generation

text-generation-inference

Model card Files Files and versions

Pilot-3B / README.md

songff's picture

Update README.md

1bf4214 verified 8 months ago

|

history blame contribute delete

1.33 kB

	---
	base_model:
	- meta-llama/Llama-3.2-3B-Instruct
	datasets:
	- songff/GenerAlign
	language:
	- en
	license: apache-2.0
	pipeline_tag: text-generation
	library_name: transformers
	---

	Pilot-3B is designed to be a draft model in efficient preference alignment of LLMs for its small size while high performance in general domains. It is trained from [Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign).

	Related links:

	- Paper: [Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding](https://arxiv.org/abs/2506.07434)

	- Github: [Weak-to-Strong-Decoding](https://github.com/F2-Song/Weak-to-Strong-Decoding)

	- Dataset: [GenerAlign](https://huggingface.co/datasets/songff/GenerAlign)

	# ⚠️Caution

	Pilot-3B is not guaranteed always to provide safe and correct responses. Please use it at your own risk.

	# Citation
	If you find this work useful, please consider citing:
	```bibtex
	@misc{song2025well,
	title={Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding},
	author={Song, Feifan and Wei, Shaohang and Luo, Wen and Fan, Yuxuan and Liu, Tianyu and Wang, Guoyin and Wang, Houfeng},
	year={2025},
	eprint={2506.07434},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```