	---
	license: llama3
	pipeline_tag: text-generation
	base_model: Salesforce/LLaMA-3-8B-SFR-SFT-R
	---
	# LLaMA-3-8B-SFR-SFT-R-GGUF
	This is a quantized version of [Salesforce/LLaMA-3-8B-SFR-SFT-R](https://huggingface.co/Salesforce/LLaMA-3-8B-SFR-SFT-R) created using llama.cpp.
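
Because the files here are GGUF files produced by llama.cpp, a quick check after downloading can catch a truncated or mislabeled file before it is loaded. The sketch below is illustrative only and is not part of the original model card. It assumes a little-endian GGUF file of format version 2 or later, where the header is the 4-byte magic `GGUF`, a 32-bit version, and two 64-bit counts. The function name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) for a GGUF file.

    Assumes a little-endian file with GGUF version >= 2 (64-bit counts).
    Raises ValueError if the file is too short or the magic does not match.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<IQQ": little-endian uint32 followed by two uint64 values, no padding.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:HEADER_SIZE])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("model.gguf")` on a downloaded file (the filename is a placeholder, not a file listed in this card) should return a small version number and non-zero counts. This only inspects the header; it does not validate the tensor data or the quantization type.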


	# Model Description
	This is the SFT model for Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R.

	## Model Releases
	- [SFT model](https://huggingface.co/Salesforce/LLaMA-3-8B-SFR-SFT-R)
	- [Reward model](https://huggingface.co/Salesforce/LLaMA-3-8B-SFR-RM-R)
	- [RLHF model](https://huggingface.co/Salesforce/LLaMA-3-8B-SFR-Iterative-DPO-R)


	## Original Model Citation
	Please cite our technical report if you find our model useful for your research or product.

	```bibtex
	@misc{dong2024rlhf,
	title={RLHF Workflow: From Reward Modeling to Online RLHF},
	author={Hanze Dong and Wei Xiong and Bo Pang and Haoxiang Wang and Han Zhao and Yingbo Zhou and Nan Jiang and Doyen Sahoo and Caiming Xiong and Tong Zhang},
	year={2024},
	eprint={2405.07863},
	archivePrefix={arXiv},
	primaryClass={cs.LG}
	}
	```