qgyd2021/reward_model_gpt2_stack_exchange

	---
	license: apache-2.0
	datasets:
	- lvwerra/stack-exchange-paired
	language:
	- en
	library_name: adapter-transformers
	pipeline_tag: text-generation
	tags:
	- reward_model
	---
	## Reward Model GPT2

	[GPT2](https://huggingface.co/gpt2) fine-tuned as a reward model.

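	The model is designed to score responses to questions from the [Stack Exchange](https://huggingface.co/datasets/lvwerra/stack-exchange-paired) domains of programming, mathematics, physics, and more, so that answers humans preferred receive higher rewards. As a reward model, it rates candidate answers rather than generating them.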

	For the training code, see the GitHub [example](https://github.com/huggingface/trl/blob/main/examples/research_projects/stack_llama/scripts/reward_modeling.py).
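
As a rough illustration of the objective used in that kind of script, the sketch below computes a pairwise ranking loss, `-log(sigmoid(r_chosen - r_rejected))`, together with a pairwise accuracy, in plain Python. It assumes this model was trained with that loss; the function names `pairwise_reward_loss` and `pairwise_accuracy` and the sample reward values are hypothetical and not taken from the repository.

```python
import math


def pairwise_reward_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)), computed stably."""
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch to avoid overflow in exp().
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def pairwise_accuracy(reward_pairs) -> float:
    """Fraction of (chosen, rejected) pairs where the chosen reward is strictly higher."""
    correct = sum(1 for chosen, rejected in reward_pairs if chosen > rejected)
    return correct / len(reward_pairs)


# Hypothetical rewards for three (chosen, rejected) answer pairs.
example_pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
mean_loss = sum(pairwise_reward_loss(c, r) for c, r in example_pairs) / len(example_pairs)
accuracy = pairwise_accuracy(example_pairs)
print(f"mean loss: {mean_loss:.4f}, accuracy: {accuracy:.3f}")
```

A model that assigns the same reward to both answers in every pair gets a loss of `ln 2` per pair under this formulation, which is a useful reference point when reading loss values.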

	Training and evaluation results:
	* epoch: 1.0
	* train_loss: 0.641692199903866
	* eval_loss: 0.6299035549163818
	* eval_accuracy: 0.729
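
To put these numbers in context, the sketch below compares them with chance-level baselines. It assumes the losses are pairwise `-log(sigmoid(r_chosen - r_rejected))` values and that `eval_accuracy` is the fraction of evaluation pairs ranked correctly; neither is stated in this card, so treat the comparison as indicative only.

```python
import math

# Values copied from the list above.
reported = {
    "train_loss": 0.641692199903866,
    "eval_loss": 0.6299035549163818,
    "eval_accuracy": 0.729,
}

# Assumption: a model that scores both answers equally has loss ln(2)
# and, on balanced pairs, about 50% accuracy.
chance_loss = math.log(2)
chance_accuracy = 0.5

loss_margin = chance_loss - reported["eval_loss"]
accuracy_margin = reported["eval_accuracy"] - chance_accuracy

print(f"chance loss:       {chance_loss:.4f}")
print(f"eval loss margin:  {loss_margin:.4f} below chance")
print(f"accuracy margin:   {accuracy_margin:.3f} above chance")
```

Under these assumptions, the evaluation loss sits only slightly below the chance-level loss, while the accuracy is well above 50%.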