OpenRLHF/Mistral-7b-PRM-Math-Shepherd

Process Reward Model trained by OpenRLHF

- Dataset: Math-Shepherd (https://huggingface.co/datasets/peiyi9979/Math-Shepherd)
- Learning Rate: 1e-6
- Training Accuracy: 0.922
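
The card does not say how the training accuracy is computed. As a rough illustration only, the sketch below assumes it is a step-level metric: each reasoning step carries a reference label (`+` for a good step, `-` for a bad one, which is an assumption about the Math-Shepherd label format), and a step counts as correct when the model's thresholded score agrees with that label. The function name `step_accuracy`, the 0.5 threshold, and the sample scores are hypothetical and are not taken from OpenRLHF.

```python
from typing import Sequence


def step_accuracy(
    good_step_probs: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,
) -> float:
    """Return the fraction of steps whose predicted label matches the reference.

    good_step_probs: hypothetical per-step probabilities that a step is good.
    labels: reference labels, "+" (good step) or "-" (bad step); assumed format.
    threshold: assumed cut-off above which a step is predicted as good.
    """
    if len(good_step_probs) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one step is required")

    correct = 0
    for prob, label in zip(good_step_probs, labels):
        predicted = "+" if prob >= threshold else "-"
        correct += predicted == label
    return correct / len(labels)


# Made-up scores for a four-step solution; the last step is misclassified.
probs = [0.91, 0.75, 0.20, 0.60]
labels = ["+", "+", "-", "-"]
acc = step_accuracy(probs, labels)
print(f"step accuracy: {acc:.3f}")
```

Under this reading, a reported value of 0.922 would mean that roughly 92% of labeled steps in the training data were classified in agreement with their reference labels.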