ConicCat/Apriel-R1P


	---
	license: mit
	base_model:
	- ServiceNow-AI/Apriel-Nemotron-15b-Thinker
	new_version: ConicCat/Apriel-R1PV.2-NoThink
	---
	Quick and dirty roleplay finetune of Apriel, using an improved dataset produced by scoring all replies with a reward model, then discarding any reply that scored below 5/5.
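The filtering step can be sketched roughly as follows. This is an illustration only, not the actual pipeline: the `Reply` type, `filter_by_reward`, and the lookup-table scorer are hypothetical stand-ins, and a real run would call the reward model where `toy_scores` is used here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Reply:
    """One candidate roleplay reply (hypothetical structure)."""
    prompt: str
    text: str


def filter_by_reward(
    replies: Iterable[Reply],
    score_fn: Callable[[Reply], int],
    max_score: int = 5,
) -> List[Reply]:
    """Keep only replies that get the top score; drop everything below it."""
    return [reply for reply in replies if score_fn(reply) >= max_score]


# Stand-in for a reward model: a fixed lookup of made-up scores on a 1-5 scale.
toy_scores = {"great reply": 5, "okay reply": 4, "bad reply": 1}


def toy_score(reply: Reply) -> int:
    return toy_scores[reply.text]


candidates = [Reply("hello", text) for text in toy_scores]
kept = filter_by_reward(candidates, toy_score)
print([reply.text for reply in kept])
```

Keeping only top-scored replies trades dataset size for quality, which fits a small, quick finetune.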

	Tried to filter for impersonation as well, but Llama 8B was too stupid.

	Seems to like a really low temperature (~0.4) and a touch of DRY (0.8).
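As a starting point, those sampler settings could be written down like this. Two assumptions here: the key names `temperature` and `dry_multiplier` follow a common backend convention and may differ in your inference stack, and reading "DRY 0.8" as the DRY multiplier is an interpretation, not something the card states.

```python
import json


def build_sampler_settings(temperature: float = 0.4, dry_multiplier: float = 0.8) -> dict:
    """Return a sampler settings dict; key names are assumed, check your backend."""
    return {
        "temperature": temperature,        # low temperature, as suggested above
        "dry_multiplier": dry_multiplier,  # assumed meaning of "DRY 0.8"
    }


settings = build_sampler_settings()
print(json.dumps(settings, indent=2))
```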

	Uses a [super funky](https://huggingface.co/ConicCat/Apriel-R1P/blob/main/R1P.json) variant of the Phi template because that's what the model seems to like best, even though I tuned it on Mistral.