---
license: mit
base_model:
- ServiceNow-AI/Apriel-Nemotron-15b-Thinker
new_version: ConicCat/Apriel-R1PV.2-NoThink
---
Quick and dirty roleplay finetune of Apriel, using an improved dataset produced by scoring all replies with a reward model, then discarding any reply that scored below 5/5.
Tried to filter for impersonation as well, but Llama 8B was too stupid.
Seems to like a really low temp (~0.4) and a touch of DRY (0.8).
Uses a [super funky](https://huggingface.co/ConicCat/Apriel-R1P/blob/main/R1P.json) variant of the Phi template because that's what the model seems to like best, even though I tuned it on Mistral.
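
Rough sketch of the score filter described above, not the actual script used for this model. The `ScoredReply` record, the `keep_top_scored` helper, and the 1 to 5 integer scale are all assumptions for illustration; the real reward model and data format aren't specified here.

```python
from dataclasses import dataclass


@dataclass
class ScoredReply:
    prompt: str
    reply: str
    score: int  # reward-model score, assumed to be an integer from 1 to 5


def keep_top_scored(replies, max_score=5):
    """Keep only replies that received the maximum score (hypothetical helper)."""
    return [r for r in replies if r.score >= max_score]


# Toy data standing in for a scored roleplay dataset.
samples = [
    ScoredReply("Greet the traveler.", "Well met, stranger!", 5),
    ScoredReply("Greet the traveler.", "hi", 3),
    ScoredReply("Describe the tavern.", "Smoke curls under low beams.", 5),
    ScoredReply("Describe the tavern.", "It is a tavern.", 4),
]

kept = keep_top_scored(samples)
print(f"kept {len(kept)} of {len(samples)} replies")
```

A hard cutoff like this throws away most of the data, so it only makes sense when the starting dataset is big enough to survive it.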