| --- |
| license: mit |
| base_model: |
| - ServiceNow-AI/Apriel-Nemotron-15b-Thinker |
| new_version: ConicCat/Apriel-R1PV.2-NoThink |
| --- |
| Quick and dirty roleplay finetune of Apriel, using an improved dataset produced by scoring all replies with a reward model, then discarding every reply that scored below 5/5. |
|
|
| Tried to filter out impersonation as well, but Llama 8B was not smart enough to detect it. |
|
|
| Seems to like a really low temperature (~0.4) and a touch of DRY (0.8). |
|
|
| Uses a [super funky](https://huggingface.co/ConicCat/Apriel-R1P/blob/main/R1P.json) variant of the Phi template because that's what the model seems to like best, even though I tuned it on Mistral. |
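
For anyone curious what the score-then-discard step on the dataset could look like, here is a minimal sketch. It is not the actual pipeline used for this model: `keep_top_scored`, `toy_score`, and the 1-5 scale are all assumptions made for illustration, and `toy_score` is only a stand-in for a real reward model.

```python
from typing import Callable, Iterable, List

MAX_SCORE = 5  # assumption: reward scores are on a 1-5 scale, as "5/5" suggests


def keep_top_scored(
    replies: Iterable[str],
    score_fn: Callable[[str], int],
    threshold: int = MAX_SCORE,
) -> List[str]:
    """Keep only the replies whose score is at least `threshold`."""
    return [reply for reply in replies if score_fn(reply) >= threshold]


def toy_score(reply: str) -> int:
    # Hypothetical stand-in for a reward model:
    # one point plus one per word, capped at MAX_SCORE.
    return min(MAX_SCORE, 1 + len(reply.split()))


if __name__ == "__main__":
    sample = ["Hi.", "The innkeeper slides a key across the bar."]
    # Only replies that reach the top score survive the filter.
    print(keep_top_scored(sample, toy_score))
```

In practice `score_fn` would wrap whatever reward model is used; keeping it as a plain callable means the filter itself does not depend on any particular model or library.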