Model

  • Developed by: DARJYO
  • Base type: Fine-tuned language model
  • Fine-tuned model: persadian_14B-GRPO
  • Base architecture: Transformer-based (Phi-4)

This model was fine-tuned with Unsloth and Hugging Face's TRL library. It is based on the unsloth/Phi-4 model and uses reinforcement learning (GRPO) to improve performance.
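A minimal inference sketch with the `transformers` library, assuming the model is hosted on the Hugging Face Hub under `DARJYO/persadian_14B-GRPO` (the prompt below is illustrative; a 14B model needs roughly 28 GB of memory in fp16, so pass a quantization config or use a GGUF build on smaller hardware):

```python
# Inference sketch for DARJYO/persadian_14B-GRPO (assumed Hub repo id).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "DARJYO/persadian_14B-GRPO"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return the text generated after the prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # requires `accelerate`; spreads layers across devices
        torch_dtype="auto",  # use the dtype stored in the checkpoint
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate("Explain reinforcement learning in one sentence."))
```

The chat template stored with the checkpoint (via `tokenizer.apply_chat_template`) is the safer entry point for conversational use; the plain-prompt call above is the simplest possible smoke test.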


Model tree for DARJYO/persadian_14B-GRPO

  • Base model: microsoft/phi-4
  • Finetuned: unsloth/phi-4
  • Quantized: 31 models
  • This model's quantizations: 1 model