metadata
datasets:
- openai/gsm8k
metrics:
- accuracy
base_model:
- unsloth/phi-4
new_version: ykarout/Phi4-ThinkMode
library_name: transformers
language:
- en
tags:
- GRPO
Phi4-ThinkMode
This is a fine-tuned version of unsloth/Phi-4 with enhanced reasoning capabilities using GRPO (1000 step) on the dataset gsm8k
Model details
- Base model: unsloth/Phi-4
- Fine-tuning: 16-bit precision
- Use case: Improved reasoning and thinking mode