Update README.md

e03d788 verified about 1 year ago

444 Bytes

datasets:
  - openai/gsm8k
metrics:
  - accuracy
base_model:
  - unsloth/phi-4
new_version: ykarout/Phi4-ThinkMode
library_name: transformers
language:
  - en
tags:
  - GRPO

Phi4-ThinkMode

This is a fine-tuned version of unsloth/Phi-4 with enhanced reasoning capabilities using GRPO (1000 step) on the dataset gsm8k

Model details

Base model: unsloth/Phi-4
Fine-tuning: 16-bit precision
Use case: Improved reasoning and thinking mode