---
license: mit
base_model: microsoft/Phi-4-reasoning-plus
tags:
- phi-4
- math
- reasoning
- fine-tuned
- lora
- unsloth
library_name: transformers
pipeline_tag: text-generation
---

# Phi-4 Reasoning Plus - Math SFT

This model is a supervised fine-tuned (SFT) version of [microsoft/Phi-4-reasoning-plus](https://huggingface.co/microsoft/Phi-4-reasoning-plus) for mathematical reasoning tasks. It was trained on 30k problems drawn from AIME, the NuminaMath dataset, and various other sources.

## Training Details

- **Base Model**: microsoft/Phi-4-reasoning-plus
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) with Unsloth
- **LoRA Config**: r=512, alpha=512
- **Target Modules**: lm_head, o_proj, v_proj, up_proj, down_proj, k_proj, q_proj, gate_proj, embed_tokens
- **Precision**: bfloat16
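
To give a feel for what `r=512, alpha=512` implies, here is a minimal sketch of the arithmetic for a single adapted linear layer. It assumes the standard LoRA scaling of `alpha / r` (not a rank-stabilized variant) and a hypothetical square projection of width 5120; both are illustrative assumptions and are not taken from the training run itself.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (assumed: alpha / r)."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


R, ALPHA = 512, 512   # values from the LoRA config listed above
HIDDEN = 5120         # ASSUMPTION: illustrative layer width, not from the model card

scaling = lora_scaling(ALPHA, R)
added = lora_param_count(HIDDEN, HIDDEN, R)
full = HIDDEN * HIDDEN

print(f"scaling factor: {scaling}")
print(f"adapter params for one {HIDDEN}x{HIDDEN} layer: {added:,}")
print(f"fraction of the full weight matrix: {added / full:.2f}")
```

With `alpha` equal to `r`, the update is applied at a scaling of 1.0. At rank 512 the adapter for a layer of this assumed size is a sizeable fraction of the full matrix, far larger than the small ranks often used with LoRA.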

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "pragnyanramtha/phi-4-math-rplus",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("pragnyanramtha/phi-4-math-rplus")

# For math problems
messages = [
    {"role": "system", "content": "You are a helpful math assistant. Solve problems step by step."},
    {"role": "user", "content": "What is the sum of the first 100 positive integers?"},
]

# add_generation_prompt=True appends the assistant turn marker so the model starts its answer
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# Set do_sample=True explicitly so that the temperature setting is applied
outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))