CaptainHPY/Qwen2.5-7B-R1

CaptainHPY committed on Sep 17, 2025

Commit 2c94e93 (verified) · 1 parent: 0620538

Update README.md

Files changed (1)

README.md +1 -1

@@ -20,7 +20,7 @@ It has been trained using [TRL](https://github.com/huggingface/trl), [unsloth](h
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/captainhpy-beijing-university-of-technology/tiny-reasoning/runs/9jaq2zri)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/captainhpy-beijing-university-of-technology/tiny-reasoning/runs/6zmbkin8)
 - This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
 - Dataset: [unsloth/OpenMathReasoning-mini](https://huggingface.co/datasets/unsloth/OpenMathReasoning-mini)