CaptainHPY committed on
Commit 2c94e93 · verified · 1 Parent(s): 0620538

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ It has been trained using [TRL](https://github.com/huggingface/trl), [unsloth](h
 
  ## Training procedure
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/captainhpy-beijing-university-of-technology/tiny-reasoning/runs/9jaq2zri)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/captainhpy-beijing-university-of-technology/tiny-reasoning/runs/6zmbkin8)
 
  - This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
  - Dataset: [unsloth/OpenMathReasoning-mini](https://huggingface.co/datasets/unsloth/OpenMathReasoning-mini)