ryzax/1.5B-v103

@@ -27,7 +27,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/muennighoff/s2/runs/tzdalny8)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/muennighoff/s2/runs/m0kkaswh)
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6cc5be85c86f5667d6e31346065045ccd872fb2e375a6f51bd91554207bd64b5
+oid sha256:9ce5e1a862c8ca6cac6d2273ec1620ed6fa7f4800ab6e17b89f836f7c8f44c4d
 size 3554214752

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:222107e2aac7bfb6743088e4698a49f0121fb4fee1b3b948457030c5046bd917
+oid sha256:7957f8883b2ea2592578b066bbbc43fb4b0909db8a1809a982cb67e745cd4319
 size 9041