AlgoCore/support-ticket-env
Vighnesh committed on Apr 26

Commit 7bdf1e0 (1 parent: b523c77)

add wandb training logs link

Files changed (1)

Blog.md

@@ -125,6 +125,8 @@ Task 3 more than tripled its score, demonstrating that even a 0.5B model can lea
 Overall, the agent nearly doubled its score from 0.289 to 0.569 — a **+0.280 improvement purely from GRPO with no supervised labels**.
+- **Training Logs:** [WandB Run](https://wandb.ai/vighneshdev1990-/support-ticket-grpo/runs/33q716zb)
 ---
 ## What Made It Work