Vighnesh committed on
Commit · 7bdf1e0
1 Parent(s): b523c77
add wandb training logs link
Blog.md CHANGED
@@ -125,6 +125,8 @@ Task 3 more than tripled its score, demonstrating that even a 0.5B model can lea
 
 Overall, the agent nearly doubled its score from 0.289 to 0.569 — a **+0.280 improvement purely from GRPO with no supervised labels**.
 
+- **Training Logs:** [WandB Run](https://wandb.ai/vighneshdev1990-/support-ticket-grpo/runs/33q716zb)
+
 ---
 
 ## What Made It Work