Add training script notes to blog
blog/blog.md (changed, +11 -0)
@@ -26,6 +26,17 @@ The first plain-RL rubric looked promising on paper, but the agent learned a che
 
 The useful jump came from changing the recipe: first teach the model successful trajectories with SFT, then run GRPO on that LoRA. With the updated rubric, the SFT-warm-started agent spends less time gaming the interface and more time doing the real job: find the auth bug, patch it, and keep valid behavior alive.
 
+The plot is backed by repeatable scripts, not a one-off notebook. `scripts/modal_train_sft.py` trains the warm-start LoRA on verified trajectories, `scripts/modal_train_grpo.py` continues from that adapter with live OpenEnv rewards, and `scripts/launch_reward_ablations.ps1` launches comparable rubric trials using the YAML configs in `training/configs/reward_ablations/`.
+
+The core handoff is intentionally simple:
+
+```bash
+uv run --extra modal modal run --detach scripts/modal_train_sft.py --push-to-hub --detach
+uv run --extra modal modal run --detach scripts/modal_train_grpo.py \
+--initial-adapter-repo-id Humanlearning/CyberSecurity_OWASP-unsloth-gemma-4-e2b-it-sft-lora \
+--max-steps 300 --difficulty 0 --trace-log-every 10 --detach
+```
+
 ## Why OWASP A01?
 
 The first target is **OWASP A01:2025 - Broken Access Control**.
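Reviewer sketch (not part of the commit): the SFT-to-GRPO handoff described in the added lines can be previewed as a dry run that only prints the two commands in order. The command text and flags are copied from the diff. The `preview` helper and the `ADAPTER_REPO`, `sft_cmd`, and `grpo_cmd` variable names are hypothetical and exist only for this illustration. Nothing here invokes Modal.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints the two training commands from the diff, never runs them.
set -euo pipefail

# Adapter repo id copied from the diff; the variable name is illustrative.
ADAPTER_REPO="Humanlearning/CyberSecurity_OWASP-unsloth-gemma-4-e2b-it-sft-lora"

# Hypothetical helper: join its arguments into one printable command line.
preview() {
  printf '%s\n' "$*"
}

# Step 1: train the warm-start LoRA (flags as written in the diff).
sft_cmd="$(preview uv run --extra modal modal run --detach \
  scripts/modal_train_sft.py --push-to-hub --detach)"

# Step 2: continue from that adapter with GRPO (flags as written in the diff).
grpo_cmd="$(preview uv run --extra modal modal run --detach \
  scripts/modal_train_grpo.py \
  --initial-adapter-repo-id "$ADAPTER_REPO" \
  --max-steps 300 --difficulty 0 --trace-log-every 10 --detach)"

printf 'step 1 (SFT):  %s\n' "$sft_cmd"
printf 'step 2 (GRPO): %s\n' "$grpo_cmd"
```

Keeping the repo id in one variable makes the dependency explicit: step 2 names the adapter through `--initial-adapter-repo-id`, so, assuming the GRPO script loads the adapter from that repo, it should be launched only once the adapter is available there.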