Update blog with storytelling introduction and remove slides.pdf
- README.md +1 -3
- docs/blog.md +2 -0
- docs/slides.pdf +0 -0
README.md
CHANGED
@@ -29,7 +29,6 @@ Humans cannot watch every alert in a Security Operations Center 24/7, and as str
 | **Trained model** → [`shivam2k3/opensoc-defender-grpo`](https://huggingface.co/shivam2k3/opensoc-defender-grpo) | GRPO-trained Qwen2.5-3B-Instruct LoRA defender adapter. |
 | **Training notebook** → [`train_grpo.ipynb`](train_grpo.ipynb) | End-to-end SFT warm-start + GRPO curriculum using Unsloth + TRL. |
 | **Mini-blog** → [`docs/blog.md`](docs/blog.md) | ~600-word write-up of the project. |
-| **Slide deck** → [`docs/slides.pdf`](docs/slides.pdf) | 5 slides; problem → env → results → demo. |
 
 ## Table of contents
 
@@ -242,7 +241,7 @@ Mapped to the four judging criteria:
 | Criterion | Weight | Where it lives |
 | --- | ---: | --- |
 | Environment Innovation | 40% | `openenv.yaml`, `schema.py`, `verifier.py`, `env.py`, this README's *Architecture* and *Why the reward cannot be hacked* sections |
-| Storytelling & Presentation | 30% | `/demo` Gradio UI + 90s video + HF blog
+| Storytelling & Presentation | 30% | `/demo` Gradio UI + 90s video + HF blog |
 | Showing Improvement in Rewards | 20% | `eval/results/*.png` (training curves + confusion + headline bar) embedded above |
 | Reward & Training Pipeline | 10% | `rubric.py` + 93-test anti-hack suite + `train_grpo.ipynb` + `scripts/run_full_pipeline.sh` |
 
@@ -257,7 +256,6 @@ Submission checklist:
 - [x] Eval harness + plotters
 - [x] Pytest suite (93 tests, anti-hack regressions included)
 - [x] Gradio `/demo` UI mounted on the same Space (free-CPU-tier compatible)
-- [x] 5-slide PDF deck (`docs/slides.pdf`)
 - [x] Blog post (`docs/blog.md`)
 - [x] HF Space pushed and **running**: [`shivam2k3/opensoc-env`](https://huggingface.co/spaces/shivam2k3/opensoc-env)
 - [x] SFT adapter trained and pushed: [`opensoc-defender-grpo-sft`](https://huggingface.co/shivam2k3/opensoc-defender-grpo-sft)
docs/blog.md
CHANGED
@@ -4,6 +4,8 @@
 
 ---
 
+As we all know Anthropic has a new model called Mythos. It is claimed that Mythos has found a lot of bugs in open source projects. As LLMs improve the threat of someone using it for malicious attacks increases exponentially. Also a lot of improvements in LLMs have happened due to Reinforcement Learning. Our goal was to have a multi agents setup with an attacker and defender. They learn from each other. In this way we can improve security and always be prepared in case of an attack.
+
 ## Why we built this
 
 By the time a security operations center (SOC) tier-1 analyst sees an
docs/slides.pdf
DELETED
Binary file (85.2 kB)