Update blog with storytelling introduction and remove slides.pdf

Files changed (3)

README.md +1 -3
docs/blog.md +2 -0
docs/slides.pdf +0 -0

README.md CHANGED

@@ -29,7 +29,6 @@ Humans cannot watch every alert in a Security Operations Center 24/7, and as str
 | **Trained model** — [`shivam2k3/opensoc-defender-grpo`](https://huggingface.co/shivam2k3/opensoc-defender-grpo) | GRPO-trained Qwen2.5-3B-Instruct LoRA defender adapter. |
 | **Training notebook** — [`train_grpo.ipynb`](train_grpo.ipynb) | End-to-end SFT warm-start + GRPO curriculum using Unsloth + TRL. |
 | **Mini-blog** — [`docs/blog.md`](docs/blog.md) | ~600-word write-up of the project. |
-| **Slide deck** — [`docs/slides.pdf`](docs/slides.pdf) | 5 slides; problem → env → results → demo. |
 ## Table of contents
@@ -242,7 +241,7 @@ Mapped to the four judging criteria:
 | Criterion | Weight | Where it lives |
 | --- | ---: | --- |
 | Environment Innovation | 40% | `openenv.yaml`, `schema.py`, `verifier.py`, `env.py`, this README's *Architecture* and *Why the reward cannot be hacked* sections |
-| Storytelling & Presentation | 30% | `/demo` Gradio UI + 90s video + HF blog + 5-slide deck (`docs/slides.pdf`) |
+| Storytelling & Presentation | 30% | `/demo` Gradio UI + 90s video + HF blog |
 | Showing Improvement in Rewards | 20% | `eval/results/*.png` (training curves + confusion + headline bar) embedded above |
 | Reward & Training Pipeline | 10% | `rubric.py` + 93-test anti-hack suite + `train_grpo.ipynb` + `scripts/run_full_pipeline.sh` |
@@ -257,7 +256,6 @@ Submission checklist:
 - [x] Eval harness + plotters
 - [x] Pytest suite (93 tests, anti-hack regressions included)
 - [x] Gradio `/demo` UI mounted on the same Space (free-CPU-tier compatible)
-- [x] 5-slide PDF deck (`docs/slides.pdf`)
 - [x] Blog post (`docs/blog.md`)
 - [x] HF Space pushed and **running**: [`shivam2k3/opensoc-env`](https://huggingface.co/spaces/shivam2k3/opensoc-env)
 - [x] SFT adapter trained and pushed: [`opensoc-defender-grpo-sft`](https://huggingface.co/shivam2k3/opensoc-defender-grpo-sft)

docs/blog.md CHANGED

@@ -4,6 +4,8 @@
 ---
+As many readers will know, Anthropic has a new model called Mythos, which is claimed to have found many bugs in open-source projects. As LLMs improve, the threat of someone using them for malicious attacks grows as well. Many recent improvements in LLMs have also come from reinforcement learning. Our goal was to build a multi-agent setup in which an attacker and a defender learn from each other. In this way we can improve security and stay prepared in case of an attack.
 ## Why we built this
 By the time a security operations center (SOC) tier-1 analyst sees an

docs/slides.pdf DELETED

Binary file (85.2 kB)