| ## CommitGuard: project context (load this first) | |
| This file is the **single source of truth for agents**. It compresses `../prd.md` into must-know facts so you can make correct decisions at 3 AM. | |
| If you're unsure: re-read `../prd.md` and then update this file to match. | |
| ## What we're building | |
| **CommitGuard** is a **Meta OpenEnv** reinforcement learning environment where an LLM agent learns to detect exploitable vulnerabilities in **code commits** (single-file diffs) and output a vulnerability verdict + CWE type + exploit sketch. | |
| The environment runs as an **HTTP server (FastAPI in Docker)**, hosted on **Hugging Face Spaces**. Training runs with **TRL GRPO + Unsloth** on **Llama-3.2-3B-Instruct**, using verifiable rewards from dataset ground truth (RLVR). | |
| ## Why this matters (the thesis) | |
| AI writes code at AI speed. Security review still runs on human cycles. Offense can now scale with the same LLM tooling. **We're building the RL environment that trains AI-paced commit-time security review.** | |
| ## Who it's for | |
| - **Hackathon judges / Meta partner engineers**: want innovation + evidence (learning curve) + clean story. | |
| - **Meta researchers**: want RLVR framing, cheating-prevention, and extensibility. | |
| - **HF community**: wants a runnable Space + reproducible training notebook. | |
| ## 30-second pitch (verbatim; memorize) | |
| > "AI is now writing production code at AI speed. Security review still runs on a 6-month human cycle. The same LLMs that write the code can attack it: defense is on human time, offense is on AI time, and that asymmetry breaks the security model. | |
| > | |
| > CommitGuard is an OpenEnv where an agent learns to flag exploitable diffs at commit time. We trained Llama-3.2-3B on it via GRPO and the detection rate climbs measurably. It's RLVR: verifiable rewards from ground truth, not LLM judges. The thesis: continuous AI red-teaming at the velocity code is being shipped. This is the environment to train it." | |
| ## Locked stack (do not change) | |
| - **Env framework**: Meta OpenEnv **0.2.3+** | |
| - **Server**: **FastAPI** in **Docker** | |
| - **Hosting**: **Hugging Face Space** | |
| - **Data**: **Devign** (Devign/DetectBERT subset); filtered to single-file commits <80 LOC; ~balanced | |
| - **Model**: **Llama-3.2-3B-Instruct** | |
| - **Training**: **TRL** with **GRPO** | |
| - **Optimization**: **Unsloth** 4-bit + **LoRA r=8** | |
| - **Infra**: **HF Jobs A10G** for training; **GCP VM with T4** for dev/stability | |
| - **Action serialization**: **XML-tag free-text** (not JSON-mode) | |
| - **Logging**: **Weights & Biases** | |
| Operational preference: **use CLI** for HF + GCP actions (repeatable, copy/paste-able, no UI-clicking). | |
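Because actions are XML-tag free text rather than JSON, the server needs a tolerant parser that survives stray prose around the tags. A minimal sketch follows, assuming the tag names `<verdict>`, `<cwe>`, and `<exploit>` and a hypothetical `ParsedAction` type; none of these names come from `../prd.md`, so treat them as placeholders for the real schema.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedAction:
    # Hypothetical container; field names are illustrative only.
    verdict: Optional[bool]   # True = vulnerable, False = safe, None = unparseable
    cwe: Optional[str]        # normalized, e.g. "CWE-119"
    exploit: Optional[str]    # free-text exploit sketch


def _extract(tag: str, text: str) -> Optional[str]:
    """Return the stripped body of the first <tag>...</tag> pair, if any."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else None


def parse_action(text: str) -> ParsedAction:
    """Parse model output; unknown or missing fields become None instead of raising."""
    raw_verdict = _extract("verdict", text)
    verdict = None
    if raw_verdict is not None:
        lowered = raw_verdict.lower()
        if lowered in ("vulnerable", "yes", "true"):
            verdict = True
        elif lowered in ("safe", "no", "false"):
            verdict = False

    raw_cwe = _extract("cwe", text)
    cwe = None
    if raw_cwe:
        number = re.search(r"(?:CWE-)?(\d+)", raw_cwe, flags=re.IGNORECASE)
        cwe = f"CWE-{number.group(1)}" if number else None

    return ParsedAction(verdict=verdict, cwe=cwe, exploit=_extract("exploit", text))
```

Returning `None` for malformed fields (rather than raising) lets the reward code decide how to score an unparseable action.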
| ## Submission deliverables (P0) | |
| - **HF Space** deployed; `/health` returns 200; `/docs` works | |
| - **Training notebook / script** produces a measurable learning curve (or triggers fallback) | |
| - **Plots** committed (reward curve + baseline vs trained) | |
| - **Demo video** (60-90s) showing before/after behavior on one example | |
| - **README** with all required links (Space, notebook, video, repo, wandb) | |
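The first deliverable (`/health` returns 200, `/docs` works) can be checked from a script instead of a browser. This is a sketch only: `fetch_status` and `smoke_check` are hypothetical helper names, and the base URL is whatever the deployed Space exposes.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code.
        return err.code
    except OSError:
        # Covers URLError (DNS, refused connection) and socket timeouts.
        return None


def smoke_check(base_url, fetch=fetch_status):
    """Check the two endpoints named in the deliverables list."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in ("/health", "/docs")}
```

The `fetch` parameter is injectable so the check can be exercised without network access; in normal use, call `smoke_check(<your Space URL>)` and confirm both values are `True`.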
| ## Hard constraints (time + scope) | |
| - **Deadline**: Sunday **5:00 PM IST** (non-negotiable) | |
| - **Scope freeze**: **midnight Saturday (00:00 IST)**; after this, no new features | |
| - **Episode constraints**: max **5 steps** per episode; context requests cost reward | |
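The episode constraint combines a step cap with a cost for context requests. One way this could look, assuming a binary verdict reward, a flat per-request penalty of `0.1`, and action kinds named `"context"` and `"verdict"` (all three are illustrative assumptions; the real values and action names are defined in `../prd.md`):

```python
MAX_STEPS = 5            # from the episode constraint above
CONTEXT_PENALTY = 0.1    # assumed value, not taken from the PRD


def episode_reward(predicted_vulnerable, label_vulnerable, context_requests):
    """Binary verdict reward minus a per-request context cost (illustrative)."""
    correct = 1.0 if predicted_vulnerable == label_vulnerable else 0.0
    return correct - CONTEXT_PENALTY * context_requests


def run_episode(actions, label_vulnerable):
    """Consume up to MAX_STEPS actions and stop at the first verdict.

    Each action is a (kind, payload) pair. Kinds other than "context" and
    "verdict" use up a step but do not change the reward. An episode that
    never reaches a verdict earns only the accumulated context penalty.
    """
    context_requests = 0
    for kind, payload in actions[:MAX_STEPS]:
        if kind == "context":
            context_requests += 1
        elif kind == "verdict":
            return episode_reward(payload, label_vulnerable, context_requests)
    return -CONTEXT_PENALTY * context_requests
```

Under these assumptions, an agent that asks for context once and then answers correctly scores slightly below one that answers correctly straight away, which is the intended pressure against padding episodes with context requests.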
| ## Explicit non-goals (do not drift) | |
| - Not a production CI security tool; **research environment only** | |
| - No real exploit execution sandbox in v1 (pattern match only) | |
| - No multi-file / repo-level reasoning in v1 (single-file commits, <=80 LOC) | |
| - No multi-agent self-play in v1 | |
| - No network/runtime attacks, no social engineering | |
| - No attempt to cover all CWEs: v1 focuses on the **top 10 CWEs** in Devign | |
| - No fancy frontend: HF Space default UI is enough | |
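The single-file, <=80 LOC scope limit above can be enforced as a dataset filter. A rough sketch, assuming each sample is a git-style unified diff and that "LOC" means added plus removed lines; both are assumptions, and the function names are hypothetical:

```python
MAX_CHANGED_LINES = 80  # matches the "<=80 LOC" limit in the non-goals list


def count_files(diff_text):
    """Count files in a git-style unified diff by its 'diff --git' headers."""
    return sum(1 for line in diff_text.splitlines() if line.startswith("diff --git "))


def count_changed_lines(diff_text):
    """Count added/removed lines, skipping the '+++' / '---' file headers.

    Caveat: a removed line whose content itself starts with '--' looks like
    a header and is skipped, so this slightly undercounts in rare cases.
    """
    changed = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            changed += 1
    return changed


def in_scope(diff_text):
    """True if the diff touches exactly one file and stays within the line limit."""
    return count_files(diff_text) == 1 and count_changed_lines(diff_text) <= MAX_CHANGED_LINES
```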
| ## If something breaks: pre-approved fallbacks (no debate) | |
| These are legal pivots from `../prd.md` section 7.2. If a trigger fires, switch immediately and log it in `decision_log.md`. | |
| - **OOM on Llama-3.2-3B on A10G** → use **Qwen2.5-1.5B-Instruct** (trigger: first test step crashes) | |
| - **HF Jobs queue > 30 min** → use **GCP A10G on-demand** | |
| - **3-action env not shipped by midnight** → ship **2-action env** (analyze + verdict) | |
| - **Tiered reward buggy** → ship **binary reward only** | |
| - **Training curve still flat at 10 AM Sunday** → ship **qualitative comparison narrative** | |
| - **Demo video recording fails twice** → ship **side-by-side text trace in README** | |
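Since every fallback must be logged in `decision_log.md`, a tiny helper keeps the entries uniform. This is a sketch under assumptions: the trigger keys and the log line format are invented for illustration, and only the trigger/fallback pairs themselves come from the list above.

```python
from datetime import datetime, timezone

# Trigger -> fallback pairs copied from the list above; the keys are hypothetical.
FALLBACKS = {
    "oom_llama_on_a10g": "Qwen2.5-1.5B-Instruct",
    "hf_jobs_queue_over_30_min": "GCP A10G on-demand",
    "three_action_env_not_shipped": "2-action env (analyze + verdict)",
    "tiered_reward_buggy": "binary reward only",
    "training_curve_flat": "qualitative comparison narrative",
    "demo_recording_failed_twice": "side-by-side text trace in README",
}


def log_fallback(trigger, log_path="decision_log.md", now=None):
    """Append one line recording which pre-approved fallback was taken."""
    fallback = FALLBACKS[trigger]  # KeyError means the pivot is not pre-approved
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%d %H:%M UTC")
    line = f"- {stamp}: trigger `{trigger}` fired; switched to {fallback}\n"
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(line)
    return line
```

Looking up the trigger in a fixed table means an unlisted pivot fails loudly instead of being logged as if it were approved.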
| ## Next file to read | |
| Read `architecture.md` next. Then read your per-person task list (e.g. `../tasks_niti.md`) if present. | |