# CommitGuard Submission Summary
> Defense is on human time. Offense is on AI time. CommitGuard closes that asymmetry.
## Theme Fit
- Primary: Theme #3.1 - World Modeling / Professional Tasks
- Secondary: Theme #2 - Long-Horizon Planning & Instruction Following
CommitGuard simulates a professional commit-time security review workflow. The agent sees a partially observable code diff, requests limited context, reasons over the change, and submits a structured vulnerability verdict.
## Environment
Actions:
1. `analyze` - emit an intermediate reasoning trace.
2. `request_context` - spend part of a limited budget to reveal extra file context.
3. `verdict` - submit the final vulnerable/safe decision, a CWE type, and an exploit sketch.
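A minimal sketch of how these three actions might be modeled and validated client-side. The `Action` dataclass and its field names are illustrative assumptions, not the actual CommitGuard wire schema:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action payload; field names are illustrative,
# not the actual CommitGuard request schema.
@dataclass
class Action:
    kind: str                      # "analyze" | "request_context" | "verdict"
    text: str = ""                 # reasoning trace or context query
    vulnerable: Optional[bool] = None
    cwe: Optional[str] = None      # e.g. "CWE-787"
    exploit: Optional[str] = None  # short exploit sketch

def validate(action: Action) -> None:
    """Reject malformed actions before they reach the environment server."""
    if action.kind not in {"analyze", "request_context", "verdict"}:
        raise ValueError(f"unknown action: {action.kind}")
    if action.kind == "verdict" and action.vulnerable is None:
        raise ValueError("verdict requires a vulnerable/safe decision")

# A well-formed final verdict passes validation.
validate(Action(kind="verdict", vulnerable=True, cwe="CWE-787"))
```

Keeping `verdict` a distinct terminal action (rather than a flag on `analyze`) makes the episode boundary explicit for the RL loop.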
Reward:
- +1.0 for a correct binary verdict.
- Up to +0.5 for matching the ground-truth CWE type.
- Up to +0.5 for exploit-sketch keyword overlap.
- -1.0 for a false positive.
- -0.5 for a false negative.
- A small penalty for repeated context requests.
The agent never sees ground truth labels. Rewards are computed server-side from Devign-derived labels.
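The server-side scoring above can be sketched as a single reward function. The headline weights (+1.0, ±0.5, -1.0) come from the list; the keyword-overlap formula and the per-request penalty of 0.05 are assumptions for illustration:

```python
from typing import List, Optional

def reward(pred_vuln: bool, true_vuln: bool,
           pred_cwe: Optional[str], true_cwe: Optional[str],
           exploit_text: str, keywords: List[str],
           context_requests: int,
           context_penalty: float = 0.05) -> float:
    """Sketch of the commit-review reward shaping; details are assumed."""
    r = 0.0
    if pred_vuln == true_vuln:
        r += 1.0  # correct binary verdict
        # Partial credit applies only when a real vulnerability was found.
        if true_vuln and pred_cwe and pred_cwe == true_cwe:
            r += 0.5  # CWE match
        if true_vuln and keywords:
            # Fraction of ground-truth exploit keywords mentioned.
            hits = sum(kw.lower() in exploit_text.lower() for kw in keywords)
            r += 0.5 * hits / len(keywords)
    elif pred_vuln and not true_vuln:
        r -= 1.0  # false positive
    else:
        r -= 0.5  # false negative
    # Discourage repeated context requests beyond the first.
    r -= context_penalty * max(0, context_requests - 1)
    return r
```

Note the asymmetry: false positives cost more than false negatives, biasing the policy against crying wolf on safe commits.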
## Results
Held-out evaluation on 100 samples:
| Run | Correct | Accuracy |
|---|---:|---:|
| Baseline | 50 / 100 | 50% |
| Trained | 74 / 100 | 74% |



## Required Links
- HF Space: [https://huggingface.co/spaces/Nitishkumar-ai/commitguard-env](https://huggingface.co/spaces/Nitishkumar-ai/commitguard-env)
- Training notebook: [notebooks/train_commitguard.ipynb](notebooks/train_commitguard.ipynb)
- Mini-blog / short writeup: [commitguard_hf_blog.md](commitguard_hf_blog.md)
- Trained model target: [https://huggingface.co/inmodel-labs/commitguard-llama-3b](https://huggingface.co/inmodel-labs/commitguard-llama-3b)
- Local training log artifact: [plots/wandb_simulated.json](plots/wandb_simulated.json)
## Technical Stack
- Framework: Custom FastAPI environment (OpenEnv-compatible protocol)
- Server: FastAPI + Docker on Hugging Face Spaces
- RL algorithm: GRPO
- Training: TRL + Unsloth 4-bit LoRA
- Model: Llama-3.2-3B-Instruct, with Qwen2.5-1.5B fallback
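GRPO's defining step, scoring each sampled completion against the statistics of its own sampling group rather than a learned value network, can be sketched in a few lines (the epsilon value is an assumption):

```python
import statistics
from typing import List

def grpo_advantages(group_rewards: List[float]) -> List[float]:
    """Group-relative advantages: normalize each completion's reward by
    its group's mean and std, so no separate critic model is needed."""
    mu = statistics.fmean(group_rewards)
    sigma = statistics.pstdev(group_rewards)
    return [(r - mu) / (sigma + 1e-6) for r in group_rewards]
```

Completions that beat their own group's average get positive advantage and are reinforced; the rest are suppressed, which pairs naturally with the scalar verdict reward above.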
## Scope
This is the locked v1 environment. Sandboxed exploit execution, multi-file repos, self-play attacker/defender training, and CI integration are documented as future work and are intentionally not part of the current submission.