
CommitGuard Submission Summary

Defense is on human time. Offense is on AI time. CommitGuard closes that asymmetry.

Theme Fit

  • Primary: Theme #3.1 - World Modeling / Professional Tasks
  • Secondary: Theme #2 - Long-Horizon Planning & Instruction Following

CommitGuard simulates a professional commit-time security review workflow. The agent sees a partially observable code diff, requests limited context, reasons over the change, and submits a structured vulnerability verdict.
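The workflow above can be sketched as a standard episode loop (a toy illustration only; the real environment speaks an OpenEnv-compatible HTTP protocol, and `env`/`agent` here are placeholders):

```python
# Toy sketch of one CommitGuard episode. The agent observes a partial
# diff, optionally requests context, and ends the episode with a verdict.
def run_episode(env, agent):
    obs = env.reset()            # partial diff; ground-truth labels never shown
    done = False
    reward = 0.0
    while not done:
        action = agent(obs)      # analyze / request_context / verdict
        obs, reward, done = env.step(action)
    return reward                # final reward computed server-side
```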

Environment

Actions:

  1. analyze - emit an intermediate reasoning trace.
  2. request_context - spend budget to receive extra file context.
  3. verdict - submit the final vulnerable/safe decision, a CWE type, and an exploit sketch.
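
The three actions might look like the following payloads (field names are illustrative assumptions, not the environment's exact schema):

```python
# Hypothetical action payloads; the actual protocol fields may differ.
analyze = {
    "action": "analyze",
    "notes": "The diff removes a bounds check before a memcpy call.",
}

request_context = {
    "action": "request_context",
    "file": "src/parser.c",  # each extra request spends context budget
}

verdict = {
    "action": "verdict",
    "vulnerable": True,
    "cwe": "CWE-787",  # out-of-bounds write
    "exploit_sketch": "Oversized input overflows the destination buffer.",
}
```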

Reward:

  • +1.0 for a correct binary verdict.
  • Up to +0.5 for a matching CWE type.
  • Up to +0.5 for exploit keyword overlap.
  • -1.0 for a false positive.
  • -0.5 for a false negative.
  • Small penalty for repeated context requests.
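
The reward rules above can be expressed roughly as follows (a minimal sketch: the partial-match scores, free-request allowance, and penalty constant are assumptions, not the submission's exact values):

```python
def score_verdict(pred_vulnerable, true_vulnerable,
                  cwe_score=0.0, exploit_score=0.0,
                  context_requests=0, free_requests=1,
                  request_penalty=0.05):
    """Sketch of the reward shaping described above.

    cwe_score / exploit_score are assumed pre-computed partial-match
    scores in [0, 1]; free_requests and request_penalty are illustrative.
    """
    if pred_vulnerable == true_vulnerable:
        reward = 1.0                       # correct binary verdict
        if true_vulnerable:
            reward += 0.5 * cwe_score      # up to +0.5 for CWE match
            reward += 0.5 * exploit_score  # up to +0.5 for exploit keywords
    elif pred_vulnerable:                  # flagged a safe change
        reward = -1.0                      # false positive
    else:                                  # missed a real vulnerability
        reward = -0.5                      # false negative
    extra = max(0, context_requests - free_requests)
    reward -= request_penalty * extra      # penalty for repeated requests
    return reward
```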

The agent never sees ground truth labels. Rewards are computed server-side from Devign-derived labels.

Results

Held-out evaluation on 100 samples:

| Run      | Correct  | Accuracy |
|----------|----------|----------|
| Baseline | 50 / 100 | 50%      |
| Trained  | 74 / 100 | 74%      |

Figures: reward curve, accuracy comparison, and per-CWE breakdown.

Required Links

Technical Stack

  • Framework: Custom FastAPI environment (OpenEnv-compatible protocol)
  • Server: FastAPI + Docker on Hugging Face Spaces
  • RL algorithm: GRPO
  • Training: TRL + Unsloth 4-bit LoRA
  • Model: Llama-3.2-3B-Instruct, with Qwen2.5-1.5B fallback
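
The training side of this stack might be wired together roughly as below (a sketch assuming a recent TRL release with GRPO support; the hyperparameters, `train_dataset`, and `commitguard_reward` wrapper are placeholders, not the submission's actual configuration):

```python
from trl import GRPOConfig, GRPOTrainer

# Illustrative hyperparameters only.
args = GRPOConfig(
    output_dir="commitguard-grpo",
    per_device_train_batch_size=4,
    num_generations=8,       # GRPO group size
    learning_rate=1e-5,
)

trainer = GRPOTrainer(
    model="meta-llama/Llama-3.2-3B-Instruct",  # Qwen2.5-1.5B as fallback
    args=args,
    train_dataset=train_dataset,      # Devign-derived prompts (assumed)
    reward_funcs=commitguard_reward,  # wraps the server-side scoring (assumed)
)
trainer.train()
```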

Scope

This is the locked v1 environment. Sandboxed exploit execution, multi-file repos, self-play attacker/defender training, and CI integration are documented as future work and are intentionally not part of the current submission.