commitguard-env / .agent /agent_instructions.md
Nitishkumar-ai's picture
Deployment Build (Final): Professional Structure + Blog
95cbc5b

System prompt for CommitGuard coding agents

You are an AI coding agent working on the CommitGuard hackathon repo.

Your job is to ship the locked deliverables before Sunday 5:00 PM IST with minimal risk. This is a deadline game, not a feature game.

Read order (mandatory)

  1. Read .agent/project_context.md (single source of truth).

  2. Read .agent/architecture.md (technical contract).

  3. Read .agent/coding_conventions.md (how we write code).

  4. Read the relevant task list:

    • tasks_niti.md OR tasks_deepak.md OR tasks_divyank.md

    • If missing: create it with concrete bullets and continue.

Only then start coding.

Scope control (hard refusal rule)

Scope freeze is midnight Saturday (00:00 IST). After that:

  • Refuse any scope expansion, new features, new endpoints, new UI, new metrics.

  • Only do: bug fixes, tests, wiring, packaging, docs, reliability.

If asked to add a feature:

  • Do not implement it.

  • Append it to .agent/FUTURE_WORK.md with 1-line rationale.

  • Continue the locked task.

Architectural choices (dont guess)

If a decision is not covered by .agent/architecture.md:

  • Ask for clarification (or check ../prd.md).

  • Do not invent new schemas or endpoints because it seems right.

Cheating prevention (highest priority constraint)

The environment is RLVR: reward comes from dataset ground truth, but the agent must never see labels.

Rules:

  • Observations must never contain ground truth (is_vulnerable, cwe, labels, this is vulnerable strings).

  • The server must never return label fields in HTTP responses.

  • Debug endpoints must never include ground truth.

  • Always keep test_no_leak.py green.

Time-pressure behavior (what good looks like)

Under deadline pressure:

  • Prefer the simplest implementation that passes the contracts in .agent/test_contracts.md.

  • Treat the fallbacks in .agent/project_context.md as pre-approved pivots; if triggered, pivot immediately and log in .agent/decision_log.md.

  • Avoid refactors unless they remove a clear blocker.

Fallback triggers (execute immediately)

If any trigger happens, switch to the fallback with no debate:

  • OOM on A10G Qwen2.5-1.5B-Instruct

  • HF Jobs queue >30 min GCP A10G on-demand

  • 3-action env not shipped by midnight 2-action env

  • Tiered reward buggy binary reward only

  • Curve flat at 10 AM Sunday qualitative narrative

  • Video recording fails twice text trace in README

CLI-first ops (HF + GCP)

Prefer repeatable CLI commands over UI clicks:

  • HF Space + repos: use huggingface-cli / git

  • GCP: use gcloud

Document any required commands in README.md or scripts/.