FORGE-v4 Submission Guide

Colab-First Commands

  1. Benchmark with model policy:
python train_colab.py --benchmark --policy model --episodes 20
  2. Compare baseline vs model:
python train_colab.py --compare --episodes 20
  3. Top up authentic DPO dataset to 480 pairs:
python train_colab.py --benchmark --policy model --episodes 20 --topup-dpo --target-pairs 480
  4. Verify pair count:
python -c "import pathlib; p=pathlib.Path('data/dpo_dataset.jsonl'); print(sum(1 for _ in p.open('r',encoding='utf-8')) if p.exists() else 0)"
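
The verification one-liner in step 4 counts every line, including blank or malformed ones. As a minimal sketch of a stricter check, the helper below counts only lines that parse as JSON objects. The function name `count_pairs` is hypothetical and not part of train_colab.py, and the sketch assumes one JSON object per pair without checking any field names.

```python
import json
from pathlib import Path


def count_pairs(path="data/dpo_dataset.jsonl"):
    """Count lines that parse as JSON objects; return 0 if the file is missing.

    Hypothetical helper: assumes one JSON object per DPO pair.
    """
    p = Path(path)
    if not p.exists():
        return 0
    count = 0
    with p.open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # blank lines are not pairs
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of counting them
            if isinstance(record, dict):
                count += 1
    return count
```

Calling `print(count_pairs())` from the repository root prints the number of usable pairs, which can then be compared against the `--target-pairs` value.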

Security Notes

  • API keys must be set via environment variables.
  • No secrets should be hardcoded in source files.
  • The sandbox enforces a timeout and a memory cap (where supported), blocks risky builtins, and cleans up temporary files.
  • For public deployment, add container isolation.
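
Two of the notes above can be illustrated with a short sketch: reading a key from the environment, and running code in a child process with a timeout and a temporary working directory that is removed afterwards. This is not the project's actual sandbox. The names `require_api_key`, `run_with_timeout`, and `FORGE_API_KEY` are hypothetical, and the memory cap and builtin blocking are not covered here.

```python
import os
import subprocess
import sys
import tempfile


def require_api_key(name="FORGE_API_KEY"):
    """Read a key from the environment (the variable name is an assumption)."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return value


def run_with_timeout(code, timeout_s=5.0):
    """Run code in a child interpreter with a timeout and a throwaway cwd."""
    # Do not pass secret-looking variables through to the child process.
    child_env = {
        k: v for k, v in os.environ.items()
        if not any(tag in k.upper() for tag in ("KEY", "TOKEN", "SECRET"))
    }
    with tempfile.TemporaryDirectory() as workdir:  # deleted when the block exits
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,
                env=child_env,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timeout"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

A process-level timeout like this does not isolate the filesystem or network, which is why container isolation is still recommended for public deployment.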

What Judges Should See

  • outputs/reward_curve.png
  • outputs/loss_curve.png
  • outputs/pass_rate.png
  • outputs/final_report.json
  • data/dpo_dataset.jsonl containing the target pair count (480 in the top-up command above)
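
Before submitting, the artifact list above can be checked mechanically. The sketch below reports any listed file that is missing or empty. The helper name `missing_artifacts` is hypothetical, and the paths are assumed to be relative to the repository root.

```python
from pathlib import Path

# Paths taken from the list above, relative to the repository root (assumed).
EXPECTED_ARTIFACTS = [
    "outputs/reward_curve.png",
    "outputs/loss_curve.png",
    "outputs/pass_rate.png",
    "outputs/final_report.json",
    "data/dpo_dataset.jsonl",
]


def missing_artifacts(root="."):
    """Return the expected paths that are missing or empty under root."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_ARTIFACTS:
        p = base / rel
        if not p.is_file() or p.stat().st_size == 0:
            missing.append(rel)
    return missing
```

An empty list from `missing_artifacts()` means every listed file exists and is non-empty. It does not validate the contents of the plots or the report.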