FORGE-v4 Submission Guide
Colab-First Commands
- Benchmark with model policy:
python train_colab.py --benchmark --policy model --episodes 20
- Compare baseline vs model:
python train_colab.py --compare --episodes 20
- Top up authentic DPO dataset to 480 pairs:
python train_colab.py --benchmark --policy model --episodes 20 --topup-dpo --target-pairs 480
- Verify pair count:
python -c "import pathlib; p=pathlib.Path('data/dpo_dataset.jsonl'); print(sum(1 for line in p.read_text(encoding='utf-8').splitlines() if line.strip()) if p.exists() else 0)"
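The one-line check above counts lines but does not confirm that each line is a usable pair. A minimal sketch of a stricter count follows. It assumes each line of data/dpo_dataset.jsonl is a JSON object and that the fields are named `prompt`, `chosen`, and `rejected`; those key names are a common DPO convention, not something this guide confirms, so pass the names your dataset actually uses. The function name `count_dpo_pairs` is hypothetical.

```python
import json
from pathlib import Path


def count_dpo_pairs(path, required_keys=("prompt", "chosen", "rejected")):
    """Count lines that parse as JSON objects containing all required keys.

    The default key names are an assumption; adjust them to the real schema.
    """
    p = Path(path)
    if not p.exists():
        return 0
    valid = 0
    with p.open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # blank lines are not pairs
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # malformed lines are not counted
            if isinstance(record, dict) and all(k in record for k in required_keys):
                valid += 1
    return valid


if __name__ == "__main__":
    print(count_dpo_pairs("data/dpo_dataset.jsonl"))
```

If this number is lower than the raw line count, the dataset contains blank, malformed, or incomplete lines that should be inspected before submission.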
Security Notes
- API keys must be set via environment variables.
- No secrets should be hardcoded in source files.
- The sandbox enforces a timeout and a memory cap (where the platform supports it), blocks risky builtins, and cleans up temporary files.
- For public deployment, add container isolation.
What Judges Should See
- outputs/reward_curve.png
- outputs/loss_curve.png
- outputs/pass_rate.png
- outputs/final_report.json
- data/dpo_dataset.jsonl containing the target number of pairs (480 in the top-up command above)
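Before submitting, it may help to confirm that every file in the list above exists and is not empty. The sketch below does only that; the helper name `missing_artifacts` is hypothetical, and the check says nothing about whether the plots or the report contain sensible results.

```python
from pathlib import Path

# Paths taken from the list above, relative to the project root.
EXPECTED_ARTIFACTS = [
    "outputs/reward_curve.png",
    "outputs/loss_curve.png",
    "outputs/pass_rate.png",
    "outputs/final_report.json",
    "data/dpo_dataset.jsonl",
]


def missing_artifacts(root="."):
    """Return the expected artifacts that are absent or empty under root."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_ARTIFACTS:
        path = base / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    gaps = missing_artifacts()
    if gaps:
        print("Missing or empty:", ", ".join(gaps))
    else:
        print("All expected artifacts are present.")
```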