	# FORGE-v4 Submission Guide

	## Colab-First Commands

	1. Benchmark with model policy:

	```bash
	python train_colab.py --benchmark --policy model --episodes 20
	```

	2. Compare baseline vs model:

	```bash
	python train_colab.py --compare --episodes 20
	```

	3. Top up the authentic DPO dataset to 480 pairs:

	```bash
	python train_colab.py --benchmark --policy model --episodes 20 --topup-dpo --target-pairs 480
	```

	4. Verify the pair count (one pair per line of the JSONL file):

	```bash
	python -c "import pathlib; p=pathlib.Path('data/dpo_dataset.jsonl'); print(sum(1 for _ in p.open('r',encoding='utf-8')) if p.exists() else 0)"
	```
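
The one-liner above counts every line, including blank or malformed ones. For a stricter check, the following is a minimal sketch that counts only lines that parse as JSON objects. The function name `count_pairs` and the optional `required_keys` argument are hypothetical helpers, not part of `train_colab.py`, and the record schema is not assumed: pass the field names your dataset actually uses if you want them checked.

```python
import json
from pathlib import Path


def count_pairs(path, required_keys=()):
    """Return (valid, invalid) record counts for a JSONL file.

    Hypothetical helper: a record is valid if the line parses as a JSON
    object and contains every key in `required_keys` (empty by default).
    """
    p = Path(path)
    if not p.exists():
        return 0, 0
    valid = invalid = 0
    with p.open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # blank lines are not pairs
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                invalid += 1
                continue
            if isinstance(record, dict) and all(k in record for k in required_keys):
                valid += 1
            else:
                invalid += 1
    return valid, invalid


if __name__ == "__main__":
    valid, invalid = count_pairs("data/dpo_dataset.jsonl")
    print(f"valid={valid} invalid={invalid}")
```

A nonzero `invalid` count suggests the file was truncated or hand-edited, which is worth resolving before comparing `valid` against the 480-pair target.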

	## Security Notes

	- API keys must be set via environment variables.
	- No secrets should be hardcoded in source files.
	- The sandbox enforces a timeout, a memory cap (where supported), a blocklist of risky builtins, and cleanup of temporary files.
	- For public deployment, add container isolation.
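
To illustrate the kind of controls listed above, here is a minimal sketch of a subprocess-based runner with a timeout, a memory cap where the platform supports it, and temporary-directory cleanup. It is not the project's sandbox: `run_sandboxed`, its parameters, and the `candidate.py` file name are hypothetical, builtin blocking is not shown, and the memory cap assumes a POSIX system where the `resource` module is available.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

try:
    import resource  # POSIX only; unavailable on Windows
except ImportError:
    resource = None


def run_sandboxed(code, timeout_s=5.0, memory_mb=1024):
    """Run `code` in a child Python process; return (status, stdout).

    Hypothetical helper. status is "ok", "error", or "timeout".
    """

    def limit_memory():
        # Runs in the child just before exec; caps its address space.
        limit = memory_mb * 1024 * 1024
        resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

    use_cap = resource is not None and memory_mb is not None
    # The temporary directory and its contents are deleted on exit.
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,  # child is killed when this expires
                preexec_fn=limit_memory if use_cap else None,
            )
        except subprocess.TimeoutExpired:
            return "timeout", ""
    status = "ok" if proc.returncode == 0 else "error"
    return status, proc.stdout
```

A process-level runner like this limits runaway code but does not restrict filesystem or network access, which is why container isolation is still recommended for public deployment.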

	## What Judges Should See

	- outputs/reward_curve.png
	- outputs/loss_curve.png
	- outputs/pass_rate.png
	- outputs/final_report.json
	- data/dpo_dataset.jsonl with the target pair count (480)
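
Before submitting, it may help to confirm that every artifact listed above exists and is non-empty. The following is a minimal sketch; `missing_artifacts` is a hypothetical helper, and it assumes the script is run from the repository root so that the relative paths resolve.

```python
from pathlib import Path

# Paths taken from the checklist above, relative to the repository root.
EXPECTED_ARTIFACTS = [
    "outputs/reward_curve.png",
    "outputs/loss_curve.png",
    "outputs/pass_rate.png",
    "outputs/final_report.json",
    "data/dpo_dataset.jsonl",
]


def missing_artifacts(root="."):
    """Return the expected paths that are absent or empty under `root`."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_ARTIFACTS:
        f = base / rel
        if not f.is_file() or f.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_artifacts()
    print("all artifacts present" if not missing else f"missing: {missing}")
```

This only checks presence and size; it does not validate plot contents or the pair count in the dataset.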