GRPO Training — Wildfire Containment Simulator

How to run

Open grpo_colab.ipynb in Colab (T4 GPU runtime) and run the cells in order:

| Section | Cell(s) | What it does |
|---|---|---|
| 1: Setup | 1–3 | Installs dependencies, clones the repo, loads Qwen2.5-1.5B with LoRA |
| 2: Rollout | 4–5 | Defines `collect_rollout()` using the env, serializer, and parser |
| 3: Training | 6–8 | Builds the GRPO dataset, trains for 50 steps with a curriculum |
| 4: Checkpointing | 9 | Saves the final adapter and verifies that it reloads |
| 5: Plot | 10 | Plots the reward curve with tier-promotion markers |
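For readers new to GRPO, the idea behind the training step in Section 3 is that each rollout is scored relative to the other rollouts sampled for the same prompt. The sketch below shows that group-relative normalization in plain Python. It is not the notebook's code: the function name, the `eps` value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar reward per rollout sampled for the same prompt.
    `eps` (an assumed value) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four rollouts of one prompt.
example_rewards = [0.2, 0.9, 0.4, 0.5]
print(group_relative_advantages(example_rewards))
```

Rollouts that beat their group's average get a positive advantage and the rest get a negative one, so no separate value network is needed to provide a baseline.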

Resume from checkpoint: The first cell of Section 3 auto-detects the latest checkpoints/step_* folder and loads it. Re-run from that cell to continue training.
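The auto-detection described above can be sketched as follows. This is a minimal illustration, not the notebook's actual cell: it assumes checkpoint folders are named `step_<integer>`, and `find_latest_checkpoint` is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme: checkpoints/step_10, checkpoints/step_20, ...
STEP_DIR = re.compile(r"^step_(\d+)$")


def find_latest_checkpoint(root: str = "checkpoints") -> Optional[Path]:
    """Return the step_<N> directory with the largest N, or None if there is none."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    best_step, best_dir = -1, None
    for child in root_path.iterdir():
        match = STEP_DIR.match(child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_dir = int(match.group(1)), child
    return best_dir


latest = find_latest_checkpoint()
print(f"Resuming from {latest}" if latest else "No checkpoint found, starting fresh")
```

The step number is compared as an integer because a plain string sort would rank `step_5` above `step_10`.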

Expected runtime on T4

About 45 minutes for 50 GRPO steps; the exact time depends on episode length in each curriculum tier.
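As a rough planning aid, the figure above works out to about 54 seconds per step. The sketch below scales that linearly to other step counts. Linear scaling is an assumption: since episode length differs between tiers, later steps may take longer or shorter than the average.

```python
# Reference point taken from this README: 50 steps in roughly 45 minutes on a T4.
REF_STEPS = 50
REF_MINUTES = 45.0

seconds_per_step = REF_MINUTES * 60 / REF_STEPS


def estimate_minutes(steps: int) -> float:
    """Linear extrapolation from the reference run (an approximation only)."""
    return steps * REF_MINUTES / REF_STEPS


print(f"{seconds_per_step:.0f} s per step on average")
print(f"100 steps: about {estimate_minutes(100):.0f} minutes")
```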

Downloading the trained adapter

After training completes, run in a Colab cell:

from google.colab import files
import shutil
shutil.make_archive('wildfire_adapter', 'zip', 'checkpoints/final')
files.download('wildfire_adapter.zip')
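Before downloading, it can be worth checking that the archive contains the adapter files. The sketch below assumes the adapter was saved in the PEFT layout, where the directory includes an `adapter_config.json`; if the notebook saves in a different layout, the file name to look for will differ. `archive_has_adapter_config` is a hypothetical helper name.

```python
import zipfile


def archive_has_adapter_config(zip_path: str) -> bool:
    """Return True if the zip contains a file named adapter_config.json."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    # Compare on the final path component so nested folders are also accepted.
    return any(name.split("/")[-1] == "adapter_config.json" for name in names)
```

Calling `archive_has_adapter_config('wildfire_adapter.zip')` right after `shutil.make_archive(...)` catches an empty or misdirected `checkpoints/final` before the download starts.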

Local validation (no GPU needed)

python training/test_notebook_imports.py

This checks all imports and runs a quick env smoke test without loading model weights.
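For a sense of what a weight-free check can look like, the sketch below reports which packages cannot be located, without importing them. It is not the contents of `test_notebook_imports.py`, which performs its own import checks and env smoke test; the package list is an assumption based on a typical LoRA and GRPO stack.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on this machine."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed dependency list; adjust to match the notebook's install cell.
    required = ("torch", "transformers", "peft", "trl")
    missing = missing_modules(required)
    print("All found" if not missing else f"Missing: {', '.join(missing)}")
```

For top-level names, `find_spec` only locates a package and does not execute it, so the check stays fast and needs no GPU.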