	---
	license: mit
	tags:
	- wildfire
	- geospatial
	- weather
	- earth-observation
	- foundation-models
	- evaluation
	- pytorch
	pipeline_tag: image-segmentation
	library_name: pytorch
	pretty_name: WildFIRE-FM
	---

	# WildFIRE-FM

	![WildFIRE-FM summary](assets/wildfire_fm_model_card.svg)

	WildFIRE-FM is a wildfire-specialized regional reference backbone for 12-hour gridded wildfire occupancy prediction on a 5 km California grid. It is released with five seeded PyTorch checkpoints, model code, final-paper figure previews, numeric summaries, and data-source notes. The raw data are not redistributed.

	The model is intended as a reproducible reference backbone for fixed-contract wildfire evaluation, not as a general global wildfire forecasting product. It was trained with regional weather, active-fire supervision, static fuel/canopy/exposure layers, and event-level wildfire resources used by supporting tasks in the paper.

	## Release Contents

	![Release contents](assets/release_contents.svg)

	Weights. Five seeded checkpoints are available at `models/wildfire_fm/checkpoints/seed_*/best_firms_prauc.pt`. Each file is listed with SHA-256 and byte size in `models/wildfire_fm/checkpoint_manifest.json`.

	Model code. The compact U-Net definition is provided in `models/wildfire_fm/modeling_unet.py`, with a short loading example below.

	Evaluation artifacts. A compiled paper PDF, final-paper figure previews, and sanitized compact CSV/JSON summaries are included under `paper/`, `assets/`, `paper_outputs/`, and `artifacts/results/`. Manuscript TeX, BibTeX, and TikZ source files are intentionally not included in this model release.

	Data notes. Data sources and access entry points are documented in `data_sources/DATA_SOURCES.md`; users must obtain source data from the original providers.
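
The SHA-256 and byte-size entries in the manifest can be used to check a downloaded checkpoint before loading it. The following is a minimal sketch: the helper names are hypothetical, and because the JSON layout of `checkpoint_manifest.json` is not documented here, the expected values are passed in explicitly rather than parsed from the manifest.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path, expected_sha256, expected_bytes):
    """Hypothetical helper: compare a file with values copied from the manifest."""
    path = Path(path)
    # Check the cheap size comparison first, then the digest.
    if path.stat().st_size != expected_bytes:
        return False
    return sha256_of(path) == expected_sha256.lower()
```

Checking the size before hashing lets a truncated download fail fast without reading the whole file.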

	## Model Details

	| Field | Value |
	|---|---|
	| Task | 12-hour gridded wildfire occupancy prediction |
	| Grid | California regional grid, 5 km, EPSG:5070 |
	| Inputs | 16 channels: weather fields, validity masks, static fuel/canopy/exposure layers |
	| Architecture | Compact U-Net with occupancy and auxiliary spatial-support heads |
	| Training split | June-August 2024 train, September 2024 validation, October 2024 test |
	| Released seeds | 1, 7, 42, 99, 123 |
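
The temporal split in the table can be expressed as a simple date lookup. This is a sketch only: the function name `split_for` is hypothetical, and it assumes whole calendar months. How 12-hour windows that straddle a month boundary are assigned is not specified in this card.

```python
from datetime import date

# Month ranges taken from the "Training split" row (all in 2024).
SPLIT_MONTHS = {
    "train": (6, 7, 8),
    "validation": (9,),
    "test": (10,),
}


def split_for(day):
    """Return the split name for a date, or None outside the study period."""
    if day.year != 2024:
        return None
    for name, months in SPLIT_MONTHS.items():
        if day.month in months:
            return name
    return None
```

For example, `split_for(date(2024, 9, 30))` returns `"validation"`, while a date in November 2024 returns `None`.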

	## Quick Load

	```python
	import torch
	from models.wildfire_fm.modeling_unet import UNetSmallFlex

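	# Instantiate the compact U-Net with the settings listed in this card.
	model = UNetSmallFlex(
	    in_ch=16,  # 16-channel gridded input
	    base=32,
	    dropout=0.1,
	    norm_type="group",
	    norm_groups=8,
	    use_aux_spatial_head=True,
	)
	checkpoint = torch.load(
	    "models/wildfire_fm/checkpoints/seed_1/best_firms_prauc.pt",
	    map_location="cpu",
	)
	# Use the "model" entry if the file wraps the weights in a dict;
	# otherwise treat the loaded object as a bare state dict.
	state = checkpoint.get("model", checkpoint)
	model.load_state_dict(state)
	model.eval()  # switch to inference mode (disables dropout)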
	```

	The checkpoint expects the same 16-channel gridded input described in the paper and in `data_sources/DATA_SOURCES.md`. This repository does not include raw HRRR, FIRMS, LANDFIRE, WRC, LandScan, WFIGS, MTBS, or comparator feature caches.
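
To work with all five released seeds, the checkpoint paths can be enumerated from the seed list in the Model Details table. A minimal sketch, assuming every seed directory follows the `seed_<n>` naming seen in the `seed_1` path above; the helper name `checkpoint_paths` is hypothetical.

```python
from pathlib import Path

# Seeds listed under "Released seeds" in the Model Details table.
RELEASED_SEEDS = (1, 7, 42, 99, 123)


def checkpoint_paths(repo_root="."):
    """Map each released seed to its expected checkpoint path."""
    base = Path(repo_root) / "models" / "wildfire_fm" / "checkpoints"
    return {
        seed: base / f"seed_{seed}" / "best_firms_prauc.pt"
        for seed in RELEASED_SEEDS
    }
```

Each returned path can be passed to `torch.load` in the same way as the single-seed example.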

	## Evaluation Snapshot

	The paper evaluates WildFIRE-FM and ten Earth-FM comparators under fixed task contracts. The summary card at the top of this page reports the best final-paper mean for each displayed task contract and names the winning backbone. The corresponding values are:

	- Occupancy union F1: `60.1506 ± 7.5865` percent, ClimaX.
	- Fire-spread spatial F1: `80.9700 ± 2.0200` percent, WildFIRE-FM.
	- Final burned-area log-RMSE: `1.1657 ± 0.0126`, WildFIRE-FM; lower is better.
	- Analog retrieval nDCG@10: `0.5099 ± 0.0336`, WildFIRE-FM.
	- Smoke PM2.5 RMSE: `4.4403 ± 0.0488`, AlphaEarth; lower is better.
	- Extreme-heat RMSE-C: `0.2179 ± 0.0043`, WildFIRE-FM; lower is better.

	The compiled paper PDF is available at `paper/wildfire_fm_evaluation_contracts.pdf`. The public release also includes sanitized CSV/JSON summaries used to audit the displayed values. Manuscript table TeX is not included.

	### Fixed-Contract Checks From The Final Paper

	Head-selection regret. This final-paper figure shows that choosing a lightweight head by a ranking metric can lose decision performance under the same frozen features.

	![Head-selection regret](assets/selection_regret_final.png)
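
One way to make head-selection regret concrete is to compare the head picked by a ranking metric with the head that would have been best on the decision metric. The sketch below is an illustration under assumptions: the paper's exact definition is not reproduced here, the function and metric names are hypothetical, and the scores are made-up toy values, not results from the paper.

```python
def selection_regret(heads, select_by, judge_by):
    """Regret of picking the head with the best `select_by` score,
    measured on `judge_by` (both assumed higher-is-better)."""
    chosen = max(heads, key=lambda name: heads[name][select_by])
    best = max(heads[name][judge_by] for name in heads)
    return chosen, best - heads[chosen][judge_by]


# Toy scores for two hypothetical heads trained on the same frozen features.
toy_heads = {
    "head_a": {"pr_auc": 0.75, "f1": 0.50},
    "head_b": {"pr_auc": 0.50, "f1": 0.75},
}
chosen, regret = selection_regret(toy_heads, select_by="pr_auc", judge_by="f1")
```

In this toy case the ranking metric selects `head_a`, although `head_b` has the higher decision score, so the regret is positive.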

	Supporting-task rank map. This final-paper figure shows that model ordering changes across burned area, analog retrieval, smoke PM2.5, and extreme heat task contracts.

	![Supporting task rank map](assets/supporting_rank_map_final.png)

	Primary-task rank changes. This final-paper figure summarizes rank changes across fixed primary-task contracts.

	![Primary rank changes](assets/primary_rank_change_final.png)
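
Rank comparisons across contracts depend on each metric's direction: F1 and nDCG@10 are higher-is-better, while the RMSE-based contracts are lower-is-better. A minimal sketch of direction-aware ranking follows, using a hypothetical helper name and toy scores for placeholder models rather than values from the paper. Ties are not handled.

```python
def rank_models(scores, lower_is_better=False):
    """Map each model to its rank (1 = best) for one task contract."""
    ordered = sorted(scores, key=scores.get, reverse=not lower_is_better)
    return {model: position for position, model in enumerate(ordered, start=1)}


# Toy scores for placeholder models; not results from the paper.
toy_f1 = {"model_a": 0.8, "model_b": 0.6, "model_c": 0.7}
toy_rmse = {"model_a": 4.9, "model_b": 4.4, "model_c": 4.6}

f1_ranks = rank_models(toy_f1)                            # higher is better
rmse_ranks = rank_models(toy_rmse, lower_is_better=True)  # lower is better
```

Here `model_a` ranks first on the toy F1 contract and last on the toy RMSE contract, which is the kind of ordering change the rank figures summarize.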

	## Data Sources

	The study uses public or provider-hosted resources, but the processed data are not bundled here:

	- NOAA HRRR fields for regional weather inputs.
	- NASA FIRMS active-fire detections for occupancy supervision.
	- LANDFIRE fuel and canopy layers for static landscape context.
	- Wildfire Risk to Communities housing density and LandScan population for exposure context.
	- WFIGS and MTBS event-level resources for burned-area and analog tasks.
	- External Earth-FM/backbone assets for comparator features.

	See `data_sources/DATA_SOURCES.md` for source roles and access links.

	## Reproducing Released Paper Outputs

	The lightweight check reproduces and verifies the released sanitized outputs from the compact summaries. It requires neither raw data nor GPUs.

	```bash
	python3 scripts/reproduce_paper_outputs.py
	```

	Full raw-data reruns require separately downloaded source data, local feature caches, and cluster-specific paths. Sanitized reference scripts and a Slurm template are provided under `experiments/`.

	## Repository Layout

	```text
	models/wildfire_fm/   model code, manifests, and checkpoint metadata
	paper/                compiled paper PDF only; no TeX source
	paper_outputs/        final-paper figure PDFs retained for reproducibility
	artifacts/results/    sanitized compact CSV/JSON summaries for released outputs
	experiments/          sanitized raw-rerun references and Slurm template
	data_sources/         source-data roles and access notes
	scripts/              artifact verification and figure/table rebuild helpers
	```

	## Limitations

	WildFIRE-FM is a regional reference model trained for the paper's fixed-contract comparisons. Use outside the California regional grid requires new preprocessing, validation, and contract-specific evaluation. The repository does not provide operational alerts, raw data, or third-party comparator weights.

	## Citation

	```bibtex
	@misc{wildfire_fm_evaluation_contracts_2026,
	  title  = {Does Your Wildfire Prediction Model Actually Work, or Just Score Well?},
	  author = {Yangshuang Xu and Yuyang Dai and Liling Chang and Qi Wang and Yushun Dong},
	  year   = {2026},
	  note   = {WildFIRE-FM model and fixed-contract wildfire evaluation artifacts}
	}
	```

	The citation will be updated with arXiv metadata after the preprint is public.