WildFIRE-FM
WildFIRE-FM is a wildfire-specialized regional reference backbone for 12-hour gridded wildfire occupancy prediction on a 5 km California grid. It is released with five seeded PyTorch checkpoints, model code, final-paper figure previews, numeric summaries, and data-source notes. The raw data are not redistributed.
The model is intended as a reproducible reference backbone for fixed-contract wildfire evaluation, not as a general global wildfire forecasting product. It was trained with regional weather, active-fire supervision, static fuel/canopy/exposure layers, and event-level wildfire resources used by supporting tasks in the paper.
Release Contents
Weights. Five seeded checkpoints are available at models/wildfire_fm/checkpoints/seed_*/best_firms_prauc.pt. Each file is listed with SHA-256 and byte size in models/wildfire_fm/checkpoint_manifest.json.
Model code. The compact U-Net definition is provided in models/wildfire_fm/modeling_unet.py, with a short loading example below.
Evaluation artifacts. A compiled paper PDF, final-paper figure previews, and sanitized compact CSV/JSON summaries are included under paper/, assets/, paper_outputs/, and artifacts/results/. Manuscript TeX, BibTeX, and TikZ source files are intentionally not included in this model release.
Data notes. Data sources and access entry points are documented in data_sources/DATA_SOURCES.md; users must obtain source data from the original providers.
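Because the manifest lists a SHA-256 digest and byte size for each checkpoint, a download can be checked before it is loaded. The following is a minimal sketch rather than the release's own tooling: `sha256_of` and `verify_file` are hypothetical helpers, and since the JSON field names in checkpoint_manifest.json are not described here, the expected digest and size are passed in explicitly instead of being parsed from the manifest.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_sha256, expected_size):
    """Hypothetical helper: True only if both the byte size and the digest match."""
    path = Path(path)
    # Cheap size check first; hashing a large checkpoint is the slow step.
    if path.stat().st_size != expected_size:
        return False
    return sha256_of(path) == expected_sha256.lower()
```

Checking the size first avoids hashing a file that is obviously truncated. The digest and size arguments would be copied from the manifest entry for the checkpoint being verified.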
Model Details
| Field | Value |
|---|---|
| Task | 12-hour gridded wildfire occupancy prediction |
| Grid | California regional grid, 5 km, EPSG:5070 |
| Inputs | 16 channels: weather fields, validity masks, static fuel/canopy/exposure layers |
| Architecture | Compact U-Net with occupancy and auxiliary spatial-support heads |
| Training split | June-August 2024 train, September 2024 validation, October 2024 test |
| Released seeds | 1, 7, 42, 99, 123 |
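The temporal split in the table can be expressed as a small lookup. This is a sketch under one assumption: each 12-hour sample is assigned by a single timestamp (for example its window start). How the paper handles windows that straddle a month boundary is not stated here, and `split_for` is a hypothetical name.

```python
from datetime import datetime


def split_for(timestamp):
    """Map a timestamp to 'train', 'val', 'test', or None if outside the study period."""
    if timestamp.year != 2024:
        return None
    if 6 <= timestamp.month <= 8:
        return "train"  # June-August 2024
    if timestamp.month == 9:
        return "val"    # September 2024
    if timestamp.month == 10:
        return "test"   # October 2024
    return None
```

Splitting by calendar month keeps validation and test strictly later in time than training, which avoids leaking future fire activity into the training set.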
Quick Load
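import torch

from models.wildfire_fm.modeling_unet import UNetSmallFlex

# Instantiate the compact U-Net for the 16-channel input.
model = UNetSmallFlex(
    in_ch=16,
    base=32,
    dropout=0.1,
    norm_type="group",
    norm_groups=8,
    use_aux_spatial_head=True,
)

# Load on CPU so the example runs without a GPU.
checkpoint = torch.load(
    "models/wildfire_fm/checkpoints/seed_1/best_firms_prauc.pt",
    map_location="cpu",
)

# Use the "model" entry if present, otherwise treat the file as a bare state dict.
state = checkpoint.get("model", checkpoint)
model.load_state_dict(state)
model.eval()  # inference mode: disables dropout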
The checkpoint expects the same 16-channel gridded input described in the paper and in data_sources/DATA_SOURCES.md. This repository does not include raw HRRR, FIRMS, LANDFIRE, WRC, LandScan, WFIGS, MTBS, or comparator feature caches.
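To repeat the Quick Load steps for every released seed rather than only seed 1, the checkpoint paths can be generated from the seed list. This is a sketch: `checkpoint_paths` is a hypothetical helper, and it assumes the directories for the other seeds follow the same `seed_<n>` naming as `seed_1`.

```python
from pathlib import Path

# Seeds listed in the Model Details table.
RELEASED_SEEDS = (1, 7, 42, 99, 123)


def checkpoint_paths(root="models/wildfire_fm/checkpoints", seeds=RELEASED_SEEDS):
    """Return one checkpoint path per released seed, in seed order.

    Assumes each seed lives in a directory named seed_<n> (only seed_1 is
    shown explicitly in this README).
    """
    return [Path(root) / f"seed_{seed}" / "best_firms_prauc.pt" for seed in seeds]
```

Each returned path can be passed to `torch.load` in place of the hard-coded seed 1 path, with a fresh `UNetSmallFlex` instance per seed.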
Evaluation Snapshot
The paper evaluates WildFIRE-FM and ten Earth-FM comparators under fixed task contracts. The top card reports the best final-paper mean for each displayed task contract, with the winning backbone named in the card. The corresponding values are:
- Occupancy union F1: 60.1506 ± 7.5865 percent, ClimaX.
- Fire-spread spatial F1: 80.9700 ± 2.0200 percent, WildFIRE-FM.
- Final burned-area log-RMSE: 1.1657 ± 0.0126, WildFIRE-FM; lower is better.
- Analog retrieval nDCG@10: 0.5099 ± 0.0336, WildFIRE-FM.
- Smoke PM2.5 RMSE: 4.4403 ± 0.0488, AlphaEarth; lower is better.
- Extreme-heat RMSE-C: 0.2179 ± 0.0043, WildFIRE-FM; lower is better.
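The list mixes metrics where higher is better (F1, nDCG@10) with metrics where lower is better (the RMSE variants), so picking the winning backbone per task has to respect the metric's direction. The sketch below illustrates that selection; `best_backbone` is a hypothetical helper and the scores are placeholders, not results from the paper.

```python
def best_backbone(mean_by_backbone, lower_is_better=False):
    """Return (name, mean) of the best backbone under the metric's direction."""
    pick = min if lower_is_better else max
    name = pick(mean_by_backbone, key=mean_by_backbone.get)
    return name, mean_by_backbone[name]


# Placeholder numbers for illustration only; they are not results from the paper.
example_f1 = {"backbone_a": 0.61, "backbone_b": 0.58}
example_rmse = {"backbone_a": 4.9, "backbone_b": 4.4}

f1_winner = best_backbone(example_f1)
rmse_winner = best_backbone(example_rmse, lower_is_better=True)
```

With these placeholder values, `backbone_a` wins on the F1-style metric and `backbone_b` wins on the RMSE-style metric.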
The compiled paper PDF is available at paper/wildfire_fm_evaluation_contracts.pdf. The public release also includes sanitized CSV/JSON summaries used to audit the displayed values. Manuscript table TeX is not included.
Fixed-Contract Checks From The Final Paper
Head-selection regret. This final-paper figure shows that choosing a lightweight head by a ranking metric can lose decision performance under the same frozen features.
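One way to make head-selection regret concrete is sketched below. It assumes regret is the gap between the best decision metric any head achieves and the decision metric of the head chosen by the ranking metric; the paper's exact definition may differ. The function name, head names, metric keys, and scores are all placeholders.

```python
def selection_regret(heads, ranking_key, decision_key):
    """Return (chosen head, regret) when selecting by the ranking metric.

    `heads` maps head name -> {metric name: value}; higher is better for both
    metrics. Regret is the best decision score minus the chosen head's score.
    """
    chosen = max(heads, key=lambda name: heads[name][ranking_key])
    best_decision = max(metrics[decision_key] for metrics in heads.values())
    return chosen, best_decision - heads[chosen][decision_key]


# Placeholder heads and scores for illustration; not values from the paper.
heads = {
    "linear": {"prauc": 0.40, "f1": 0.50},
    "mlp": {"prauc": 0.45, "f1": 0.25},
}
chosen, regret = selection_regret(heads, "prauc", "f1")
```

Here the head with the higher ranking score (`mlp`) has the lower decision score, so selecting by the ranking metric gives up 0.25 in the decision metric even though the frozen features are identical.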
Supporting-task rank map. This final-paper figure shows that model ordering changes across burned area, analog retrieval, smoke PM2.5, and extreme heat task contracts.
Primary-task rank changes. This final-paper figure summarizes rank changes across fixed primary-task contracts.
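The rank comparisons described above can be illustrated by ranking backbones separately under each task contract and then differencing the ranks. This is a sketch with placeholder model names and scores, not data from the paper; `rank_backbones` and `rank_changes` are hypothetical helpers, and alphabetical tie-breaking is an assumption.

```python
def rank_backbones(scores, lower_is_better=False):
    """Return {backbone: rank} with rank 1 as the best; ties are broken by name."""
    ordered = sorted(
        scores,
        key=lambda name: (scores[name] if lower_is_better else -scores[name], name),
    )
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_changes(ranks_a, ranks_b):
    """Signed rank shift per backbone when moving from contract A to contract B."""
    return {name: ranks_b[name] - ranks_a[name] for name in ranks_a}


# Placeholder scores for illustration; not values from the paper.
retrieval_ndcg = {"model_x": 0.51, "model_y": 0.47, "model_z": 0.49}
smoke_rmse = {"model_x": 4.8, "model_y": 4.4, "model_z": 4.6}

ndcg_ranks = rank_backbones(retrieval_ndcg)
rmse_ranks = rank_backbones(smoke_rmse, lower_is_better=True)
shifts = rank_changes(ndcg_ranks, rmse_ranks)
```

A positive shift means the backbone ranks worse under the second contract. In this toy case `model_x` drops from first to third while `model_y` moves from third to first.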
Data Sources
The study uses public or provider-hosted resources, but the processed data are not bundled here:
- NOAA HRRR fields for regional weather inputs.
- NASA FIRMS active-fire detections for occupancy supervision.
- LANDFIRE fuel and canopy layers for static landscape context.
- Wildfire Risk to Communities housing density and LandScan population for exposure context.
- WFIGS and MTBS event-level resources for burned-area and analog tasks.
- External Earth-FM/backbone assets for comparator features.
See data_sources/DATA_SOURCES.md for source roles and access links.
Reproducing Released Paper Outputs
The lightweight check verifies the released sanitized artifacts from compact summaries. It does not require raw data or GPUs.
python3 scripts/reproduce_paper_outputs.py
Full raw-data reruns require separately downloaded source data, local feature caches, and cluster-specific paths. Sanitized reference scripts and a Slurm template are provided under experiments/.
Repository Layout
models/wildfire_fm/   model code, manifests, and checkpoint metadata
paper/                compiled paper PDF only; no TeX source
paper_outputs/        final-paper figure PDFs retained for reproducibility
artifacts/results/    sanitized compact CSV/JSON summaries for released outputs
experiments/          sanitized raw-rerun references and Slurm template
data_sources/         source-data roles and access notes
scripts/              artifact verification and figure/table rebuild helpers
Limitations
WildFIRE-FM is a regional reference model trained for the paper's fixed-contract comparisons. Use outside the California regional grid requires new preprocessing, validation, and contract-specific evaluation. The repository does not provide operational alerts, raw data, or third-party comparator weights.
Citation
@misc{wildfire_fm_evaluation_contracts_2026,
  title  = {Does Your Wildfire Prediction Model Actually Work, or Just Score Well?},
  author = {Yangshuang Xu and Yuyang Dai and Liling Chang and Qi Wang and Yushun Dong},
  year   = {2026},
  note   = {WildFIRE-FM model and fixed-contract wildfire evaluation artifacts}
}
The citation will be updated with arXiv metadata after the preprint is public.


