TNFR-gridmet-OPS: Arizona streamflow forecasting
Informer-style Transformer that predicts 30 days of daily streamflow from 365 days of GRIDMET meteorological forcings + 210 static basin attributes for Arizona USGS gauges.
This is the operational ("OPS") variant, trained on every date where GRIDMET and streamflow both exist (~1979-2023 per station). It is a companion to the A2A ("apples-to-apples") variant, which clips training data to the ERA5 baseline's date grid for a clean A/B comparison.
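The difference between the two variants comes down to which dates are kept for training. The sketch below illustrates that selection with plain date sets; the function names and the toy date ranges are hypothetical and are not taken from the training code.

```python
from datetime import date, timedelta
from typing import List, Optional, Set


def daterange(start: date, end: date) -> Set[date]:
    """Return every calendar day from start to end, inclusive."""
    return {start + timedelta(days=i) for i in range((end - start).days + 1)}


def training_dates(forcing: Set[date], flow: Set[date],
                   baseline_grid: Optional[Set[date]] = None) -> List[date]:
    """OPS keeps days present in both records; A2A also clips to a baseline grid."""
    usable = forcing & flow
    if baseline_grid is not None:  # A2A-style clipping (illustrative only)
        usable &= baseline_grid
    return sorted(usable)


# Toy records, not real station coverage.
forcing = daterange(date(2000, 1, 1), date(2000, 1, 10))
flow = daterange(date(2000, 1, 3), date(2000, 1, 12))
baseline = daterange(date(2000, 1, 5), date(2000, 1, 7))

ops_days = training_dates(forcing, flow)            # Jan 3 through Jan 10
a2a_days = training_dates(forcing, flow, baseline)  # Jan 5 through Jan 7
print(len(ops_days), len(a2a_days))  # 8 3
```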
Inputs
| Field | Value |
|---|---|
| Lookback | 365 days (encoder context) |
| Forecast horizon | 30 days |
| Dynamic features (per day) | pr, rmax, rmin, srad, tmmn, tmmx, vpd, vs (GRIDMET) + streamflow (lagged) |
| Static features (per basin) | 210 (Caravan + HydroATLAS) |
| Total feature dim | 219 |
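The total of 219 is the 9 dynamic columns (8 GRIDMET variables plus lagged streamflow) plus the 210 static attributes. A minimal sketch of how one encoder window could be assembled is shown below. It assumes the static vector is repeated on every day and concatenated after the dynamic values, and that columns follow the order listed in the table; both are assumptions, and `build_encoder_window` is a hypothetical helper rather than a function from this repo.

```python
DYNAMIC_COLS = ["pr", "rmax", "rmin", "srad", "tmmn", "tmmx", "vpd", "vs", "streamflow"]
N_STATIC = 210
LOOKBACK = 365


def build_encoder_window(dynamic_rows, static_vector):
    """Join per-day dynamic values with the repeated static vector.

    dynamic_rows: list of dicts keyed by DYNAMIC_COLS, oldest day first.
    static_vector: list of N_STATIC floats for the basin.
    Returns a LOOKBACK x (9 + N_STATIC) nested list.
    """
    if len(dynamic_rows) < LOOKBACK:
        raise ValueError(f"need at least {LOOKBACK} days, got {len(dynamic_rows)}")
    if len(static_vector) != N_STATIC:
        raise ValueError(f"expected {N_STATIC} static values, got {len(static_vector)}")
    recent = dynamic_rows[-LOOKBACK:]  # keep only the most recent year
    return [[row[c] for c in DYNAMIC_COLS] + list(static_vector) for row in recent]


# Placeholder values, only to check the shapes.
rows = [{c: 0.0 for c in DYNAMIC_COLS} for _ in range(400)]
window = build_encoder_window(rows, [0.0] * N_STATIC)
print(len(window), len(window[0]))  # 365 219
```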
Architecture
| Field | Value |
|---|---|
| Class | Models.Transformer.Model (Informer-style encoder/decoder) |
| d_model | 256 |
| Heads | 8 |
| Encoder layers | 3 |
| Decoder layers | 2 |
| Dropout | 0.3 |
| Total params | ~2.7M |
Test-set performance
Per-station NSE / KGE on the held-out test split (263 stations):
| Lead time | NSE median | KGE median | NSE ≥ 0.5 |
|---|---|---|---|
| Day 1 (next-day) | +0.505 | +0.468 | 50% |
| Day 7 | +0.092 | -0.043 | 16% |
| Day 30 | +0.029 | -0.114 | 11% |
Skill decays sharply with lead time, as expected for daily streamflow forecasting from deterministic weather inputs.
Outliers (very small ephemeral streams with near-zero training-period std) can produce extreme NSE values, so the medians and the NSE ≥ 0.5 fractions above are the robust summaries.
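For reference, the two scores and the summaries in the table can be sketched as follows. This uses the standard NSE definition and the 2009 form of KGE; whether the evaluation code here uses exactly these forms, or scores in cfs rather than normalized units, is an assumption. The station scores are made up to show how a single near-constant series can dominate a mean while leaving the median and the threshold fraction intact.

```python
import math
from statistics import mean, median, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - squared error / spread of obs about its mean."""
    mu = mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mu) ** 2 for o in obs)
    return 1.0 - sse / spread if spread > 0 else float("nan")


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form): correlation, variability ratio, bias ratio."""
    mo, ms = mean(obs), mean(sim)
    so, ss = pstdev(obs), pstdev(sim)
    if so == 0 or ss == 0 or mo == 0:
        return float("nan")
    r = mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)]) / (so * ss)
    return 1.0 - math.sqrt((r - 1) ** 2 + (ss / so - 1) ** 2 + (ms / mo - 1) ** 2)


def summarize(scores, threshold=0.5):
    """Median score and fraction of stations at or above the threshold, ignoring NaN."""
    valid = [s for s in scores if not math.isnan(s)]
    return median(valid), sum(s >= threshold for s in valid) / len(valid)


# A near-constant observed series makes the NSE denominator tiny.
dry_obs = [0.0, 0.0, 0.01, 0.0]
dry_nse = nse(dry_obs, [0.5, 0.5, 0.5, 0.5])  # large negative value

# Made-up per-station scores with one such outlier.
scores = [0.62, 0.55, 0.10, -0.05, dry_nse]
med, frac = summarize(scores)
print(f"mean={mean(scores):.1f} median={med:.2f} frac>=0.5={frac:.0%}")
```

The mean is pulled far below zero by the single outlier, while the median and the fraction at or above 0.5 are unchanged by how extreme it is.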
Training
| Field | Value |
|---|---|
| Data mode | OPS (no date clipping) |
| Trainable basins | 289 |
| Init from | TNFR-gridmet-A2A best.pt (warm-start) |
| Epochs | 1 OPS-specific (after A2A pretraining) |
| Batch size | 64 |
| Optimizer | AdamW, weight_decay=1e-2 |
| LR | 3e-4 with OneCycleLR (pct_start=0.3, cosine anneal) |
| Loss | MaskedHuberLoss (δ=1.0), NaN-safe in both target and prediction |
| Seed | 1 |
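The loss row in the table above can be read as a Huber loss averaged only over positions where both the prediction and the target are valid numbers. The following is a plain-Python sketch of that idea, not the repo's `MaskedHuberLoss`, which presumably operates on tensors and may differ in reduction or edge-case handling. The function name `masked_huber` is hypothetical.

```python
import math


def masked_huber(preds, targets, delta=1.0):
    """Mean Huber loss over pairs where neither value is NaN; 0.0 if none are valid."""
    total, count = 0.0, 0
    for p, t in zip(preds, targets):
        if math.isnan(p) or math.isnan(t):
            continue  # skip positions with a missing prediction or target
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err              # quadratic near zero
        else:
            total += delta * (err - 0.5 * delta)  # linear in the tails
        count += 1
    return total / count if count else 0.0


nan = float("nan")
loss = masked_huber([0.5, 3.0, nan, 1.0], [0.0, 0.0, 1.0, nan])
print(loss)  # 1.3125, the mean of 0.125 and 2.5 over the two valid pairs
```

Returning 0.0 for a fully masked batch is one possible convention, chosen here so a batch with no valid gauges does not propagate NaN into the gradient.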
The single-epoch fine-tune from A2A weights outperformed 27 epochs of OPS training from scratch in side-by-side tests, so this checkpoint represents the best-on-test configuration found during development.
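The learning-rate schedule listed in the training table can be sketched in a few lines. This mirrors the two-phase cosine form used by PyTorch's `OneCycleLR` and assumes that 3e-4 is the peak rate and that `div_factor` and `final_div_factor` were left at the PyTorch defaults (25 and 1e4); neither is stated in this card. `one_cycle_lr` is a hypothetical helper for illustration.

```python
import math


def one_cycle_lr(step, total_steps, max_lr=3e-4, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Two-phase one-cycle schedule with cosine interpolation in both phases."""
    initial_lr = max_lr / div_factor          # assumed PyTorch default ratio
    min_lr = initial_lr / final_div_factor    # assumed PyTorch default ratio
    peak_step = pct_start * total_steps - 1   # last step of the warm-up phase
    last_step = total_steps - 1

    def cos_anneal(start, end, pct):
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= peak_step:
        return cos_anneal(initial_lr, max_lr, step / peak_step)
    return cos_anneal(max_lr, min_lr, (step - peak_step) / (last_step - peak_step))


total = 1000  # illustrative number of optimizer steps in the epoch
for s in (0, 299, 650, 999):
    print(s, f"{one_cycle_lr(s, total):.3e}")
```

With 1000 steps the rate climbs from 1.2e-5 to 3e-4 over the first 30% of training, then decays along a cosine curve to about 1.2e-9 at the final step.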
Files
.
├── README.md                    # this file
├── config.json                  # all hyperparameters + test metrics
├── model.safetensors            # weights (state_dict)
├── locked_stats_gridmet.json    # per-basin normalization stats + static lookups
├── inference.py                 # standalone CLI inference
├── requirements.txt
└── model_code/                  # all Python needed to load the model
    ├── Models/Transformer.py
    ├── LayersTransformer/{attention,encoder,decoder}.py
    └── utils/{embed,masking,timefeatures}.py
Quick start
pip install -r requirements.txt
# Forecast 30 days of streamflow for a specific USGS site:
python inference.py --site 09380000 --csv path/to/timeseries.csv --out_csv pred.csv
The CSV must have columns date, pr, rmax, rmin, srad, tmmn, tmmx, vpd, vs, streamflow
and at least 365 rows of history. Output is a 30-row CSV with date and
streamflow_pred (in cfs, denormalized).
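Before running the CLI, it can help to check that an input file meets the two requirements above. The sketch below is a hypothetical pre-flight check (`check_timeseries_csv` is not part of `inference.py`); it only verifies column names and row count, not units, date continuity, or missing values.

```python
import csv
import io

REQUIRED = ["date", "pr", "rmax", "rmin", "srad", "tmmn", "tmmx", "vpd", "vs", "streamflow"]
MIN_ROWS = 365


def check_timeseries_csv(handle):
    """Return a list of problems found in the CSV; an empty list means it looks usable."""
    reader = csv.DictReader(handle)
    problems = []
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        problems.append(f"missing columns: {', '.join(missing)}")
    n_rows = sum(1 for _ in reader)  # counts data rows, excluding the header
    if n_rows < MIN_ROWS:
        problems.append(f"only {n_rows} rows, need at least {MIN_ROWS}")
    return problems


# Toy file: the "vs" column is absent and there are only 10 days of history.
header = ",".join(c for c in REQUIRED if c != "vs")
toy = io.StringIO(header + "\n" + "\n".join(["2020-01-01" + ",0" * 8] * 10) + "\n")
problems = check_timeseries_csv(toy)
print(problems)

# For a real file:
# with open("path/to/timeseries.csv", newline="") as f:
#     print(check_timeseries_csv(f))
```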
Loading from Python
import json
import sys

import pandas as pd
import torch
from huggingface_hub import snapshot_download
from safetensors.torch import load_file

# Download the repo snapshot and make the bundled model code importable.
repo_dir = snapshot_download("YOUR_HF_REPO/tnfr-gridmet-ops")
sys.path.insert(0, f"{repo_dir}/model_code")
from Models.Transformer import Model

# Read the hyperparameters and the weights.
with open(f"{repo_dir}/config.json") as f:
    cfg = json.load(f)
state = load_file(f"{repo_dir}/model.safetensors")
# ...build Model with cfg fields, then model.load_state_dict(state)
(Full glue is in inference.py; see load_checkpoint() and forecast().)
Citation
If you use this model in published work, please cite the upstream TADA project and the GRIDMET dataset:
- Abatzoglou, J. T. (2013). Development of gridded surface meteorological data for ecological applications and modelling. International Journal of Climatology.
- (TADA / Global_TADA citation TBD when published.)
Limitations
- Predictions for very small / ephemeral basins (training-period std < ~1 cfs) can be unstable.
- Trained on Arizona only; do not apply to basins outside this domain without retraining.
- Day-7+ horizons have minimal skill (median NSE of +0.092 at day 7 and +0.029 at day 30); don't rely on this model for medium-range planning.
- Single-seed model: the upstream TADA protocol uses 5-seed ensembling, which this checkpoint does not.
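The first limitation can be turned into a simple screening step before trusting a forecast. The sketch below flags basins whose training-period flow standard deviation falls under the roughly 1 cfs figure quoted in the list. The helper name, the use of the population standard deviation, and the toy flow series are assumptions for illustration only.

```python
from statistics import pstdev

STD_THRESHOLD_CFS = 1.0  # taken from the "~1 cfs" figure in the limitations list


def is_low_variability(train_flows_cfs, threshold=STD_THRESHOLD_CFS):
    """True when the training-period flow std falls below the threshold."""
    return pstdev(train_flows_cfs) < threshold


# Toy series: a mostly dry wash with a few small flow days, and a steadier river.
ephemeral = [0.0] * 360 + [2.0] * 5
perennial = [100.0 + 10.0 * (i % 7) for i in range(365)]

print(is_low_variability(ephemeral))  # True
print(is_low_variability(perennial))  # False
```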