---
license: mit
language:
- en
tags:
- geoscience
- sea_ice
- forecasting
- generative
---
# GenSIM – Generative Sea‑Ice Model
## Model description
GenSIM is a **generative AI‑based pan‑Arctic sea‑ice model** that predicts the evolution of key sea‑ice state variables (concentration, thickness, damage, drift components, and snow‑on‑ice thickness) over a 12‑hour window.
It leverages **censored flow‑matching** and a **scale‑aware transformer** architecture with domain decomposition, enabling fast, memory‑efficient forecasts that remain physically consistent (e.g., non‑negative thickness).
- **Model type:** Flow‑matching transformer with auto-regressive forecasting steps
- **Input:** Initial conditions for sea ice + atmospheric forcings (2 m temperature, specific humidity, 10 m wind components)
- **Output:** Predicted sea‑ice state at *t + 12 h* (or further steps via autoregression)
- **Resolution:** Curvilinear 1/4° mesh (~12 km)
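The autoregressive mode listed above can be pictured as feeding each 12‑hour prediction back in as the next initial condition. The following is a minimal sketch under stated assumptions: `step_fn` is a hypothetical stand‑in for one GenSIM forecast step, and `rollout` and `toy_step` are illustrative names that are not part of the GenSIM package.

```python
import numpy as np


def rollout(step_fn, state, forcings):
    """Chain 12-hour forecast steps autoregressively.

    step_fn  -- hypothetical callable (state, forcing) -> next state
    state    -- initial sea-ice state, array of shape (channels, y, x)
    forcings -- sequence of atmospheric forcing arrays, one per step
    """
    trajectory = [state]
    for forcing in forcings:
        # The previous prediction becomes the next initial condition.
        state = step_fn(state, forcing)
        trajectory.append(state)
    return np.stack(trajectory)


def toy_step(state, forcing):
    """Toy stand-in for the model: relax the state towards the forcing."""
    return 0.5 * state + 0.5 * forcing


initial_state = np.ones((6, 4, 4))
toy_forcings = [np.zeros((6, 4, 4)) for _ in range(3)]
trajectory = rollout(toy_step, initial_state, toy_forcings)
print(trajectory.shape)  # initial state plus three 12-hour steps
```

Because every step consumes the output of the previous one, a roll‑out of *n* steps needs *n* forcing fields and returns *n + 1* states.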
## Architecture
Key components:
| Component | Details |
|-----------|---------|
| **Embedding** | Linear patching (2×2), pseudo‑time, resolution, and augmentation embeddings |
| **Transformer** | 8 blocks of self‑attention with learnable localisation, followed by MLPs |
| **Domain decomposition** | Overlapping sub‑domains processed in parallel, linear scaling with grid size |
| **Censored flow‑matching** | Enforces physical bounds (e.g., non‑negative thickness) via censored Gaussian distributions |
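To illustrate what censoring means for a bounded variable such as thickness, the sketch below draws from a Gaussian and moves every out‑of‑bounds draw onto the bound. It only shows the general idea of a censored Gaussian; it is not the GenSIM training objective or sampler, and the function name and the numbers used are assumptions made for the example.

```python
import numpy as np


def sample_censored_gaussian(mean, std, lower=0.0, upper=np.inf, rng=None):
    """Draw from a Gaussian and censor the draw to [lower, upper].

    Values outside the bounds are moved onto the bound, so probability
    mass piles up exactly at the physical limit (e.g. zero thickness).
    """
    rng = np.random.default_rng() if rng is None else rng
    raw = rng.normal(loc=mean, scale=std)
    return np.clip(raw, lower, upper)


rng = np.random.default_rng(seed=0)
# Hypothetical predicted mean thickness (m) near the ice edge.
mean_thickness = np.full(10_000, 0.05)
samples = sample_censored_gaussian(mean_thickness, std=0.1, rng=rng)

print(samples.min())             # never below the lower bound
print((samples == 0.0).mean())   # fraction of samples sitting exactly at zero
```

Unlike truncation, which would redraw or renormalise, censoring keeps a finite probability of exactly zero, which is what allows ice‑free cells to be represented.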
## Training data
- **Source:** 20 years of the neXtSIM‑OPA sea‑ice–ocean simulation (pan‑Arctic domain)
- **Variables:** Sea‑ice state (`sit`, `sic`, `sid`, `siu`, `siv`, `snt`) and atmospheric forcings (`t2m`, `q2m`, `u10`, `v10`)
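As a rough sketch of how these variables could be arranged into model inputs, the snippet below stacks named 2‑D fields into channel arrays. The variable names come from the list above; the descriptions are inferred from the model description, and the channel order, grid size, and the `stack_fields` helper are assumptions for illustration only.

```python
import numpy as np

# Names taken from the list above; the descriptions are inferred from the
# model description and the channel order is an assumption.
STATE_VARIABLES = {
    "sit": "sea-ice thickness",
    "sic": "sea-ice concentration",
    "sid": "sea-ice damage",
    "siu": "sea-ice drift, first component",
    "siv": "sea-ice drift, second component",
    "snt": "snow-on-ice thickness",
}
FORCING_VARIABLES = {
    "t2m": "2 m temperature",
    "q2m": "specific humidity",
    "u10": "10 m wind, first component",
    "v10": "10 m wind, second component",
}


def stack_fields(fields, names):
    """Stack named 2-D fields into one (channels, y, x) array."""
    return np.stack([fields[name] for name in names], axis=0)


# Toy fields on a tiny grid, standing in for real data.
grid_shape = (8, 8)
fields = {name: np.zeros(grid_shape) for name in [*STATE_VARIABLES, *FORCING_VARIABLES]}
state = stack_fields(fields, STATE_VARIABLES)
forcing = stack_fields(fields, FORCING_VARIABLES)
print(state.shape, forcing.shape)
```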
## Training procedure
- **Framework:** PyTorch Lightning with Hydra configuration.
- **Optimizer:** AdamW.
- **Training length:** 1 000 000 steps.
## Evaluation
- **Metrics:** RMSE, physical consistency, multi‑decadal climate trend reproduction.
- **Benchmarks:** Compared against a deterministic baseline model, GenSIM shows better skill in the RMSE of the ensemble mean, a better representation of the marginal ice zone, and matching energies at all scales.
- **Ensemble capability:** Low memory footprint (< 4 GB) enables large ensembles and decadal simulations on a single GPU.
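The RMSE of the ensemble mean mentioned above can be computed along the following lines. This is a generic sketch on synthetic data, not the evaluation code behind the reported results; the array shapes, noise level, and function name are assumptions.

```python
import numpy as np


def ensemble_mean_rmse(ensemble, truth):
    """RMSE between the ensemble mean and the reference field.

    ensemble -- array of shape (members, y, x)
    truth    -- array of shape (y, x)
    """
    ensemble_mean = ensemble.mean(axis=0)
    return float(np.sqrt(np.mean((ensemble_mean - truth) ** 2)))


rng = np.random.default_rng(seed=0)
truth = rng.uniform(0.0, 3.0, size=(16, 16))       # toy thickness field (m)
noise = rng.normal(scale=0.5, size=(32, 16, 16))   # toy member errors
ensemble = truth + noise

single_member_rmse = ensemble_mean_rmse(ensemble[:1], truth)
mean_rmse = ensemble_mean_rmse(ensemble, truth)
print(single_member_rmse, mean_rmse)  # averaging members reduces the error
```

With independent member errors, as in this toy setup, averaging over members lowers the RMSE relative to a single member, which is why the ensemble mean is a natural point of comparison with a deterministic baseline.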
## Intended uses & limitations
**Intended uses**
- Short‑term (12 h) sea‑ice forecasts for research and operational settings.
- Generating ensembles for uncertainty quantification.
- Climate‑scale analysis via auto‑regressive roll‑outs.
**Limitations**
- Model is trained on a specific climate regime (historical neXtSIM‑OPA); extrapolation to vastly different forcing scenarios may degrade performance.
- Physical realism is limited to variables present in the training data; oceanic processes are inferred implicitly, not modeled explicitly.
- Predictions are only as reliable as the atmospheric forcing inputs.
## How to load the model
The repository provides two checkpoint files in **safetensors** format:
- `model_weights.safetensors` – non‑EMA (standard) weights
- `model_weights_ema.safetensors` – Exponential Moving Average (EMA) weights
Both can be loaded into the `Transformer` network class from `gensim.network`:
```python
import torch
from safetensors import safe_open

from gensim.network import Transformer

# Initialise the model (use the same hyper-parameters as in config.yaml)
model = Transformer(
    n_input=...,     # e.g., number of input channels (see config.yaml)
    n_output=...,    # e.g., number of output channels
    n_features=...,  # model dimension
    n_blocks=...,    # transformer depth
    # any other kwargs required by the class
)

# Choose which checkpoint to load
checkpoint_path = "model_weights_ema.safetensors"  # or "model_weights.safetensors"

# Read the tensors onto the GPU if one is available, otherwise onto the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load all weights from the safetensors file while it is open
with safe_open(checkpoint_path, framework="pt", device=device) as f:
    saved_weights = {key: f.get_tensor(key) for key in f.keys()}

# Get the network state dict and note which of its keys the file lacks
network_state_dict = model.state_dict()
missing_keys = [key for key in network_state_dict if key not in saved_weights]

# Overwrite the network state dict with the saved weights and load it
network_state_dict.update(saved_weights)
model.load_state_dict(network_state_dict)

# Helpful message if keys are missing
if missing_keys:
    print("Missing keys in loaded weights:", missing_keys)

# Set to eval mode
model.eval()
```
> **Note:** The exact constructor arguments (`n_input`, `n_output`, `n_features`, `n_blocks`, …) can be found in `config.yaml`. Adjust them to match the checkpoint you load.
## Installation
```bash
git clone https://github.com/cerea-daml/gensim.git
cd gensim
conda env create -f environment.yml
conda activate gensim
pip install -e .
```
Verify installation:
```bash
python -c "import gensim; print(gensim.__version__)"
```
## Usage example
```bash
# Train (or re‑train) with default config
python train.py
```
## License
The code is released under the **MIT License** (see `LICENSE`).
The model weights are provided under the same license unless otherwise specified.
## Citation
If you use GenSIM or the provided weights, please cite the following preprint and this repository:
```bibtex
@article{Finn_GenSIM_2025,
author={Finn, Tobias Sebastian and Bocquet, Marc and Rampal, Pierre and Durand, Charlotte and Porro, Flavia and Farchi, Alban and Carrassi, Alberto},
title={Generative AI models enable efficient and physically consistent sea-ice simulations},
url={http://arxiv.org/abs/2508.14984},
DOI={10.48550/arXiv.2508.14984},
note={arXiv:2508.14984 [physics]},
number={arXiv:2508.14984},
publisher={arXiv},
year={2025},
month=aug
}
```