	---
	license: mit
	language:
	- en
	tags:
	- geoscience
	- sea_ice
	- forecasting
	- generative
	---
	# GenSIM – Generative Sea‑Ice Model

	## Model description
	GenSIM is a generative, pan‑Arctic sea‑ice model that predicts how key sea‑ice state variables (concentration, thickness, damage, drift components, and snow‑on‑ice thickness) evolve over a 12‑hour window.
	It leverages censored flow‑matching and a scale‑aware transformer architecture with domain decomposition, enabling fast, memory‑efficient forecasts that remain physically consistent (e.g., non‑negative thickness).
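To illustrate what censoring means here, the following minimal sketch draws Gaussian samples and left-censors them at a lower bound, so that any value below the bound is set to the bound itself. It only illustrates a censored Gaussian distribution; it is not GenSIM's training objective or sampler, and the function name `sample_censored_gaussian` and its parameters are assumptions made for this example.

```python
import random


def sample_censored_gaussian(mean, std, lower=0.0, n=5, seed=0):
    """Draw n Gaussian samples and left-censor them at `lower`.

    Hypothetical helper for illustration only, not part of GenSIM.
    """
    rng = random.Random(seed)
    # Values below the bound collapse onto the bound instead of being discarded
    return [max(lower, rng.gauss(mean, std)) for _ in range(n)]


# A thin-ice case: the mean is close to zero, so many raw draws would be negative
samples = sample_censored_gaussian(mean=0.05, std=0.2, n=1000)
print(min(samples) >= 0.0)
```

Censoring places a finite share of the samples exactly on the bound, which is how a quantity such as thickness can be exactly zero over open water rather than merely small.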

	- Model type: Flow‑matching transformer with autoregressive forecasting steps
	- Input: Initial conditions for sea ice + atmospheric forcings (2 m temperature, specific humidity, 10 m wind components)
	- Output: Predicted sea‑ice state at t + 12 h (or further steps via autoregression)
	- Resolution: Curvilinear 1/4° mesh (~12 km)
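The autoregressive use mentioned in the list above can be sketched as a loop that feeds each 12 h prediction back in as the next initial condition. This is a generic illustration: `rollout`, `step`, and `toy_step` are hypothetical names, and `step` merely stands in for one forecast call, not for GenSIM's actual interface.

```python
from typing import Callable, List, TypeVar

State = TypeVar("State")
Forcing = TypeVar("Forcing")


def rollout(step: Callable[[State, Forcing], State],
            initial_state: State,
            forcings: List[Forcing]) -> List[State]:
    """Apply a one-step forecast repeatedly, feeding each output back as input.

    `step` stands in for a single 12 h forecast; it is a placeholder, not
    GenSIM's actual interface.
    """
    states = [initial_state]
    for forcing in forcings:
        states.append(step(states[-1], forcing))
    return states


# Toy stand-in: "thickness" changes by the forcing value, never below zero
def toy_step(thickness: float, forcing: float) -> float:
    return max(0.0, thickness + forcing)


trajectory = rollout(toy_step, 1.0, [-0.5, -0.75, 0.25, 0.25])
print(trajectory)
```

Four forcing entries give four steps, which corresponds to a 48 h lead time when each step covers 12 h.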

	## Architecture
	Key components:

	| Component | Details |
	|-----------|---------|
	| Embedding | Linear patching (2×2), pseudo‑time, resolution, and augmentation embeddings |
	| Transformer | 8 blocks of self‑attention with learnable localisation, followed by MLPs |
	| Domain decomposition | Overlapping sub‑domains processed in parallel, linear scaling with grid size |
	| Censored flow‑matching | Enforces physical bounds (e.g., non‑negative thickness) via censored Gaussian distributions |
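As a rough illustration of the domain decomposition row, the sketch below splits one grid axis into fixed-size tiles that overlap their neighbours. The helper `overlapping_tiles`, the tile size, and the overlap width are assumptions chosen for the example; the decomposition actually used by GenSIM may differ.

```python
def overlapping_tiles(n_points, tile_size, overlap):
    """Return (start, stop) index pairs covering range(n_points).

    Neighbouring tiles share at least `overlap` points. Hypothetical helper
    for illustration only; GenSIM's own decomposition may differ.
    """
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be in [0, tile_size)")
    if n_points <= tile_size:
        return [(0, n_points)]
    stride = tile_size - overlap
    tiles = []
    start = 0
    while start + tile_size < n_points:
        tiles.append((start, start + tile_size))
        start += stride
    # Shift the last tile back so that it ends exactly at the domain edge
    tiles.append((n_points - tile_size, n_points))
    return tiles


tiles = overlapping_tiles(n_points=20, tile_size=8, overlap=2)
print(tiles)
```

With a fixed tile size, the number of tiles grows in proportion to the number of grid points, which is one way to read the linear scaling noted in the table.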

	## Training data
	- Source: 20 years of the neXtSIM‑OPA sea‑ice–ocean simulation (pan‑Arctic domain)
	- Variables: Sea‑ice state (`sit`, `sic`, `sid`, `siu`, `siv`, `snt`) and atmospheric forcings (`t2m`, `q2m`, `u10`, `v10`)

	## Training procedure
	- Framework: PyTorch Lightning with Hydra configuration.
	- Optimizer: AdamW.
	- Training length: 1 000 000 steps.

	## Evaluation
	- Metrics: RMSE, physical consistency, multi‑decadal climate trend reproduction.
	- Benchmarks: Compared with a deterministic baseline model, GenSIM shows better skill in the RMSE of the ensemble mean, a better representation of the marginal ice zone, and matching energies at all scales.
	- Ensemble capability: Low memory footprint (< 4 GB) enables large ensembles and decadal simulations on a single GPU.
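For readers unfamiliar with the "RMSE of the ensemble mean" metric listed above, here is a minimal sketch of how it can be computed: average the members point by point, then score that average against the reference. The numbers are toy values, not model output, and the function names are chosen for this example only.

```python
import math


def ensemble_mean(members):
    """Average the ensemble members point by point."""
    n_members = len(members)
    return [sum(values) / n_members for values in zip(*members)]


def rmse(prediction, truth):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - t) ** 2 for p, t in zip(prediction, truth)]
    return math.sqrt(sum(squared) / len(squared))


# Toy numbers, not model output: three members at four grid points
members = [
    [1.0, 2.0, 0.0, 0.5],
    [1.5, 1.0, 0.0, 0.5],
    [0.5, 3.0, 0.0, 0.5],
]
truth = [1.0, 2.0, 0.0, 1.5]

mean = ensemble_mean(members)
score = rmse(mean, truth)
print(mean, score)
```

In this toy case the member errors at the first two points cancel in the mean, so only the last grid point contributes to the score.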

	## Intended uses & limitations
	Intended uses
	- Short‑term (12 h) sea‑ice forecasts for research and operational settings.
	- Generating ensembles for uncertainty quantification.
	- Climate‑scale analysis via autoregressive roll‑outs.

	Limitations
	- Model is trained on a specific climate regime (historical neXtSIM‑OPA); extrapolation to vastly different forcing scenarios may degrade performance.
	- Physical realism is limited to variables present in the training data; oceanic processes are inferred implicitly, not modeled explicitly.
	- Predictions are only as reliable as the atmospheric forcing inputs.

	## How to load the model
	The repository provides two checkpoint files in safetensors format:

	- `model_weights.safetensors` – non‑EMA (standard) weights
	- `model_weights_ema.safetensors` – Exponential Moving Average (EMA) weights

	Both can be loaded into the `Transformer` network class from `gensim.network`:

	```python
	import torch
	from safetensors import safe_open
	from gensim.network import Transformer

	# Initialise the model (use the same hyper-parameters as in config.yaml)
	model = Transformer(
	    n_input=...,     # e.g., number of input channels (see config.yaml)
	    n_output=...,    # e.g., number of output channels
	    n_features=...,  # model dimension
	    n_blocks=...,    # transformer depth
	    # any other kwargs required by the class
	)

	# Choose which checkpoint to load
	checkpoint_path = "model_weights_ema.safetensors"  # or "model_weights.safetensors"

	# Fall back to the CPU when no GPU is available
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# Read every tensor stored in the safetensors file
	with safe_open(checkpoint_path, framework="pt", device=device) as f:
	    saved_weights = {key: f.get_tensor(key) for key in f.keys()}

	# Start from the network's own state dict and overwrite it with the saved weights
	network_state_dict = model.state_dict()
	network_state_dict.update(saved_weights)

	# Load the updated state dict into the network
	model.load_state_dict(network_state_dict)

	# Helpful message if the checkpoint does not cover every network parameter
	missing_keys = [key for key in network_state_dict if key not in saved_weights]
	if missing_keys:
	    print("Missing keys in loaded weights:", missing_keys)

	# Set to eval mode
	model.eval()
	```

	> Note: The exact constructor arguments (`n_input`, `n_output`, `n_features`, `n_blocks`, …) can be found in `config.yaml`. Adjust them to match the checkpoint you load.
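
Because a checkpoint and a freshly constructed network can disagree in both directions, it can help to compare the two sets of key names before loading. The sketch below works on plain lists of names; the helper `compare_keys` and the example key names are made up for illustration and do not reflect the real parameter names in the GenSIM checkpoints.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing:    the network expects them, the checkpoint lacks them
    unexpected: the checkpoint holds them, the network has no such parameter
    """
    model_set = set(model_keys)
    checkpoint_set = set(checkpoint_keys)
    missing = sorted(model_set - checkpoint_set)
    unexpected = sorted(checkpoint_set - model_set)
    return missing, unexpected


# Made-up key names for illustration
model_keys = ["embed.weight", "blocks.0.attn.weight", "head.weight"]
checkpoint_keys = ["embed.weight", "blocks.0.attn.weight", "head.bias"]

missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real network, the names from `model.state_dict().keys()` and from the `keys()` method of an opened safetensors file could be passed in directly. Unexpected keys usually point to constructor arguments that do not match the checkpoint.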

	## Installation
	```bash
	git clone https://github.com/cerea-daml/gensim.git
	cd gensim
	conda env create -f environment.yml
	conda activate gensim
	pip install -e .
	```

	Verify installation:
	```bash
	python -c "import gensim; print(gensim.__version__)"
	```

	## Usage example
	```bash
	# Train (or re‑train) with default config
	python train.py
	```

	## License
	The code is released under the MIT License (see `LICENSE`).
	The model weights are provided under the same license unless otherwise specified.

	## Citation
	If you use GenSIM or the provided weights, please cite the following preprint and this repository:

	```bibtex
	@article{Finn_GenSIM_2025,
	author={Finn, Tobias Sebastian and Bocquet, Marc and Rampal, Pierre and Durand, Charlotte and Porro, Flavia and Farchi, Alban and Carrassi, Alberto},
	title={Generative AI models enable efficient and physically consistent sea-ice simulations},
	url={http://arxiv.org/abs/2508.14984},
	DOI={10.48550/arXiv.2508.14984},
	note={arXiv:2508.14984 [physics]},
	number={arXiv:2508.14984},
	publisher={arXiv},
	year={2025},
	month=aug
	}
	```