	---
	license: apache-2.0
	paper: https://arxiv.org/abs/2510.12670
	homepage: https://github.com/IBM/TerraCodec
	---

	# TerraCodec TT – Sentinel-2 L1C

	Neural Compression for Earth Observation

	[![arXiv](https://img.shields.io/badge/arXiv-2510.12670-b31b1b)](https://arxiv.org/abs/2510.12670)
	[![GitHub](https://img.shields.io/badge/GitHub-IBM%2FTerraCodec-black?logo=github)](https://github.com/IBM/TerraCodec)
	[![PyPI](https://img.shields.io/badge/PyPI-terracodec-blue?logo=pypi)](https://pypi.org/project/terracodec/)

	TerraCodec (TEC) is a family of pretrained neural compression codecs for multispectral Sentinel-2 satellite imagery. The models compress optical Earth observation data using learned latent representations and entropy coding.

	This repository provides Temporal Transformer (TEC-TT) models trained on Sentinel-2 L1C imagery. The main TerraCodec models are released for Sentinel-2 L2A data; the L1C variants were used for the declouding experiments in the paper.

	![Reconstructions](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/reconstructions.jpg)

	---

	# Model Architecture

	This repository contains the TEC-TT (Temporal Transformer) variants of TerraCodec for Sentinel-2 L1C (S2L1C) data.

	![assets/TEC_TT_architecture.png](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/TEC_TT_architecture.png)

	TEC-TT extends the TerraCodec image codecs by modeling temporal dependencies across satellite image sequences. Each frame is first encoded using an ELIC-style CNN encoder–decoder to obtain latent representations. A temporal transformer then predicts the probability distribution of the current frame’s latents conditioned on previously encoded frames.

	By exploiting redundancy across seasonal observations, TEC-TT achieves improved compression efficiency for multi-temporal satellite imagery.
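
To make the idea of conditional entropy coding concrete, the toy sketch below estimates the bit cost of quantized latents under a discretized Gaussian, once with a fixed zero-mean prior and once with a mean predicted from the previous frame. It is not the TEC-TT implementation: the Gaussian model, the `bits_for_latents` helper, the use of the previous frame as the predicted mean, and the sample values are all illustrative assumptions.

```python
import math


def gaussian_cdf(x: float, mean: float, scale: float) -> float:
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def bits_for_latents(latents, means, scale=1.0):
    """Estimated bits to code integer latents under a discretized Gaussian.

    Each integer y gets the probability mass of the interval
    (y - 0.5, y + 0.5]; the ideal code length is -log2 of that mass.
    """
    total = 0.0
    for y, mu in zip(latents, means):
        mass = gaussian_cdf(y + 0.5, mu, scale) - gaussian_cdf(y - 0.5, mu, scale)
        total += -math.log2(max(mass, 1e-12))  # clamp to avoid log(0)
    return total


# Hypothetical quantized latents of two consecutive frames (illustrative values).
previous_frame = [4, -3, 7, 0, 5, -6]
current_frame = [5, -3, 6, 1, 5, -5]

# Without temporal context: a fixed zero-mean prior.
bits_independent = bits_for_latents(current_frame, [0.0] * len(current_frame))

# With temporal context: the previous frame serves as the predicted mean.
bits_conditional = bits_for_latents(current_frame, previous_frame)

print(f"independent: {bits_independent:.1f} bits, conditional: {bits_conditional:.1f} bits")
```

Because consecutive frames are similar, the conditional model assigns higher probability to the observed latents and therefore needs fewer bits. In TEC-TT the prediction comes from a temporal transformer rather than a simple copy of the previous frame.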

	See the [paper](https://arxiv.org/abs/2510.12670) for additional architectural and training details.

	---

	# Input Format

	| Codec type | Expected shape | Example |
	|------------|----------------|---------|
	| Temporal codecs | `[B, T, C, H, W]` | `[1, 4, 13, 256, 256]` |

	- Inputs use 13 Sentinel-2 L1C spectral bands
	- Recommended spatial size: 256 × 256
	- Models were trained on four seasonal frames, but can process any number of timesteps during inference (higher T increases compute)
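
As a minimal sketch of assembling an input in this layout, the snippet below stacks per-timestep frames into a `[B, T, C, H, W]` tensor. The `stack_frames` helper is hypothetical (it is not part of the `terracodec` package), and random tensors stand in for real Sentinel-2 L1C frames.

```python
import torch

NUM_BANDS = 13  # Sentinel-2 L1C spectral bands


def stack_frames(frames):
    """Stack a list of [C, H, W] frames into one [1, T, C, H, W] batch."""
    for frame in frames:
        if frame.ndim != 3 or frame.shape[0] != NUM_BANDS:
            raise ValueError(f"expected [13, H, W], got {tuple(frame.shape)}")
    sequence = torch.stack(frames, dim=0)  # [T, C, H, W]
    return sequence.unsqueeze(0)           # [1, T, C, H, W]


# Placeholder data: four seasonal frames of 256 x 256 pixels.
frames = [torch.rand(NUM_BANDS, 256, 256) for _ in range(4)]
inputs = stack_frames(frames)
print(inputs.shape)  # torch.Size([1, 4, 13, 256, 256])
```

Passing more or fewer frames changes only `T`; the band and spatial dimensions stay the same.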

	---

	# Normalization

	Models were trained on Sentinel-2 L1C imagery.

	Inputs should be standardized per spectral band using dataset statistics:

	```python
	import torch

	# Per-band mean and standard deviation (one value per spectral band, 13 in total)
	mean = torch.tensor([1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071, 2392.832, 2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798])

	std = torch.tensor([786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98, 1273.505, 1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036])
	```
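
One way to apply these statistics to a `[B, T, C, H, W]` tensor is to reshape them so they broadcast along the band dimension. The sketch below is an illustration, not code from the `terracodec` package: `standardize` and `destandardize` are hypothetical helper names, and the random tensor is placeholder data.

```python
import torch

# Per-band statistics copied from the values listed above.
mean = torch.tensor([1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071, 2392.832,
                     2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798])
std = torch.tensor([786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98, 1273.505,
                    1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036])


def standardize(x: torch.Tensor) -> torch.Tensor:
    """Standardize a [B, T, C, H, W] tensor per spectral band."""
    return (x - mean.view(1, 1, -1, 1, 1)) / std.view(1, 1, -1, 1, 1)


def destandardize(x: torch.Tensor) -> torch.Tensor:
    """Invert `standardize`, mapping values back to the original scale."""
    return x * std.view(1, 1, -1, 1, 1) + mean.view(1, 1, -1, 1, 1)


raw = torch.rand(1, 4, 13, 256, 256) * 4000.0  # placeholder values, not real imagery
inputs = standardize(raw)
restored = destandardize(inputs)
```

If the model's reconstructions live in the same standardized space as its inputs (an assumption here), `destandardize` maps them back to the original value range.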

	---

	# Usage

	Install TerraCodec:

	```bash
	pip install terracodec
	```

	Load a pretrained model and run it:

	```python
	import torch
	from terracodec import terracodec_v1_tt_s2l1c

	model = terracodec_v1_tt_s2l1c(
	    pretrained=True,
	    compression=5,
	)

	# Placeholder input of shape [B, T, C, H, W]; replace with standardized Sentinel-2 L1C data
	inputs = torch.randn(1, 4, 13, 256, 256)

	# Fast reconstruction (no bitstream)
	reconstruction = model(inputs)

	# True compression
	compressed = model.compress(inputs)
	reconstruction = model.decompress(**compressed)
	```
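
A common way to judge the result is to relate the size of the bitstream to the size of the input. The sketch below assumes you already know the total number of bytes in the compressed representation; how to obtain it depends on the structure returned by `model.compress`, which is not specified here. It also assumes the raw data would be stored as 16-bit integers per band value. `compression_stats` is a hypothetical helper, not part of the `terracodec` package, and the byte count in the example is made up.

```python
def compression_stats(num_bytes: int, shape, raw_bits_per_value: int = 16):
    """Return bits per pixel and the ratio against raw storage.

    shape is the input shape [B, T, C, H, W]; num_bytes is the total size of
    the compressed bitstream for that input.
    """
    b, t, c, h, w = shape
    num_pixels = b * t * h * w  # spatial positions across all frames
    bpp = 8 * num_bytes / num_pixels
    raw_bytes = num_pixels * c * raw_bits_per_value / 8
    return {"bpp": bpp, "compression_ratio": raw_bytes / num_bytes}


# Hypothetical example: a [1, 4, 13, 256, 256] input compressed to 65,536 bytes.
stats = compression_stats(65_536, (1, 4, 13, 256, 256))
print(stats)  # {'bpp': 2.0, 'compression_ratio': 104.0}
```

Here bits per pixel counts all 13 bands of one spatial position together; divide by the number of bands for a per-band figure.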

	# Feedback
	If you have questions, encounter issues or want to discuss improvements:
	- open an issue or discussion on GitHub
	- or contribute directly to the repository

	GitHub repository: https://github.com/IBM/TerraCodec

	# Citation
	If you use TerraCodec in your research, please cite:

	```bibtex
	@article{terracodec2025,
	  title = {TerraCodec: Compressing Optical Earth Observation Data},
	  author = {Costa Watanabe, Julen and Wittmann, Isabelle and Blumenstiel, Benedikt and Schindler, Konrad},
	  journal = {arXiv preprint arXiv:2510.12670},
	  year = {2025}
	}
	```