---
license: apache-2.0
paper: https://arxiv.org/abs/2510.12670
homepage: https://github.com/IBM/TerraCodec
---
# TerraCodec TT – Sentinel-2 L1C
**Neural Compression for Earth Observation**
[![arXiv](https://img.shields.io/badge/arXiv-2510.12670-b31b1b)](https://arxiv.org/abs/2510.12670)
[![GitHub](https://img.shields.io/badge/GitHub-IBM%2FTerraCodec-black?logo=github)](https://github.com/IBM/TerraCodec)
[![PyPI](https://img.shields.io/badge/PyPI-terracodec-blue?logo=pypi)](https://pypi.org/project/terracodec/)
TerraCodec (TEC) is a family of pretrained neural compression codecs for **multispectral Sentinel-2 satellite imagery**. The models compress optical Earth observation data using learned latent representations and entropy coding.
This repository provides **Temporal Transformer (TEC-TT) models trained on Sentinel-2 L1C imagery**. The main TerraCodec models are released for Sentinel-2 L2A data; the L1C variants were used for the declouding experiments in the paper.
![Reconstructions](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/reconstructions.jpg)
---
# Model Architecture
This repository contains the **TEC-TT (Temporal Transformer)** variants of TerraCodec for **S2L1C data**.
![assets/TEC_TT_architecture.png](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/TEC_TT_architecture.png)
TEC-TT extends the TerraCodec image codecs by modeling temporal dependencies across satellite image sequences. Each frame is first encoded using an ELIC-style CNN encoder–decoder to obtain latent representations. A temporal transformer then predicts the probability distribution of the current frame’s latents conditioned on previously encoded frames.
By exploiting redundancy across seasonal observations, TEC-TT compresses multi-temporal satellite imagery more efficiently than coding each frame independently.
See the [paper](https://arxiv.org/abs/2510.12670) for additional architectural and training details.
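To illustrate why a better probability prediction lowers the bitrate, the sketch below computes the ideal code length of a single quantized latent value under a discretized Gaussian. The Gaussian form, the helper names, and all numeric values are illustrative assumptions and are not taken from the TerraCodec implementation; the point is only that a sharper, better-centered prediction (as temporal conditioning is meant to provide) costs fewer bits.

```python
import math


def gaussian_cdf(x: float, mean: float, scale: float) -> float:
    """Cumulative distribution function of a Gaussian with the given mean and scale."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def bits_for_symbol(y: int, mean: float, scale: float) -> float:
    """Ideal code length (in bits) of the integer latent y.

    The probability mass of y is the Gaussian mass on [y - 0.5, y + 0.5],
    a common choice in learned compression (assumed here, not confirmed for TEC-TT).
    """
    p = gaussian_cdf(y + 0.5, mean, scale) - gaussian_cdf(y - 0.5, mean, scale)
    p = max(p, 1e-9)  # guard against log2(0) for very unlikely symbols
    return -math.log2(p)


latent = 3  # hypothetical quantized latent value of the current frame

# Broad prediction, as one might get without temporal context (made-up parameters).
bits_unconditioned = bits_for_symbol(latent, mean=0.0, scale=4.0)

# Sharper prediction centered near the true value, as temporal context could provide.
bits_conditioned = bits_for_symbol(latent, mean=2.8, scale=0.5)

print(f"without context: {bits_unconditioned:.2f} bits")
print(f"with context:    {bits_conditioned:.2f} bits")
```

An entropy coder approaches these code lengths in practice, so the total bitstream size tracks how well the predicted distributions match the actual latents.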
---
# Input Format
| Codec type | Expected shape | Example |
|------------|----------------|---------|
| Temporal codecs | `[B, T, C, H, W]` | `[1, 4, 13, 256, 256]` |
- Inputs use **13 Sentinel-2 L1C spectral bands**
- Recommended spatial size: **256 × 256**
- Models were trained on four seasonal frames, but can process any number of timesteps during inference (higher T increases compute)
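For scenes larger than the recommended 256 × 256 size, one option is to split them into tiles before compression. The sketch below only plans the tile positions; `tile_origins` and `plan_tiles` are hypothetical helpers, not part of the `terracodec` package, and whether the models accept larger inputs directly is not covered here.

```python
def tile_origins(size: int, tile: int = 256) -> list[int]:
    """Return start offsets of tiles of length `tile` that cover [0, size).

    If `size` is not a multiple of `tile`, the last tile is shifted back so it
    ends exactly at `size`; it then overlaps its neighbour instead of needing padding.
    """
    if size < tile:
        raise ValueError(f"size {size} is smaller than tile {tile}; pad the input first")
    starts = list(range(0, size - tile + 1, tile))
    if starts[-1] + tile < size:
        starts.append(size - tile)
    return starts


def plan_tiles(height: int, width: int, tile: int = 256) -> list[tuple[int, int]]:
    """Return the (row, col) top-left corner of every tile in a height x width scene."""
    return [(r, c) for r in tile_origins(height, tile) for c in tile_origins(width, tile)]


corners = plan_tiles(512, 600)
```

Each corner `(r, c)` then selects a crop such as `x[..., r:r + 256, c:c + 256]` from a `[B, T, C, H, W]` tensor, which keeps the batch, time, and band dimensions intact.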
---
# Normalization
Models were trained on **Sentinel-2 L1C imagery**.
Inputs should be standardized per spectral band using dataset statistics:
```python
import torch

# Per-band dataset statistics for the 13 Sentinel-2 L1C bands
mean = torch.tensor([1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071, 2392.832, 2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798])
std = torch.tensor([786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98, 1273.505, 1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036])
```
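A minimal sketch of how these statistics could be applied to a `[B, T, C, H, W]` tensor, relying on PyTorch broadcasting along the band dimension. The `standardize` and `destandardize` helpers are hypothetical names, the random tensor is placeholder data, and mapping reconstructions back to the original value range is an assumption about a typical workflow rather than documented library behavior.

```python
import torch

# Per-band statistics copied from the values listed above (13 Sentinel-2 L1C bands).
mean = torch.tensor([1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071, 2392.832,
                     2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798])
std = torch.tensor([786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98, 1273.505,
                    1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036])


def standardize(x: torch.Tensor) -> torch.Tensor:
    """Standardize a [B, T, C, H, W] tensor per spectral band (C must be 13)."""
    return (x - mean.view(1, 1, -1, 1, 1)) / std.view(1, 1, -1, 1, 1)


def destandardize(x: torch.Tensor) -> torch.Tensor:
    """Invert `standardize`, mapping values back to the original range."""
    return x * std.view(1, 1, -1, 1, 1) + mean.view(1, 1, -1, 1, 1)


# Placeholder data standing in for real Sentinel-2 L1C values.
raw = torch.rand(1, 4, 13, 256, 256) * 10000.0
inputs = standardize(raw)
restored = destandardize(inputs)
```

Reshaping the statistics to `[1, 1, 13, 1, 1]` lets the same code handle any batch size, number of timesteps, and spatial size.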
---
# Usage
Install TerraCodec:
```bash
pip install terracodec
```
Load pretrained models:
```python
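import torch
from terracodec import terracodec_v1_tt_s2l1c

# Load the pretrained TEC-TT model for Sentinel-2 L1C
model = terracodec_v1_tt_s2l1c(
    pretrained=True,
    compression=5,
)

# Placeholder batch with shape [B, T, C, H, W]; replace with standardized Sentinel-2 L1C data
inputs = torch.randn(1, 4, 13, 256, 256)

# Fast reconstruction (no bitstream)
reconstruction = model(inputs)

# True compression
compressed = model.compress(inputs)
reconstruction = model.decompress(**compressed)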
```
# Feedback
If you have questions, encounter issues or want to discuss improvements:
- open an issue or discussion on GitHub
- or contribute directly to the repository
GitHub repository: https://github.com/IBM/TerraCodec
# Citation
If you use TerraCodec in your research, please cite:
```
@article{terracodec2025,
  title = {TerraCodec: Compressing Optical Earth Observation Data},
  author = {Costa Watanabe, Julen and Wittmann, Isabelle and Blumenstiel, Benedikt and Schindler, Konrad},
  journal = {arXiv preprint arXiv:2510.12670},
  year = {2025}
}
```