---
license: apache-2.0
paper: https://arxiv.org/abs/2510.12670
homepage: https://github.com/IBM/TerraCodec
---

# TerraCodec TT – Sentinel-2 L1C

**Neural Compression for Earth Observation**

[![arXiv](https://img.shields.io/badge/arXiv-2510.12670-b31b1b)](https://arxiv.org/abs/2510.12670)
[![GitHub](https://img.shields.io/badge/GitHub-IBM%2FTerraCodec-black?logo=github)](https://github.com/IBM/TerraCodec)
[![PyPI](https://img.shields.io/badge/PyPI-terracodec-blue?logo=pypi)](https://pypi.org/project/terracodec/)

TerraCodec (TEC) is a family of pretrained neural compression codecs for **multispectral Sentinel-2 satellite imagery**. The models compress optical Earth observation data using learned latent representations and entropy coding.

This repository provides **Temporal Transformer (TEC-TT) models trained on Sentinel-2 L1C imagery**. The main TerraCodec models are released for Sentinel-2 L2A data; the L1C variants were used for the declouding experiments in the paper.

![Reconstructions](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/reconstructions.jpg)

---

# Model Architecture

This repository contains the **TEC-TT (Temporal Transformer)** variants of TerraCodec for **S2L1C data**.

![assets/TEC_TT_architecture.png](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/TEC_TT_architecture.png)

TEC-TT extends the TerraCodec image codecs by modeling temporal dependencies across satellite image sequences. Each frame is first encoded using an ELIC-style CNN encoder–decoder to obtain latent representations. A temporal transformer then predicts the probability distribution of the current frame’s latents conditioned on previously encoded frames.

By exploiting redundancy across seasonal observations, TEC-TT achieves improved compression efficiency for multi-temporal satellite imagery.
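As a rough illustration of why a better temporal prediction lowers the bitrate: with an ideal entropy coder, the expected code length of a sequence is the cross-entropy between the true latent distribution and the predicted one. The notation below ($\hat{y}_t$ for the quantized latents of frame $t$, $p_\theta$ for the transformer's predicted distribution, $R$ for the expected bitrate) is introduced here for illustration and may not match the paper's formulation.

$$
R \approx \sum_{t=1}^{T} \mathbb{E}\left[ -\log_2 p_\theta\left(\hat{y}_t \mid \hat{y}_{<t}\right) \right] \ \text{bits}
$$

The more probability the model assigns to the latents that actually occur, given the previously encoded frames, the smaller each term becomes.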

See the [paper](https://arxiv.org/abs/2510.12670) for additional architectural and training details.

---

# Input Format

| Codec type | Expected shape | Example |
|------------|----------------|---------|
| Temporal codecs | `[B, T, C, H, W]` | `[1, 4, 13, 256, 256]` |

- Inputs use **13 Sentinel-2 L1C spectral bands**
- Recommended spatial size: **256 × 256**
- Models were trained on sequences of four seasonal frames but can process any number of timesteps at inference (compute grows with T)
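
A minimal sketch of checking a shape against the `[B, T, C, H, W]` layout before calling a model. The helper `check_temporal_input` is hypothetical and not part of the `terracodec` package; it takes a plain shape tuple, so it can be called as `check_temporal_input(tuple(x.shape))` on a tensor.

```python
def check_temporal_input(shape, num_bands=13):
    """Validate a [B, T, C, H, W] shape and return its named dimensions.

    Hypothetical helper for illustration; not part of terracodec.
    """
    if len(shape) != 5:
        raise ValueError(
            f"expected 5 dimensions [B, T, C, H, W], got {len(shape)}: {tuple(shape)}"
        )
    batch, timesteps, bands, height, width = shape
    if bands != num_bands:
        raise ValueError(f"expected {num_bands} spectral bands, got {bands}")
    return {
        "batch": batch,
        "timesteps": timesteps,
        "bands": bands,
        "height": height,
        "width": width,
    }


# Example shape from the table above: one sequence of four 256 x 256 frames
info = check_temporal_input((1, 4, 13, 256, 256))
```

A single image of shape `[C, H, W]` would fail this check; it needs a batch and a time dimension added first.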

---

# Normalization

Models were trained on **Sentinel-2 L1C imagery**.

Inputs should be standardized per spectral band using dataset statistics:

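```python
import torch

# Per-band mean of the training data (13 values, one per spectral band)
mean = torch.tensor([1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071, 2392.832, 2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798])

# Per-band standard deviation of the training data (13 values, one per spectral band)
std = torch.tensor([786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98, 1273.505, 1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036])
```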
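
One way to apply these statistics to a `[B, T, C, H, W]` tensor is to reshape them so they broadcast along the band dimension. The sketch below is an illustration only: `standardize` and `destandardize` are hypothetical helpers, the random tensor stands in for real L1C data, and mapping reconstructions back to the original value range with the inverse transform is an assumption rather than documented behaviour.

```python
import torch

# Per-band statistics copied from the normalization values listed above
mean = torch.tensor([1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071,
                     2392.832, 2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798])
std = torch.tensor([786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98,
                    1273.505, 1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036])


def standardize(x):
    """Standardize [B, T, C, H, W] data per band (hypothetical helper)."""
    # Reshape to [1, 1, C, 1, 1] so the statistics broadcast over B, T, H, W
    return (x - mean.view(1, 1, -1, 1, 1)) / std.view(1, 1, -1, 1, 1)


def destandardize(x):
    """Invert standardize (hypothetical helper)."""
    return x * std.view(1, 1, -1, 1, 1) + mean.view(1, 1, -1, 1, 1)


# Random stand-in for raw L1C values; replace with real data
raw = torch.rand(1, 4, 13, 64, 64) * 4000.0
inputs = standardize(raw)
restored = destandardize(inputs)
```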

---

# Usage

Install TerraCodec:

```bash
pip install terracodec
```

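Load a pretrained model and run it on a standardized input batch:

```python
import torch
from terracodec import terracodec_v1_tt_s2l1c

model = terracodec_v1_tt_s2l1c(
    pretrained=True,
    compression=5
)

# Placeholder batch of shape [B, T, C, H, W]; replace with standardized S2L1C data
inputs = torch.randn(1, 4, 13, 256, 256)

# Fast reconstruction (no bitstream)
reconstruction = model(inputs)

# True compression (produces a bitstream, then decodes it)
compressed = model.compress(inputs)
reconstruction = model.decompress(**compressed)
```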

# Feedback
If you have questions, encounter issues or want to discuss improvements:
- open an issue or discussion on GitHub
- or contribute directly to the repository

GitHub repository: https://github.com/IBM/TerraCodec

# Citation
If you use TerraCodec in your research, please cite:
```bibtex
@article{terracodec2025,
  title   = {TerraCodec: Compressing Optical Earth Observation Data},
  author  = {Costa Watanabe, Julen and Wittmann, Isabelle and Blumenstiel, Benedikt and Schindler, Konrad},
  journal = {arXiv preprint arXiv:2510.12670},
  year    = {2025}
}
```