---
license: apache-2.0
paper: https://arxiv.org/abs/2510.12670
homepage: https://github.com/IBM/TerraCodec
---

# TerraCodec TT – Sentinel-2 L1C

**Neural Compression for Earth Observation**

[![arXiv](https://img.shields.io/badge/arXiv-2510.12670-b31b1b)](https://arxiv.org/abs/2510.12670)
[![GitHub](https://img.shields.io/badge/GitHub-IBM%2FTerraCodec-black?logo=github)](https://github.com/IBM/TerraCodec)
[![PyPI](https://img.shields.io/badge/PyPI-terracodec-blue?logo=pypi)](https://pypi.org/project/terracodec/)

TerraCodec (TEC) is a family of pretrained neural compression codecs for **multispectral Sentinel-2 satellite imagery**. The models compress optical Earth observation data using learned latent representations and entropy coding.

This repository provides **Temporal Transformer (TEC-TT) models trained on Sentinel-2 L1C imagery**. The main TerraCodec models are released for Sentinel-2 L2A data; the L1C variants were used for the declouding experiments in the paper.

![Reconstructions](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/reconstructions.jpg)

---

# Model Architecture

This repository contains the **TEC-TT (Temporal Transformer)** variants of TerraCodec for **S2L1C data**.

![assets/TEC_TT_architecture.png](https://raw.githubusercontent.com/IBM/TerraCodec/main/assets/TEC_TT_architecture.png)

TEC-TT extends the TerraCodec image codecs by modeling temporal dependencies across satellite image sequences. Each frame is first encoded using an ELIC-style CNN encoder–decoder to obtain latent representations. A temporal transformer then predicts the probability distribution of the current frame’s latents conditioned on previously encoded frames.

By exploiting redundancy across seasonal observations, TEC-TT achieves improved compression efficiency for multi-temporal satellite imagery.

See the [paper](https://arxiv.org/abs/2510.12670) for additional architectural and training details.

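
To see why a predicted distribution conditioned on earlier frames lowers the bitrate, the toy sketch below estimates the cost of coding a few quantized latents under two entropy models. It is not TerraCodec's implementation: it assumes a Gaussian discretized to unit-width bins (a common choice in learned compression), and every name and number in it is hypothetical.

```python
import math


def gaussian_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def latent_bits(y: int, mu: float, sigma: float) -> float:
    """Ideal code length in bits for integer latent y under N(mu, sigma^2),
    discretized to unit-width bins (assumed entropy model)."""
    upper = gaussian_cdf((y + 0.5 - mu) / sigma)
    lower = gaussian_cdf((y - 0.5 - mu) / sigma)
    # Clamp to avoid log(0) for very unlikely symbols.
    return -math.log2(max(upper - lower, 1e-9))


# Hypothetical quantized latents of the current frame.
current = [3, -1, 0, 2]

# No temporal context: a broad, zero-centered prior for every latent.
bits_unconditional = sum(latent_bits(y, mu=0.0, sigma=3.0) for y in current)

# With temporal context: a hypothetical predictor that has seen earlier frames
# outputs means close to the true values and a smaller scale.
predicted_mu = [2.8, -0.9, 0.2, 1.7]
bits_conditional = sum(
    latent_bits(y, mu=m, sigma=0.5) for y, m in zip(current, predicted_mu)
)

print(f"unconditional: {bits_unconditional:.2f} bits")
print(f"conditional:   {bits_conditional:.2f} bits")
```

The sharper the predicted distribution is around the true latent, the fewer bits the entropy coder spends. This is the effect the temporal transformer exploits when consecutive seasonal frames are similar.
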

---

# Input Format

| Codec type | Expected shape | Example |
|------------|----------------|---------|
| Temporal codecs | `[B, T, C, H, W]` | `[1, 4, 13, 256, 256]` |

- Inputs use **13 Sentinel-2 L1C spectral bands**
- Recommended spatial size: **256 × 256**
- Models were trained on four seasonal frames but can process any number of timesteps at inference (compute grows with `T`)

---

# Normalization

Models were trained on **Sentinel-2 L1C imagery**. Inputs should be standardized per spectral band using the dataset statistics:

```python
import torch

# Per-band mean and standard deviation for the 13 Sentinel-2 L1C bands
mean = torch.tensor([
    1607.345, 1393.068, 1320.225, 1373.963, 1562.536, 2110.071, 2392.832,
    2321.154, 2583.77, 838.712, 21.753, 2205.112, 1545.798,
])
std = torch.tensor([
    786.523, 849.702, 875.318, 1143.578, 1126.248, 1161.98, 1273.505,
    1246.79, 1342.755, 576.795, 45.626, 1340.347, 1145.036,
])

# To standardize a [B, T, C, H, W] tensor (here called `raw`), broadcast over the band axis:
# inputs = (raw - mean.view(1, 1, -1, 1, 1)) / std.view(1, 1, -1, 1, 1)
```

---

# Usage

Install TerraCodec:

```
pip install terracodec
```

Load a pretrained model:

```python
from terracodec import terracodec_v1_tt_s2l1c

model = terracodec_v1_tt_s2l1c(
    pretrained=True,
    compression=5
)

# `inputs` is a standardized tensor of shape [B, T, C, H, W],
# e.g. [1, 4, 13, 256, 256] (see Input Format and Normalization).

# Fast reconstruction (no bitstream)
reconstruction = model(inputs)

# True compression
compressed = model.compress(inputs)
reconstruction = model.decompress(**compressed)
```

# Feedback

If you have questions, encounter issues, or want to discuss improvements:

- open an issue or discussion on GitHub
- or contribute directly to the repository

GitHub repository: https://github.com/IBM/TerraCodec

# Citation

If you use TerraCodec in your research, please cite:

```
@article{terracodec2025,
  title   = {TerraCodec: Compressing Optical Earth Observation Data},
  author  = {Costa Watanabe, Julen and Wittmann, Isabelle and Blumenstiel, Benedikt and Schindler, Konrad},
  journal = {arXiv preprint arXiv:2510.12670},
  year    = {2025}
}
```