---
license: cc-by-4.0
tags:
- earth-observation
- remote-sensing
- diffusion
- generative
- copernicus
- sentinel
- major-tom
- multimodal
- latent-diffusion
library_name: cop-gen
datasets:
- Major-TOM/COP-GEN-Benchmark
---

# COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data
[Paper (arXiv)](https://arxiv.org/abs/2603.03239) | [Code](https://github.com/miquel-espinosa/COP-GEN) | [Project page](https://miquel-espinosa.github.io/cop-gen/) | [Hugging Face collection](https://huggingface.co/collections/mespinosami/copgen)

COP-GEN is a generative foundation model for Copernicus Earth observation data. It learns a joint distribution over all major Copernicus modalities — Sentinel-1 SAR, Sentinel-2 multispectral (L1C and L2A), DEM, and LULC — enabling both unconditional generation and cross-modal conditional synthesis (e.g. generate S2 RGB from S1 SAR, or generate all modalities jointly).
## Model Details
- **Developed by:** Miguel Espinosa, Eva Gmelich Meijling, Valerio Marsocci, Elliot J. Crowley, Mikolaj Czerkawski
- **Model type:** Latent Diffusion Transformer (multimodal, multi-resolution)
- **Modalities:** S1RTC (VV, VH), S2L1C (all bands + cloud mask), S2L2A (all bands), DEM, LULC, timestamps, lat-lon
- **License:** CC-BY-4.0
- **Paper:** [arXiv:2603.03239](https://arxiv.org/abs/2603.03239)
- **Repository:** [github.com/miquel-espinosa/COP-GEN](https://github.com/miquel-espinosa/COP-GEN)
### Architecture
COP-GEN operates in a shared latent space produced by a set of modality-specific KL-regularised VAEs. The diffusion backbone is a transformer trained jointly over all modalities, supporting arbitrary conditioning at inference time — any subset of modalities can be held as conditions while the rest are generated.
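
To make the "any subset as conditions" idea concrete, here is a minimal sketch of how a list of modality keys might be partitioned into conditioned and generated groups. The helper `split_modalities` is hypothetical and is not part of the `cop-gen` library; the modality keys are the ones used in this card's usage example, and the condition value is a stand-in object rather than a real tensor.

```python
from typing import Dict, List, Tuple


def split_modalities(
    modalities: List[str], conditions: Dict[str, object]
) -> Tuple[List[str], List[str]]:
    """Split modality keys into (conditioned, generated) lists, preserving order.

    Hypothetical helper for illustration only; not the COP-GEN API.
    """
    unknown = [key for key in conditions if key not in modalities]
    if unknown:
        raise ValueError(f"conditions reference unknown modalities: {unknown}")
    conditioned = [key for key in modalities if key in conditions]
    generated = [key for key in modalities if key not in conditions]
    return conditioned, generated


# Modality keys from this card's usage example; object() stands in for a tensor.
modalities = ["S2L2A_B02_B03_B04_B08", "S1RTC_vh_vv"]
conditioned, generated = split_modalities(modalities, {"S1RTC_vh_vv": object()})

# With no conditions, every modality is a generation target (unconditional case).
unconditional = split_modalities(modalities, {})
```

With an empty `conditions` mapping the same partition describes unconditional generation, which is why a single interface can plausibly cover both modes.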
## Uses
### Direct Use
Generate synthetic Copernicus EO scenes, either unconditionally or conditioned on one or more input modalities. Useful for data augmentation, gap-filling missing modalities, and studying cross-sensor relationships.
### Downstream Use
The latent representations and generated samples can serve as inputs to downstream EO tasks: land cover classification, change detection, cloud removal, SAR-to-optical translation, and more.
## How to Get Started
```python
from libs.copgen import CopgenModel

model = CopgenModel(
    model_path="path/to/model_checkpoint.pth",
    config_path="path/to/model_config.py",
)

# Conditional generation: provide one or more modalities as conditions.
# `s1_tensor` is a Sentinel-1 RTC input you must load yourself; it is not defined here.
samples = model.generate(
    modalities=["S2L2A_B02_B03_B04_B08", "S1RTC_vh_vv"],
    conditions={"S1RTC_vh_vv": s1_tensor},
    n_samples=4,
)

# Unconditional generation
samples = model.generate(
    modalities=["S2L2A_B02_B03_B04_B08", "S1RTC_vh_vv"],
    n_samples=4,
)
```
See [examples/conditional_generation.py](https://github.com/miquel-espinosa/COP-GEN/blob/main/examples/conditional_generation.py) and [examples/unconditional_generation.py](https://github.com/miquel-espinosa/COP-GEN/blob/main/examples/unconditional_generation.py) for full worked examples.
## Training Details
### Training Data
Trained on [Major-TOM](https://huggingface.co/Major-TOM) global Copernicus data, covering Sentinel-1 RTC, Sentinel-2 L1C and L2A, DEM, and LULC. A pre-compiled Edinburgh subset is available at [mespinosami/copgen-edinburgh-subset](https://huggingface.co/datasets/mespinosami/copgen-edinburgh-subset) for local development and reproduction.
### Training Procedure
1. Modality-specific KL-VAEs are trained separately per modality and resolution.
2. All modalities are encoded into a shared latent space.
3. A diffusion transformer backbone is trained jointly over the merged latents, with random masking of modalities to enable conditional generation at inference.

See the [GitHub README](https://github.com/miquel-espinosa/COP-GEN) for full training instructions.
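
The random modality masking in step 3 can be illustrated with a small, self-contained sketch. The sampling rule below (an independent coin flip per modality, with at least one modality kept as a generation target) and the `p_condition` parameter are assumptions made for illustration; the actual masking schedule is defined by the training code in the repository, not by this snippet.

```python
import random
from typing import Dict, List


def sample_condition_mask(
    modalities: List[str], rng: random.Random, p_condition: float = 0.5
) -> Dict[str, bool]:
    """Return {modality: True if kept clean as a condition, False if a noised target}.

    Illustrative assumption only; not the authors' masking schedule.
    """
    mask = {name: rng.random() < p_condition for name in modalities}
    # Keep at least one generation target so a training step always has a loss.
    if all(mask.values()):
        mask[rng.choice(modalities)] = False
    return mask


rng = random.Random(0)
modalities = ["S1RTC", "S2L1C", "S2L2A", "DEM", "LULC"]
masks = [sample_condition_mask(modalities, rng) for _ in range(1000)]
```

Because every conditioned/generated split can appear during training under such a scheme, a single model can be asked for any of those splits at inference.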
## Evaluation
Evaluated on the [COP-GEN-Benchmark](https://huggingface.co/datasets/Major-TOM/COP-GEN-Benchmark) test set (495 held-out global scenes). Distribution-level metrics (FID and related) are reported in Table 1 of the paper. To reproduce:
```bash
pip install -r benchmark/stochastic/requirements.txt
python -m benchmark.stochastic.run --output metrics.csv
```
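
For readers unfamiliar with FID-style metrics, the following is a minimal NumPy sketch of the Fréchet distance between Gaussians fitted to two feature sets. It is not the benchmark's implementation: the feature extractor, preprocessing, and exact metric set are defined by the paper and the `benchmark` code, and the random features below are synthetic stand-ins.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two (n_samples, dim) arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances (clipped here against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Synthetic stand-in features: one matching distribution, one with a shifted mean.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake_close = rng.normal(size=(500, 8))
fake_shifted = rng.normal(loc=2.0, size=(500, 8))

d_close = frechet_distance(real, fake_close)
d_shifted = frechet_distance(real, fake_shifted)
```

Lower values indicate that the generated feature distribution is closer to the reference one, so `d_shifted` should come out well above `d_close` here.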
## Citation
```bibtex
@article{copgen2026,
  title   = {COP-GEN: Latent Diffusion Transformer for Copernicus Earth
             Observation Data},
  author  = {Espinosa, Miguel and Gmelich Meijling, Eva and Marsocci,
             Valerio and Crowley, Elliot J. and Czerkawski, Mikolaj},
  year    = {2026},
  journal = {arXiv preprint arXiv:2603.03239},
  url     = {https://arxiv.org/abs/2603.03239},
}
```