| --- |
| license: cc-by-4.0 |
| tags: |
| - earth-observation |
| - remote-sensing |
| - diffusion |
| - generative |
| - copernicus |
| - sentinel |
| - major-tom |
| - multimodal |
| - latent-diffusion |
| library_name: cop-gen |
| datasets: |
| - Major-TOM/COP-GEN-Benchmark |
| --- |
| |
|
|
| # COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data |
|
|
| [Paper (arXiv:2603.03239)](https://arxiv.org/abs/2603.03239) |
| [Code (GitHub)](https://github.com/miquel-espinosa/COP-GEN) |
| [Project page](https://miquel-espinosa.github.io/cop-gen/) |
| [Hugging Face collection](https://huggingface.co/collections/mespinosami/copgen) |
|
|
| COP-GEN is a generative foundation model for Copernicus Earth observation data. It learns a joint distribution over all major Copernicus modalities: Sentinel-1 SAR, Sentinel-2 multispectral (L1C and L2A), digital elevation model (DEM), and land use/land cover (LULC). This enables both unconditional generation and cross-modal conditional synthesis (e.g. generating S2 RGB from S1 SAR, or generating all modalities jointly). |
|
|
| ## Model Details |
|
|
| - **Developed by:** Miquel Espinosa, Eva Gmelich Meijling, Valerio Marsocci, Elliot J. Crowley, Mikolaj Czerkawski |
| - **Model type:** Latent Diffusion Transformer (multimodal, multi-resolution) |
| - **Modalities:** S1RTC (VV, VH), S2L1C (all bands + cloud mask), S2L2A (all bands), DEM, LULC, timestamps, lat-lon |
| - **License:** CC-BY-4.0 |
| - **Paper:** [arXiv:2603.03239](https://arxiv.org/abs/2603.03239) |
| - **Repository:** [github.com/miquel-espinosa/COP-GEN](https://github.com/miquel-espinosa/COP-GEN) |
|
|
| ### Architecture |
|
|
| COP-GEN operates in a shared latent space produced by a set of modality-specific KL-regularised VAEs. The diffusion backbone is a transformer trained jointly over all modalities. It supports arbitrary conditioning at inference time: any subset of modalities can be held as conditions while the rest are generated. |
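For readers unfamiliar with the term, a KL-regularised VAE is usually trained with an objective of the generic form below. This is the textbook formulation, not a statement of COP-GEN's exact loss. The encoder $q_\phi$, decoder $p_\theta$, weight $\lambda$, and modality index $m$ are notation introduced here for illustration only.

```latex
\mathcal{L}_{\mathrm{VAE}}^{(m)}(\theta, \phi)
  = \mathbb{E}_{q_\phi(z \mid x^{(m)})}\!\left[ -\log p_\theta\!\left( x^{(m)} \mid z \right) \right]
  + \lambda \, D_{\mathrm{KL}}\!\left( q_\phi\!\left( z \mid x^{(m)} \right) \,\middle\|\, \mathcal{N}(0, I) \right)
```

The first term rewards faithful reconstruction of modality $m$; the second keeps its latents close to a standard normal prior, which is what lets separately trained encoders feed one diffusion backbone.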
|
|
| ## Uses |
|
|
| ### Direct Use |
|
|
| Generate synthetic Copernicus EO scenes, either unconditionally or conditioned on one or more input modalities. Useful for data augmentation, gap-filling missing modalities, and studying cross-sensor relationships. |
|
|
| ### Downstream Use |
|
|
| The latent representations and generated samples can serve as inputs to downstream EO tasks: land cover classification, change detection, cloud removal, SAR-to-optical translation, and more. |
|
|
| ## How to Get Started |
|
|
| ```python |
| from libs.copgen import CopgenModel |
| |
| # Load the trained checkpoint together with its config file. |
| model = CopgenModel( |
|     model_path="path/to/model_checkpoint.pth", |
|     config_path="path/to/model_config.py", |
| ) |
| |
| # s1_tensor must be supplied by you: a Sentinel-1 RTC (VH, VV) input, |
| # not defined in this snippet. |
| |
| # Conditional generation: provide one or more modalities as conditions |
| samples = model.generate( |
|     modalities=["S2L2A_B02_B03_B04_B08", "S1RTC_vh_vv"], |
|     conditions={"S1RTC_vh_vv": s1_tensor}, |
|     n_samples=4, |
| ) |
| |
| # Unconditional generation: omit the conditions argument |
| samples = model.generate( |
|     modalities=["S2L2A_B02_B03_B04_B08", "S1RTC_vh_vv"], |
|     n_samples=4, |
| ) |
| ``` |
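Before calling `generate`, it can help to make explicit which requested modalities act as conditions and which will be synthesised. The helper below is a minimal sketch, not part of the COP-GEN library: `split_modalities` is a hypothetical name, and the rule that every condition key must also appear in `modalities` is an assumption inferred from the example above.

```python
from typing import Dict, Iterable, List, Tuple


def split_modalities(
    modalities: Iterable[str],
    conditions: Dict[str, object],
) -> Tuple[List[str], List[str]]:
    """Return (conditioned, generated) modality keys, preserving request order."""
    requested = list(modalities)
    # Assumption: a condition only makes sense for a modality that was requested.
    unknown = sorted(set(conditions) - set(requested))
    if unknown:
        raise ValueError(f"conditions not listed in modalities: {unknown}")
    conditioned = [m for m in requested if m in conditions]
    generated = [m for m in requested if m not in conditions]
    return conditioned, generated


# object() is a placeholder standing in for a real Sentinel-1 tensor.
conditioned, generated = split_modalities(
    ["S2L2A_B02_B03_B04_B08", "S1RTC_vh_vv"],
    {"S1RTC_vh_vv": object()},
)
print(conditioned)  # ['S1RTC_vh_vv']
print(generated)    # ['S2L2A_B02_B03_B04_B08']
```

With an empty `conditions` dict every modality lands in the generated list, which corresponds to the unconditional case.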
|
|
| See [examples/conditional_generation.py](https://github.com/miquel-espinosa/COP-GEN/blob/main/examples/conditional_generation.py) and [examples/unconditional_generation.py](https://github.com/miquel-espinosa/COP-GEN/blob/main/examples/unconditional_generation.py) for full worked examples. |
|
|
| ## Training Details |
|
|
| ### Training Data |
|
|
| Trained on [Major-TOM](https://huggingface.co/Major-TOM) global Copernicus data, covering Sentinel-1 RTC, Sentinel-2 L1C and L2A, DEM, and LULC. A pre-compiled Edinburgh subset is available at [mespinosami/copgen-edinburgh-subset](https://huggingface.co/datasets/mespinosami/copgen-edinburgh-subset) for local development and reproduction. |
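One way to fetch the Edinburgh subset locally is through `huggingface_hub.snapshot_download`. This is a sketch under assumptions: `download_subset` and `REPO_ID` are illustrative names, the target directory is up to you, and the COP-GEN repository may expect a particular on-disk layout that this card does not describe.

```python
from pathlib import Path

# Dataset repository named in the paragraph above.
REPO_ID = "mespinosami/copgen-edinburgh-subset"


def download_subset(local_dir: str) -> Path:
    """Download the full dataset repository into local_dir and return its path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repo lives under huggingface.co/datasets/
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_subset("data/edinburgh")` would place the files under `data/edinburgh`; the directory name here is only an example.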
|
|
| ### Training Procedure |
|
|
| 1. Modality-specific KL-VAEs are trained separately per modality and resolution. |
| 2. All modalities are encoded into a shared latent space. |
| 3. A diffusion transformer backbone is trained jointly over the merged latents, with random masking of modalities to enable conditional generation at inference. |
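The random masking in step 3 can be sketched as follows. This is illustrative only: the actual masking distribution and noise schedule are not given in this card, so independent coin flips with probability `p_condition`, the scalar noise level `t`, and the simplified modality labels are all assumptions, and `sample_condition_mask` and `noise_levels` are hypothetical names.

```python
import random
from typing import Dict, List

# Simplified labels for the modalities listed under Training Data.
MODALITIES: List[str] = ["S1RTC", "S2L1C", "S2L2A", "DEM", "LULC"]


def sample_condition_mask(
    modalities: List[str],
    rng: random.Random,
    p_condition: float = 0.5,
) -> Dict[str, bool]:
    """Mark each modality as a condition (True) or a generation target (False)."""
    mask = {m: rng.random() < p_condition for m in modalities}
    if all(mask.values()):
        # Keep at least one target so the training step has something to denoise.
        mask[rng.choice(modalities)] = False
    return mask


def noise_levels(mask: Dict[str, bool], t: float) -> Dict[str, float]:
    """Conditions stay clean (noise level 0.0); targets receive noise level t."""
    return {m: 0.0 if is_cond else t for m, is_cond in mask.items()}


rng = random.Random(0)
mask = sample_condition_mask(MODALITIES, rng)
levels = noise_levels(mask, t=0.7)
```

Because the model sees many different condition/target splits during training, any such split can be requested at inference, including the all-targets split that corresponds to unconditional generation.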
|
|
| See the [GitHub README](https://github.com/miquel-espinosa/COP-GEN) for full training instructions. |
|
|
| ## Evaluation |
|
|
| Evaluated on the [COP-GEN-Benchmark](https://huggingface.co/datasets/Major-TOM/COP-GEN-Benchmark) test set (495 held-out global scenes). Distribution-level metrics (FID and related) are reported in Table 1 of the paper. To reproduce: |
|
|
| ```bash |
| pip install -r benchmark/stochastic/requirements.txt |
| python -m benchmark.stochastic.run --output metrics.csv |
| ``` |
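For reference, FID compares Gaussian fits to feature embeddings of real and generated samples. The standard definition is given below; the feature extractor and any modality-specific adaptations used by the benchmark are not specified in this card. Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the feature means and covariances of the real and generated sets.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.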
|
|
| ## Citation |
|
|
| ```bibtex |
| @article{copgen2026, |
| title = {COP-GEN: Latent Diffusion Transformer for Copernicus Earth |
| Observation Data}, |
| author = {Espinosa, Miquel and Gmelich Meijling, Eva and Marsocci, |
| Valerio and Crowley, Elliot J. and Czerkawski, Mikolaj}, |
| year = {2026}, |
| journal = {arXiv preprint arXiv:2603.03239}, |
| url = {https://arxiv.org/abs/2603.03239}, |
| } |
| ``` |