Add pipeline tag and improve metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +6 -27
README.md CHANGED
@@ -1,5 +1,9 @@
1
  ---
 
 
 
2
  license: cc-by-4.0
 
3
  tags:
4
  - earth-observation
5
  - remote-sensing
@@ -7,9 +11,6 @@ tags:
7
  - copernicus
8
  - sentinel
9
  - multimodal
10
- library_name: cop-gen
11
- datasets:
12
- - Major-TOM/COP-GEN-Benchmark
13
  ---
14
 
15
  ![copgen-banner-github](https://cdn-uploads.huggingface.co/production/uploads/63ea69a55c837d9968ebecc0/JHy5rYg3WF3y4T01ik2IB.png)
@@ -21,7 +22,7 @@ datasets:
21
  [![Website](https://img.shields.io/badge/🌐-Website-grey)](https://miquel-espinosa.github.io/cop-gen/)
22
  [![HF Collection](https://img.shields.io/badge/🤗-Collection-yellow)](https://huggingface.co/collections/mespinosami/copgen)
23
 
24
- This repository contains the suite of modality-specific KL-regularised VAEs used in COP-GEN. Each VAE encodes a distinct Copernicus modality (or band group) into a shared latent space at 8 latent channels. These are prerequisites for both COP-GEN inference and training; the diffusion backbone operates on the latents produced by these encoders.
25
 
26
  ## Model Details
27
 
@@ -50,14 +51,7 @@ This repository contains the suite of modality-specific KL-regularised VAEs used
50
 
51
  ## How to Get Started
52
 
53
- Download all VAEs into the expected directory:
54
-
55
- ```bash
56
- git clone https://huggingface.co/mespinosami/copgen-vaes ./models/vae
57
- rm -rf ./models/vae/.git ./models/vae/.gitattributes
58
- ```
59
-
60
- Each VAE checkpoint is stored as `model-<step>-ema.pt` alongside its config file. The EMA weights have already been extracted into the correct format for inference. To use a VAE directly:
61
 
62
  ```python
63
  from libs.vae import load_vae
@@ -71,25 +65,10 @@ latents = vae.encode(image_tensor) # (B, 8, H/f, W/f)
71
  recon = vae.decode(latents)
72
  ```
73
 
74
- See the [GitHub README](https://github.com/miquel-espinosa/COP-GEN) for full encoding instructions for each modality.
75
-
76
  ## Training Details
77
 
78
  Each VAE is trained independently on its respective modality. Inputs are normalised to [-1, 1] using precomputed per-modality min-max statistics (included in the config files). Sentinel-2 data uses a fixed scale factor of 1/1000. Training uses the `accelerate` launcher and supports single- and multi-GPU setups.
79
 
80
- ```bash
81
- # Example: train the S2L2A RGB+NIR VAE
82
- accelerate launch --num_processes 1 train_vae.py \
83
- --cfg configs/vae/final/S2L2A/copgen_ae_kl_192x192_S2L2A_B4_3_2_8_latent_8.yaml \
84
- --data_dir ./data/majorTOM/edinburgh/Core-S2L2A
85
- ```
86
-
87
- After training, extract the EMA weights before use with COP-GEN:
88
-
89
- ```bash
90
- python3 scripts/extract_ema_convert_model.py models/vae/<modality>/<config>/model-*.pt
91
- ```
92
-
93
  ## Relationship to COP-GEN
94
 
95
  These VAEs are used in two ways:
 
1
  ---
2
+ datasets:
3
+ - Major-TOM/COP-GEN-Benchmark
4
+ library_name: cop-gen
5
  license: cc-by-4.0
6
+ pipeline_tag: image-to-image
7
  tags:
8
  - earth-observation
9
  - remote-sensing
 
11
  - copernicus
12
  - sentinel
13
  - multimodal
 
 
 
14
  ---
15
 
16
  ![copgen-banner-github](https://cdn-uploads.huggingface.co/production/uploads/63ea69a55c837d9968ebecc0/JHy5rYg3WF3y4T01ik2IB.png)
 
22
  [![Website](https://img.shields.io/badge/🌐-Website-grey)](https://miquel-espinosa.github.io/cop-gen/)
23
  [![HF Collection](https://img.shields.io/badge/🤗-Collection-yellow)](https://huggingface.co/collections/mespinosami/copgen)
24
 
25
+ This repository contains the suite of modality-specific KL-regularised VAEs used in [COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data](https://arxiv.org/abs/2603.03239). Each VAE encodes a distinct Copernicus modality (or band group) into a shared latent space at 8 latent channels. These are prerequisites for both COP-GEN inference and training; the diffusion backbone operates on the latents produced by these encoders.
26
 
27
  ## Model Details
28
 
 
51
 
52
  ## How to Get Started
53
 
54
+ To use a VAE directly for encoding or decoding, you can use the loading logic from the official repository:
 
 
 
 
 
 
 
55
 
56
  ```python
57
  from libs.vae import load_vae
 
65
  recon = vae.decode(latents)
66
  ```
67
 
 
 
68
  ## Training Details
69
 
70
  Each VAE is trained independently on its respective modality. Inputs are normalised to [-1, 1] using precomputed per-modality min-max statistics (included in the config files). Sentinel-2 data uses a fixed scale factor of 1/1000. Training uses the `accelerate` launcher and supports single- and multi-GPU setups.
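
  The normalisation described above can be illustrated with a minimal sketch. The function names, the clipping step, and the example statistics are assumptions made for illustration; they are not taken from the COP-GEN code base or its config files, and how the Sentinel-2 scale factor combines with any further rescaling is not specified here.

  ```python
  # Hypothetical helpers: names, clipping behaviour and statistics are
  # illustrative assumptions, not the COP-GEN implementation.

  def minmax_to_unit_range(values, vmin, vmax):
      """Linearly map values from [vmin, vmax] to [-1, 1], clipping outliers."""
      if vmax <= vmin:
          raise ValueError("vmax must be greater than vmin")
      scaled = []
      for v in values:
          v = min(max(v, vmin), vmax)  # clip to the precomputed range (assumed)
          scaled.append(2.0 * (v - vmin) / (vmax - vmin) - 1.0)
      return scaled


  def minmax_from_unit_range(values, vmin, vmax):
      """Inverse map from [-1, 1] back to [vmin, vmax]."""
      return [(v + 1.0) / 2.0 * (vmax - vmin) + vmin for v in values]


  def scale_sentinel2(values, factor=1.0 / 1000.0):
      """Multiply raw Sentinel-2 values by the fixed 1/1000 scale factor."""
      return [v * factor for v in values]


  # Example with made-up statistics (vmin=0, vmax=10):
  normalised = minmax_to_unit_range([0.0, 5.0, 10.0], vmin=0.0, vmax=10.0)
  restored = minmax_from_unit_range(normalised, vmin=0.0, vmax=10.0)
  ```

  In practice the per-modality minimum and maximum would be read from the relevant config file rather than hard-coded.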
71
 
 
 
 
 
 
 
 
 
 
 
 
 
 
72
  ## Relationship to COP-GEN
73
 
74
  These VAEs are used in two ways: