---
language:
- en
license: apache-2.0
tags:
- remote-sensing
- earth-observation
- cloud-imputation
- modis
- prithvi
- 3d-unet
- difficulty-index
datasets:
- mod09ga
- esa-worldcover-2021
---

# Dual Validation Framework for Analysis-Ready Satellite Data

This repository contains the curated multitemporal dataset and the trained model weights for the paper: **"A Dual Validation Framework for Analysis-Ready Satellite Data: A Scalable Pipeline and Stratified Performance Analysis."**

For the source code, data curation pipeline, and inference scripts, please visit our [GitHub Repository](https://github.com/TadieB/dual_validation_framework).

## 🌍 Overview
The convergence of petabyte-scale satellite archives and foundation models calls for rigorous validation frameworks. This repository hosts the data and models used to demonstrate a novel **Dual Validation Framework**.

The framework introduces a composite **Difficulty Index (DI)** that combines spatial heterogeneity, phenological variability, and cloud persistence, and uses it to stratify model performance beyond standard aggregate metrics. The dataset targets the **cloud-gap imputation** task on multitemporal Earth observation data.

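To make the stratification idea concrete, here is a minimal sketch of how a composite index could be built from three per-patch scores and binned into strata. The min-max normalization, the equal-weight mean, the tercile cut points, and every name in the code (`min_max`, `difficulty_index`, `stratify`) are illustrative assumptions. The paper defines the actual DI formulation, and the numbers below are made up.

```python
import numpy as np

def min_max(x):
    """Rescale a 1-D array to [0, 1]; a constant array maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def difficulty_index(heterogeneity, phenology, cloud_persistence):
    """Assumed composite: equal-weight mean of the normalized components."""
    parts = [min_max(heterogeneity), min_max(phenology), min_max(cloud_persistence)]
    return np.mean(parts, axis=0)

def stratify(di):
    """Assign low/medium/high strata at the DI terciles (assumed cut points)."""
    lo, hi = np.quantile(di, [1 / 3, 2 / 3])
    return np.where(di <= lo, "low", np.where(di <= hi, "medium", "high"))

# Toy per-patch scores (made-up values, not taken from the dataset).
het = [0.1, 0.4, 0.9, 0.2, 0.7, 0.5]   # spatial heterogeneity
phe = [0.2, 0.3, 0.8, 0.1, 0.9, 0.4]   # phenological variability
cld = [0.0, 0.5, 1.0, 0.2, 0.6, 0.3]   # cloud persistence

di = difficulty_index(het, phe, cld)
strata = stratify(di)
for score, label in zip(di, strata):
    print(f"DI={score:.3f} -> {label}")
```

Reporting a metric per stratum, rather than one global average, is what lets a failure confined to the high-difficulty patches become visible.
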
## 📁 Repository Structure

### 1. Dataset (`/dataset`)
The dataset is provided in cloud-optimized **Zarr** format, whose chunked layout supports high-throughput parallel reads for distributed deep learning.
* **Source:** MOD09GA (MODIS/Terra Surface Reflectance Daily, Level-2G, Collection 6.1)
* **Spatial Resolution:** 500 m
* **Temporal Window:** June 14 – July 3, 2021 (20 consecutive days)
* **Region of Interest:** Central Europe (lon 0°–20°E, lat 40°–60°N)
* **Ancillary Data:** ESA WorldCover 2021 (10 m, used to extract spatial heterogeneity)
* **Structure:** `[Time, Bands, Height, Width]` tensor holding surface reflectance and multi-label usability masks.

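As a rough illustration of the layout described above, the sketch below builds the 20-day time axis and a synthetic array in `[Time, Bands, Height, Width]` order, then selects one day and blanks out unusable pixels. The band count, the patch size, and the boolean `usable` mask are placeholders chosen for the example. The real arrays, variable names, and mask encoding come from the Zarr store and the data loaders in the GitHub repository.

```python
from datetime import date, timedelta

import numpy as np

# Time axis: 20 consecutive days starting 2021-06-14 (from the dataset description).
start = date(2021, 6, 14)
days = [start + timedelta(days=i) for i in range(20)]

# Synthetic stand-in for the reflectance cube, shape [Time, Bands, Height, Width].
# 7 bands and a 64x64 patch are assumptions for illustration only.
rng = np.random.default_rng(0)
cube = rng.random((len(days), 7, 64, 64), dtype=np.float32)

# Hypothetical boolean usability mask per time step: True where a pixel is usable.
usable = rng.random((len(days), 64, 64)) > 0.3

# Pick one acquisition date and set unusable pixels to NaN in every band.
t = days.index(date(2021, 7, 1))
frame = np.where(usable[t][None, :, :], cube[t], np.nan)  # [Bands, Height, Width]

print(days[0], days[-1], frame.shape)
```
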
### 2. Model Weights (`/experiments_unified`)
This repository includes the PyTorch `.pth` weights for the models evaluated in the study:
* **3D U-Net (`unet_3d_best.pth`):** Task-specific convolutional architecture trained directly on the 500 m MODIS data. In the study it was the most sample-efficient at recovering structure (highest SSIM and lowest RMSE).
* **Prithvi Foundation Model (`prithvi_finetuned.pth` and `prithvi_frozen.pth`):** Weights for the fully fine-tuned and frozen variants of the Prithvi-EO-2.0 Vision Transformer. Its pretrained priors gave the best spectral fidelity in the study (lowest SAM).

## 🚀 Usage
The weights and Zarr datasets hosted here are meant to be used with the data loaders and evaluation engines in our GitHub repository.

Examples for downloading and loading the dataset and models are in the [GitHub README](https://github.com/TadieB/dual_validation_framework).

## 📊 Key Findings
* **Stratified Performance:** Aggregate metrics hide severe architecture-specific failure modes. The deep learning models lose much of their ability to reconstruct coherent structure when evaluated in the High-Difficulty stratum of the DI.
* **Architectural Tradeoffs:** The 3D U-Net excels at structural preservation (SSIM) by exploiting local spatial convolutions, whereas the Prithvi foundation model excels at radiometric consistency (SAM) across domain gaps.
* **Evaluation Artifacts:** Global structural metrics such as SSIM become mathematically unstable on highly structured phenological regions, so the study strongly advocates masked pixel-wise evaluation for local imputation tasks.

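The masked pixel-wise evaluation mentioned in the last finding can be sketched as follows: RMSE and the spectral angle (SAM) are computed only over the pixels that were actually imputed, so errors are not diluted by untouched pixels. This is a minimal illustration with assumed `[Bands, Height, Width]` arrays and hypothetical function names (`masked_rmse`, `masked_sam`), not the evaluation engine from the GitHub repository.

```python
import numpy as np

def masked_rmse(pred, target, gap_mask):
    """RMSE over gap pixels only. pred/target: [B, H, W]; gap_mask: [H, W] bool."""
    diff = (pred - target)[:, gap_mask]            # -> [B, n_gap_pixels]
    return float(np.sqrt(np.mean(diff ** 2)))

def masked_sam(pred, target, gap_mask, eps=1e-12):
    """Mean spectral angle in radians between per-pixel spectra inside the gap."""
    p, t = pred[:, gap_mask], target[:, gap_mask]  # each [B, n_gap_pixels]
    denom = np.linalg.norm(p, axis=0) * np.linalg.norm(t, axis=0)
    cos = (p * t).sum(axis=0) / np.maximum(denom, eps)
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))

# Toy scene: 7 bands, 32x32 pixels, with an 8x8 simulated cloud gap.
rng = np.random.default_rng(0)
target = rng.random((7, 32, 32)) + 0.1
gap_mask = np.zeros((32, 32), dtype=bool)
gap_mask[8:16, 8:16] = True

# Prediction with a 10% brightness error inside the gap; spectral shape unchanged.
pred = target.copy()
pred[:, gap_mask] *= 1.1

rmse = masked_rmse(pred, target, gap_mask)
sam = masked_sam(pred, target, gap_mask)
print(f"masked RMSE={rmse:.4f}, masked SAM={sam:.2e} rad")
```

In this toy case the RMSE is clearly non-zero while the spectral angle stays near zero, which mirrors why the two metrics can rank models differently: one measures magnitude error, the other only the shape of the spectrum.
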
## 📝 Citation
If you use this dataset, the model weights, or the dual validation methodology in your research, please cite our paper:

```bibtex
@article{tadie2026dualvalidation,
  title={A Dual Validation Framework for Analysis-Ready Satellite Data: A Scalable Pipeline and Stratified Performance Analysis},
  author={Tadie B. Medimem and Farid Melgani and Sandro Luigi Fiore and Valentine G. Anantharaj},
  journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
  year={2026},
  publisher={IEEE}
}
```