Dual Validation Framework for Analysis-Ready Satellite Data

This repository contains the curated multitemporal dataset and the trained model weights for the paper: "A Dual Validation Framework for Curating Machine Learning-Ready Satellite Datasets: A Scalable Pipeline and Stratified Analysis."

For the source code, data curation pipeline, and inference scripts, please visit our GitHub Repository.

🌍 Overview

The convergence of petabyte-scale satellite archives and foundation models requires rigorous validation frameworks. This repository hosts data and models used to demonstrate a novel Dual Validation Framework.

Our framework introduces a composite **Difficulty Index (DI)**—synthesizing spatial heterogeneity, phenological variability, and cloud persistence—to stratify model performance beyond standard aggregate metrics. The dataset focuses on the cloud-gap imputation task using multitemporal Earth observation data.
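The exact DI formula is defined in the paper and is not reproduced here. Purely as an illustration of how three components could be combined and used for stratification, the sketch below assumes an equal-weight mean of min-max-normalised scores and a rank-based tercile split into Low, Medium, and High strata. The weighting, the normalisation, the split, the function names, and the toy numbers are all assumptions, not the authors' method.

```python
from statistics import mean


def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def difficulty_index(heterogeneity, phenology, cloud_persistence):
    """Equal-weight mean of the three normalised components (assumed form)."""
    columns = [min_max(c) for c in (heterogeneity, phenology, cloud_persistence)]
    return [mean(triple) for triple in zip(*columns)]


def stratify(di_values):
    """Label each sample Low/Medium/High by rank terciles of the DI (assumed split)."""
    order = sorted(range(len(di_values)), key=lambda i: di_values[i])
    labels = [None] * len(di_values)
    for rank, idx in enumerate(order):
        labels[idx] = ("Low", "Medium", "High")[min(3 * rank // len(order), 2)]
    return labels


# Toy per-patch scores (made-up numbers, not taken from the dataset).
het = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2]
phen = [0.2, 0.4, 0.8, 0.1, 0.9, 0.3]
cloud = [0.0, 0.6, 1.0, 0.2, 0.8, 0.1]

di = difficulty_index(het, phen, cloud)
strata = stratify(di)
```

Reporting a metric per stratum, rather than one aggregate number, is what lets failure modes in the High stratum show up.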

📁 Repository Structure

1. Dataset (`/dataset`)

The dataset is provided in a cloud-optimized Zarr format, ensuring high-throughput parallel access suitable for distributed deep learning.

Source: MOD09GA (MODIS/Terra Surface Reflectance Daily, Level-2G, Collection 6.1)
Spatial Resolution: 500 m
Temporal Window: June 14 – July 3, 2021 (20 consecutive days)
Region of Interest: Central Europe (lon 0°–20°E, lat 40°–60°N)
Ancillary Data: ESA WorldCover 2021 (10 m, for spatial heterogeneity extraction)
Structure: [Time, Bands, Height, Width] tensor representing surface reflectance and multi-label usability masks.
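
As a small illustration of the Time axis, the sketch below maps each index of the 20-day window to its calendar date. It assumes that index 0 corresponds to June 14, 2021 and that days are stored in chronological order; neither is stated explicitly here, so check the Zarr metadata before relying on it. `START`, `N_DAYS`, `time_axis`, and `time_index` are illustrative names, not part of the released code.

```python
from datetime import date, timedelta

START = date(2021, 6, 14)  # first day of the temporal window
N_DAYS = 20                # 20 consecutive daily acquisitions

# One calendar date per index along the Time axis (assumes index 0 = first day).
time_axis = [START + timedelta(days=i) for i in range(N_DAYS)]


def time_index(day):
    """Return the position of `day` along the Time axis, or raise if out of range."""
    offset = (day - START).days
    if not 0 <= offset < N_DAYS:
        raise ValueError(f"{day} is outside the {N_DAYS}-day window")
    return offset
```

With these assumptions, `time_axis[-1]` is July 3, 2021, matching the end of the stated window.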

2. Model Weights (`/experiments_unified`)

This repository includes the .pth PyTorch weights for the models evaluated in the study:

3D U-Net (unet_3d_best.pth): Task-specific convolutional architecture trained directly on the 500 m MODIS data. Among the evaluated models, it showed the best structural reconstruction and sample efficiency (highest SSIM and lowest RMSE).
Prithvi Foundation Model (prithvi_finetuned.pth & prithvi_frozen.pth): Weights for the fully fine-tuned and frozen variants of the Prithvi-EO-2.0 Vision Transformer. Its pretrained priors gave the best spectral fidelity among the evaluated models (lowest SAM).

🚀 Usage

The weights and Zarr datasets hosted here are designed to be used in conjunction with the data loaders and evaluation engines provided in our GitHub repository.

Example usage for downloading and loading the dataset/models can be found in the GitHub README.

📊 Key Findings

Stratified Performance: Aggregate metrics obscure severe architectural failure modes. Deep learning architectures experience significant degradation in structurally coherent reconstruction when evaluated in the High-Difficulty stratum of our DI.
Architectural Tradeoffs: The 3D U-Net excels in structural preservation (SSIM) by leveraging local spatial convolutions, whereas the Prithvi foundation model excels in radiometric consistency (SAM) across domain gaps.
Evaluation Artifacts: The study highlights the mathematical instability of applying global structural metrics (like SSIM) to highly structured phenological regions, strongly advocating for masked pixel-wise evaluations in local imputation tasks.
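
To make the masked pixel-wise evaluation concrete, here is a minimal sketch of two of the metrics named above: RMSE restricted to the imputed (gap) pixels, and the spectral angle (SAM) between a predicted and a reference spectrum. It uses the standard definitions of both metrics on plain Python lists; the function names and toy values are illustrative, and the evaluation code in the GitHub repository may differ in details such as units or averaging.

```python
import math


def masked_rmse(pred, target, gap_mask):
    """RMSE over only the pixels flagged True in gap_mask (the imputed pixels)."""
    sq = [(p - t) ** 2 for p, t, m in zip(pred, target, gap_mask) if m]
    if not sq:
        raise ValueError("gap_mask selects no pixels")
    return math.sqrt(sum(sq) / len(sq))


def spectral_angle(pred_spectrum, target_spectrum):
    """Angle in radians between two band vectors for a single pixel."""
    dot = sum(p * t for p, t in zip(pred_spectrum, target_spectrum))
    norm_p = math.sqrt(sum(p * p for p in pred_spectrum))
    norm_t = math.sqrt(sum(t * t for t in target_spectrum))
    if norm_p == 0 or norm_t == 0:
        raise ValueError("zero-length spectrum")
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    return math.acos(max(-1.0, min(1.0, dot / (norm_p * norm_t))))


# Toy single-band row of four pixels; only pixels 1 and 3 were cloud gaps.
pred = [0.10, 0.20, 0.30, 0.40]
target = [0.10, 0.25, 0.30, 0.30]
gap_mask = [False, True, False, True]

rmse_gaps = masked_rmse(pred, target, gap_mask)   # errors on imputed pixels only
rmse_all = masked_rmse(pred, target, [True] * 4)  # diluted by untouched pixels
```

In this toy case `rmse_all` is smaller than `rmse_gaps` because the unmasked version averages in pixels that were never imputed, which is one reason to score only the gap pixels.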

📝 Citation

If you use this dataset, the model weights, or the dual validation methodology in your research, please cite our paper:

@article{tadie2026dualvalidation,
  title={A Dual Validation Framework for Curating Machine Learning-Ready Satellite Datasets: A Scalable Pipeline and Stratified Analysis},
  author={Tadie B. Medimem and Farid Melgani and Sandro Luigi Fiore and Valentine G. Anantharaj},
  journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
  year={2026},
  publisher={IEEE}
}
