---
license: apache-2.0
language:
- en
pipeline_tag: emulation
tags:
- emulation
- atmosphere radiative transfer models
- hyperspectral
pretty_name: Atmospheric Radiative Transfer Emulation Challenge
title: rtm_emulation
emoji: 🤖
colorFrom: gray
colorTo: green
sdk: static
sdk_version: "latest"
pinned: false
---

Last update: 30-06-2025
<img src="https://elias-ai.eu/wp-content/uploads/2023/09/elias_logo_big-1.png" alt="elias_logo" style="width:15%; display: inline-block; margin-right: 150px;">
<img src="https://elias-ai.eu/wp-content/uploads/2024/01/EN_FundedbytheEU_RGB_WHITE-Outline-1.png" alt="eu_logo" style="width:20%; display: inline-block;">
# **Atmospheric Radiative Transfer Emulation Challenge**

1. [**Introduction**](#introduction)
2. [**Challenge Tasks and Data**](#challenge-tasks-and-data):
   2.1. [**Proposed Experiments**](#proposed-experiments)
   2.2. [**Data Availability and Format**](#data-availability-and-format)
3. [**Evaluation methodology**](#evaluation-methodology)
   3.1. [**Prediction Accuracy**](#prediction-accuracy)
   3.2. [**Computational efficiency**](#computational-efficiency)
   3.3. [**Proposed Protocol**](#proposed-protocol)
4. [**Expected Outcomes**](#expected-outcomes)
## **Benchmark Results**

| **Model** | **MRE A1 (%)** | **MRE A2 (%)** | **MRE B1 (%)** | **MRE B2 (%)** | **Score** | **Runtime (s)** | **Rank** |
|-----------|---------------|---------------|---------------|---------------|----------|----------|--------|
| Jasdeep_Emulator_3 | 0.090 | 3.117 | 0.566 | 6.108 | 1.525 | 89.359 | 1° |
| Hugo2 | 0.144 | 2.868 | 0.610 | 5.033 | 2.300 | 5.382 | 2° |
| rpnn1 | 0.133 | 5.883 | 0.583 | 5.561 | 2.525 | 19.082 | 3° |
| rpgprv2 | 0.176 | 3.835 | 0.640 | 7.050 | 4.000 | 35.650 | 4° |
| Jasdeep_Emulator_2 | 0.886 | 3.895 | 0.768 | 6.176 | 5.625 | 2.078 | 5° |
| Krtek | 0.545 | 7.693 | 0.823 | 7.877 | 6.500 | 0.764 | 6° |
| rpcvae | 0.185 | 11.996 | 0.918 | 15.313 | 6.700 | 0.546 | 7° |
| Jobaman1 | 0.296 | 10.093 | | 23.258 | 7.675 | 6.150 | 8° |
| baseline | 0.998 | 12.604 | 1.084 | 7.072 | 8.150 | 0.241 | 9° |
## **Introduction**

Atmospheric Radiative Transfer Models (RTMs) are crucial in Earth and climate sciences, with applications such as synthetic scene generation, satellite data processing, and numerical weather forecasting. However, their increasing complexity results in a computational burden that limits direct use in operational settings. A practical solution is to interpolate look-up tables (LUTs) of pre-computed RTM simulations generated from long and costly model runs. However, large LUTs are still needed to achieve accurate results, requiring significant time to generate and demanding high memory capacity. Alternative, ad hoc solutions make data processing algorithms mission-specific and lack generalization. These problems are exacerbated for hyperspectral satellite missions, where the data volume of LUTs can increase by one or two orders of magnitude, limiting the applicability of advanced data processing algorithms. In this context, emulation offers an alternative, enabling real-time satellite data processing algorithms while providing high prediction accuracy and adaptability across atmospheric conditions. Emulation replicates the behavior of a deterministic and computationally demanding model using statistical regression algorithms. This approach facilitates the implementation of physics-based inversion algorithms, yielding accurate and computationally efficient model predictions compared to traditional look-up table interpolation methods.

RTM emulation is challenging due to the high-dimensional nature of both the input (~10 dimensions) and output (several thousand dimensions) spaces, and the complex interactions of electromagnetic radiation with the atmosphere. The research implications are vast, with potential breakthroughs in surrogate modeling, uncertainty quantification, and physics-aware AI systems that can significantly contribute to climate and Earth observation sciences.

This challenge will contribute to reducing computational burdens in climate and atmospheric research, enabling (1) faster satellite data processing for applications in remote sensing and weather prediction, (2) improved accuracy in atmospheric correction of hyperspectral imaging data, and (3) more efficient climate simulations, allowing broader exploration of emission pathways aligned with sustainability goals.
## **Challenge Tasks and Data**

Participants in this challenge will develop emulators trained on the provided datasets to predict spectral magnitudes (atmospheric transmittances and reflectances) from input atmospheric and geometric conditions. The challenge is structured around three main tasks: (1) training ML models on predefined datasets, (2) predicting outputs for given test conditions, and (3) evaluating emulator performance based on accuracy.
### **Proposed Experiments**

The challenge includes two primary application test scenarios:

1. **Atmospheric Correction** (`A`): This scenario focuses on the atmospheric correction of hyperspectral satellite imaging data. Emulators will be tested on their ability to reproduce key atmospheric transfer functions that influence radiance measurements, including path radiance, direct/diffuse solar irradiance, and transmittance properties. Full spectral range simulations (400-2500 nm) will be provided at a resolution of 5 cm<sup>-1</sup>.
2. **CO<sub>2</sub> Column Retrieval** (`B`): This scenario is set in the context of atmospheric CO<sub>2</sub> retrieval, modeling how radiation interacts with various gas layers. The emulators will be evaluated on their accuracy in predicting top-of-atmosphere radiance, particularly within the spectral range sensitive to CO<sub>2</sub> absorption (2000-2100 nm) at high spectral resolution (0.1 cm<sup>-1</sup>).

For both scenarios, two test datasets (tracks) will be provided to evaluate (1) interpolation and (2) extrapolation. Each scenario-track combination will be identified using an alphanumeric ID `Sn`, where `S`={`A`,`B`} denotes the scenario and `n`={1,2} denotes the test dataset type (i.e., track). For example, `A2` refers to predictions for the atmospheric correction scenario using the extrapolation dataset. Participants may choose their preferred scenario(s) and tracks; however, we encourage submitting predictions for all test conditions.
### **Data Availability and Format**

Participants will have access to multiple training datasets of atmospheric RTM simulations varying in sample size, input parameters, and spectral range/resolution. These datasets will be generated using Latin Hypercube Sampling to ensure comprehensive input space coverage and minimize issues related to ill-posedness and unrealistic results.
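For intuition, here is a minimal sketch of how a Latin Hypercube design covers a bounded input space, using `scipy.stats.qmc`; the dimensions and bounds are illustrative only and are not the actual settings used to generate the challenge data:

```{python}
from scipy.stats import qmc

# Illustrative only: a 10-dimensional Latin Hypercube design with 2000 points,
# scaled to hypothetical per-parameter bounds (not the challenge settings).
sampler = qmc.LatinHypercube(d=10, seed=0)
unit_sample = sampler.random(n=2000)      # points in [0, 1)^10
l_bounds = [0.0] * 10                     # hypothetical lower bounds
u_bounds = [1.0] * 9 + [360.0]            # hypothetical upper bounds
sample = qmc.scale(unit_sample, l_bounds, u_bounds)
print(sample.shape)                       # (2000, 10)
```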
The training data (i.e., inputs and outputs of RTM simulations) will be stored in [HDF5](https://docs.h5py.org/en/stable/) format with the following structure:
**Dimensions**

| **Name** | **Description** |
|:---:|:---:|
| `n_wvl` | Number of wavelengths for which spectral data is provided |
| `n_funcs` | Number of atmospheric transfer functions |
| `n_comb` | Number of data points at which spectral data is provided |
| `n_param` | Dimensionality of the input variable space |

**Data Components**

| **Name** | **Description** | **Dimensions** | **Datatype** |
|:---:|:---:|:---:|:---:|
| **`LUTdata`** | Atmospheric transfer functions (i.e., outputs) | `n_funcs*n_wvl x n_comb` | single |
| **`LUTHeader`** | Matrix of input variable values for each combination (i.e., inputs) | `n_param x n_comb` | double |
| **`wvl`** | Wavelength values associated with the atmospheric transfer functions (i.e., spectral grid) | `n_wvl` | double |

**Note:** Participants may choose to predict the spectral data either as a single vector of length `n_funcs*n_wvl` or as `n_funcs` separate vectors of length `n_wvl`.
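As a minimal sketch of converting between the two layouts (assuming the transfer functions are stacked block-wise along the first axis; the actual stacking order should be checked against the data):

```{python}
import numpy as np

# Hypothetical sizes for illustration only.
n_funcs, n_wvl, n_comb = 4, 2101, 64
Y = np.zeros((n_funcs * n_wvl, n_comb), dtype=np.float32)  # stacked layout

# Per-function view: blocks[k] holds function k with shape (n_wvl, n_comb).
blocks = Y.reshape(n_funcs, n_wvl, n_comb)
```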
Testing input datasets (i.e., inputs for predictions) will be stored in tabulated `.csv` format with dimensions `n_param x n_comb`. The training and testing datasets will be organized into scenario-specific folders (see [**Proposed experiments**](/datasets/isp-uv-es/rtm_emulation#proposed-experiments)): `scenarioA` (Atmospheric Correction) and `scenarioB` (CO<sub>2</sub> Column Retrieval). Each folder will contain:
- A `train` subfolder with multiple `.h5` files corresponding to different training sample sizes (e.g., `train2000.h5` contains 2000 samples).
- A `reference` subfolder containing two test files (`refInterp` and `refExtrap`) referring to the two aforementioned tracks (i.e., interpolation and extrapolation).
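Based on the description above, the expected layout is roughly as follows (file names other than `train2000.h5` are illustrative):

```
scenarioA/
├── train/
│   ├── ...
│   └── train2000.h5
└── reference/
    ├── refInterp.csv
    └── refExtrap.csv
scenarioB/
└── ...
```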
Here is an example of how to load each dataset in Python:

```{python}
import h5py
import pandas as pd

# Replace with the actual paths to your training and testing data
trainFile = 'train2000.h5'
testFile = 'refInterp.csv'

# Open the H5 file and read the training data
with h5py.File(trainFile, 'r') as h5_file:
    Ytrain = h5_file['LUTdata'][:]
    Xtrain = h5_file['LUTHeader'][:]
    wvl = h5_file['wvl'][:]

# Read the testing data
df = pd.read_csv(testFile)
Xtest = df.to_numpy()
```
in MATLAB:

```{matlab}
% Replace with the actual paths to your training and testing data
trainFile = 'train2000.h5';
testFile = 'refInterp.csv';

% Open the H5 file and read the training data
Ytrain = h5read(trainFile,'/LUTdata');
Xtrain = h5read(trainFile,'/LUTHeader');
wvl = h5read(trainFile,'/wvl');

% Read the testing data
Xtest = importdata(testFile);
```
and in R:

```{r}
library(rhdf5)

# Replace with the actual paths to your training and testing data
trainFile <- "train2000.h5"
testFile <- "refInterp.csv"

# Open the H5 file and read the training data
lut_data <- h5read(trainFile, "LUTdata")
lut_header <- h5read(trainFile, "LUTHeader")
wavelengths <- h5read(trainFile, "wvl")

# Read the testing data
Xtest <- as.matrix(read.table(testFile, sep = ",", header = TRUE))
```
All data will be shared through this [repository](https://huggingface.co/datasets/isp-uv-es/rtm_emulation/tree/main). After the challenge finishes, participants will also have access to the evaluation scripts on [this GitLab](http://to_be_prepared) to ensure transparency and reproducibility.
## **Evaluation methodology**

The evaluation will focus on three key aspects: prediction accuracy, computational efficiency, and extrapolation performance.
### **Prediction Accuracy**

For the **atmospheric correction** scenario (`A`), the predicted atmospheric transfer functions will be used to retrieve surface reflectance from the top-of-atmosphere (TOA) radiance simulations in the testing dataset. The evaluation will proceed as follows:

1. The relative difference between retrieved and reference reflectance will be computed for each spectral channel and sample of the testing dataset.
2. The mean relative error (MRE) will be calculated over the entire reference dataset to assess overall emulator bias.
3. The spectrally-averaged MRE (MRE<sub>λ</sub>) will be computed, excluding wavelengths in the deep H<sub>2</sub>O absorption regions, to ensure direct comparability between participants.
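As an illustration of these steps, here is a minimal sketch of the error computation; the exact H<sub>2</sub>O masking and averaging conventions used in the official evaluation scripts may differ:

```{python}
import numpy as np

def spectral_mre(pred, ref, keep_mask=None):
    """Mean relative error (%) per wavelength, then averaged over the spectrum.

    pred, ref: arrays of shape (n_wvl, n_comb); keep_mask: optional boolean
    array of shape (n_wvl,) that is False inside deep H2O absorption regions.
    """
    rel_err = np.abs(pred - ref) / np.abs(ref)   # per channel and sample
    mre = 100.0 * rel_err.mean(axis=1)           # MRE per wavelength
    if keep_mask is not None:
        mre = mre[keep_mask]                     # exclude masked wavelengths
    return mre.mean()                            # spectrally-averaged MRE
```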
For the **CO<sub>2</sub> retrieval** scenario (`B`), the evaluation will follow the same steps, comparing predicted TOA radiance spectral data against the reference values in the testing dataset.

Since each participant/model can contribute up to four scenario-track combinations, we will consolidate results into a single final ranking using the following process:

1. **Individual ranking**: For each of the four combinations, submissions will be ranked based on their MRE<sub>λ</sub> values. Lower MRE<sub>λ</sub> values correspond to better performance. In the unlikely case of ties, these will be handled by averaging the tied ranks.
2. **Final ranking**: Rankings will be aggregated into a single final score using a weighted average. The following weights will be applied: 0.325 for interpolation and 0.175 for extrapolation tracks. That is:

   **Final score = (0.325 × AC-Interp Rank) + (0.175 × AC-Extrap Rank) + (0.325 × CO2-Interp Rank) + (0.175 × CO2-Extrap Rank)**

3. **Missing Submissions**: If a participant does not submit results for a particular scenario-track combination, they will be placed in the last position for that track.

To ensure fairness in the final ranking, we will use the **standard competition ranking** method in the case of ties. If two or more participants achieve the same weighted average rank, they will be assigned the same final position, and the subsequent rank(s) will be skipped accordingly. For example, if two participants are tied for 1st place, they will both receive rank 1, and the next participant will be ranked 3rd (not 2nd).
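A minimal sketch of this aggregation, assuming per-track ranks have already been computed and that missing submissions receive the last position:

```{python}
# Weights from the final-score formula above.
WEIGHTS = {"A1": 0.325, "A2": 0.175, "B1": 0.325, "B2": 0.175}

def final_score(ranks, n_participants):
    """ranks: dict mapping scenario-track IDs (e.g. 'A1') to this model's rank.

    Tracks missing from `ranks` are assigned the last position.
    """
    return sum(w * ranks.get(track, n_participants)
               for track, w in WEIGHTS.items())

# Example: a model that skipped B2 in a field of 9 participants.
print(final_score({"A1": 2, "A2": 1, "B1": 3}, n_participants=9))  # 3.375
```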
**Note:** While the challenge is open, the daily evaluation of error metrics will be done on a subset of the test data. This prevents participants from obtaining detailed information that would allow them to fine-tune their models. The final results and ranking, evaluated on all the validation data, will be provided at the end date of the challenge.
### **Computational efficiency**

Participants must report the runtime required to generate predictions across different emulator configurations. The average runtime over all scenario-track combinations will be calculated and reported in the table. **Runtime won't be taken into account for the final ranking.** After the competition ends, and to facilitate fair comparisons, participants will be requested to provide a report with hardware specifications, including: CPU, parallelization settings (e.g., multi-threading, GPU acceleration), and RAM availability. Additionally, participants should report key model characteristics, such as the number of operations required for a single prediction and the number of trainable parameters in their ML models; a sketch of the latter is shown below.
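For instance, the trainable-parameter count can be obtained with a one-liner; this sketch assumes a PyTorch model (other frameworks expose similar counts), and the architecture shown is purely illustrative:

```{python}
import torch.nn as nn

# Hypothetical emulator architecture for illustration only.
model = nn.Sequential(nn.Linear(10, 256), nn.ReLU(), nn.Linear(256, 2101))

# Count parameters that are updated during training.
n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {n_trainable}")
```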
All evaluation scripts will be publicly available on GitLab and Hugging Face to ensure fairness, trustworthiness, and transparency.
### **Proposed Protocol**

- Participants must generate emulator predictions on the provided testing datasets before the submission deadline. Multiple emulator models can be submitted.
- Submissions will be made via a [pull request](https://huggingface.co/docs/hub/en/repositories-pull-requests-discussions) to this repository.
- Each submission **MUST** include the prediction results in HDF5 format and a `metadata.json` file.
- The predictions should be stored in a `.h5` file with the same format as the [training data](/datasets/isp-uv-es/rtm_emulation#data-availability-and-format). Note that only the **`LUTdata`** matrix (i.e., the predictions) is needed. A baseline example of this file is available for participants (`baseline_Sn.h5`). We encourage participants to compress their HDF5 files using the deflate option.
- Each prediction file must be stored in the `results` folder of this repository. The prediction files should be named using the emulator/model name followed by the scenario-track ID (e.g., `/results/mymodel_A1.h5`). A global attribute named `runtime` must be included to report the computational efficiency of your model (value expressed in seconds); see the sketch after this list. Note that predictions for different scenario-tracks should be stored in separate files.
- The metadata file (`metadata.json`) shall contain the following information:

```{json}
{
  "name": "model_name",
  "authors": ["author1", "author2"],
  "affiliations": ["affiliation1", "affiliation2"],
  "description": "A brief description of the emulator",
  "url": "[OPTIONAL] URL to the model repository if it is open-source",
  "doi": "DOI to the model publication (if available)",
  "email": "<main_contact_email>"
}
```
- Emulator predictions will be evaluated once per day at 12:00 CET based on the defined metrics.
- After the deadline, teams will be contacted with their evaluation results. If any issues are identified, teams will have up to two weeks to provide the necessary corrections.
- In case of **problems with the pull request** or invalid submitted files, all discussions will be held in the [discussion board](https://huggingface.co/isp-uv-es/rtm_emulation/discussions).
- After all participants have provided the necessary corrections, the results will be published in the discussion section of this repository.
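As referenced in the list above, here is a minimal sketch for writing a prediction file with deflate compression and the `runtime` attribute; the array shape and values are placeholders:

```{python}
import os
import h5py
import numpy as np

# Placeholder predictions with the same layout as the training LUTdata
# (n_funcs*n_wvl x n_comb); replace with your emulator's output.
Ypred = np.random.rand(4 * 2101, 500).astype(np.float32)

os.makedirs("results", exist_ok=True)
with h5py.File("results/mymodel_A1.h5", "w") as f:
    f.create_dataset("LUTdata", data=Ypred, compression="gzip")  # deflate
    f.attrs["runtime"] = 12.3  # prediction runtime in seconds
```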
## **Expected Outcomes**

- No clear superiority of any single methodology across all metrics is expected.
- Participants will benefit from the analysis of scenarios/tracks, which will help them improve their models.
- A research publication will be submitted to a remote sensing journal with the top three winners.
- An overview paper of the challenge will be published in the [ECML-PKDD 2025](https://ecmlpkdd.org/2025/) workshop proceedings.
- The winner will have their registration cost for [ECML-PKDD 2025](https://ecmlpkdd.org/2025/) covered.
- We are exploring the possibility of providing economic prizes for the top three winners. Stay tuned!