Earth-2 Checkpoints: CorrDiff-CMIP6-ERA5
Description
The Corrector Diffusion (CorrDiff) CMIP6-ERA5 model performs spatio-temporal downscaling of global climate data, comprising several surface, atmospheric, land ice and sea ice variables, from the Coupled Model Intercomparison Project Phase 6 (CMIP6) to the ECMWF Reanalysis v5 (ERA5). The CMIP6 source data consists of daily variables on multiple regular and curvilinear grids, which are interpolated onto a common global climate grid at 300-km resolution. The model downscales the input CMIP6 data to hourly data at 25-km resolution. CorrDiff CMIP6-ERA5 enables prediction of high-fidelity stochastic climate phenomena over the globe from low-fidelity input data, which would otherwise require expensive global numerical simulations.
CorrDiff CMIP6-ERA5 is a generative spatio-temporal downscaling model trained over the globe. For details on the CMIP6 grids, see the CMIP6.
This model is ready for commercial/non-commercial use.
License/Terms of Use:
Governing Terms: Use of this model is governed by the NVIDIA Open Model License.
Deployment Geography:
Global.
Use Case:
Climate scientists accelerating climate prediction with AI, financial institutions and insurance companies managing climate risk, utility companies planning energy supply, and public policy makers supporting decision-making.
Reference(s)
Codebase
Model Architecture
Architecture Type: U-Net
Network Architecture: Corrector Diffusion U-Net with 158M parameters
Computational Load
Cumulative Compute: 2.3E15 FLOP
Estimated Energy and Emissions for Model Training: 4.48266 tCO2e
Input
Input Type(s):
- Tensor (74 Surface, Atmospheric, and Oceanic Variables from the previous,
current, and next day + land-sea mask + elevation + solar zenith angle +
distance to the ocean coastline + sine and cosine of the latitude and
longitude)
- Input data hour of the day in 24-hour format
Input Format(s): PyTorch Tensor
Input Parameters:
- Four Dimensional (4D) (batch, variable, latitude, longitude)
- Integer (Hour of the day in 24-hour format)
Other Properties Related to Input:
- 2.8 degree latitude-longitude grid over the globe
- Input spatial resolution: [64, 128]
- Input temporal resolution: 24 hours
- Latitude Coordinates: [90, 87.2, 84.4, ..., -84.4, -87.2, -90]
- Longitude Coordinates: [0, 2.8, 5.6, ..., 354.4, 357.2, 360]
- Input weather variables: va10, vas, prc, ua10, ta850, rls,
tasmin, wap850, hursmax, ua850, ua50, va850, q10,
rlut, va1000, pr, zg1000, sfcWindmax, hurs, ta50, rsus,
sfcWind, wap10, ta500, ua100, hus1000, zg500,
hus250, ua500, ua1000, hursmin, ta700, va250,
hus700, hus100, ua700, wap100, zg100, ta100,
va500, tas, ua250, wap1000, zg700, va100, rlds,
tasmax, va700, clt, rsds, zg100, ta1000, zg850, uas,
wap700, snc, zg50, wap50, zg250, psl, hus50,
hus850, hus500, siconc, ts
For variable name information, review the Lexicon at Earth2Studio.
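As a rough illustration of the input contract described above, the sketch below derives the shape of the 4D input tensor and validates the hour argument. It is a minimal sketch under stated assumptions, not the model's actual preprocessing: it assumes each auxiliary field occupies one channel, that the three daily snapshots are stacked along the variable axis, and that the hour ranges from 0 to 23. The names `input_shape` and `validate_hour` are hypothetical.

```python
# Hypothetical helpers; the channel layout is an assumption, not the published one.
N_CLIMATE_VARS = 74   # variables per day, as stated in this card
N_DAYS = 3            # previous, current, and next day
GRID = (64, 128)      # input spatial resolution (latitude, longitude)

# Assumed: one channel per auxiliary field, plus sin/cos of latitude and longitude.
AUXILIARY_CHANNELS = {
    "land_sea_mask": 1,
    "elevation": 1,
    "solar_zenith_angle": 1,
    "distance_to_coast": 1,
    "sin_cos_lat_lon": 4,
}


def input_shape(batch_size):
    """Return the assumed (batch, variable, latitude, longitude) shape."""
    channels = N_CLIMATE_VARS * N_DAYS + sum(AUXILIARY_CHANNELS.values())
    return (batch_size, channels, *GRID)


def validate_hour(hour):
    """Check that the hour of day is an integer in the assumed range 0..23."""
    if isinstance(hour, bool) or not isinstance(hour, int):
        raise TypeError("hour must be an integer")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return hour
```

Under these assumptions the variable axis holds 74 x 3 + 8 = 230 channels. The real channel count and ordering should be taken from the released checkpoint rather than from this sketch.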
Output
Output Type(s): Tensor (75 Surface, Atmospheric, and Oceanic Variables)
Output Format: PyTorch Tensor
Output Parameters: 5D (batch, samples, variable, latitude, longitude)
Other Properties Related to Output:
- 0.25 degree latitude-longitude grid over the globe
- Output spatial resolution: [721, 1440]
- Output temporal resolution: 1 hour
- Output weather variables: u10m, v10m, u100m, v100m, t2m, sp, msl, tcwv, u50, u100,
u150, u200, u250, u300, u400, u500, u600, u700, u850, u925, u1000, v50, v100, v150,
v200, v250, v300, v400, v500, v600, v700, v850, v925, v1000, z50, z100, z150, z200,
z250, z300, z400, z500, z600, z700, z850, z925, z1000, t50, t100, t150, t200, t250,
t300, t400, t500, t600, t700, t850, t925, t1000, q50, q100, q150, q200, q250, q300,
q400, q500, q600, q700, q850, q925, q1000, sst, d2m
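To make the output geometry concrete, the sketch below builds coordinate vectors for a [721, 1440] grid and the corresponding 5D output shape. It is a sketch assuming a regular grid whose latitudes include both poles and whose longitudes are periodic and exclude 360; under that assumption the spacing works out to 0.25 degrees in both directions. The function names are hypothetical.

```python
N_OUTPUT_VARS = 75          # output variables, as stated in this card
OUTPUT_GRID = (721, 1440)   # output spatial resolution (latitude, longitude)


def regular_grid(n_lat, n_lon):
    """Build latitude/longitude vectors for an assumed regular global grid.

    Assumes latitudes run from 90 to -90 including both poles and
    longitudes start at 0 and stop one step short of 360.
    """
    lat_step = 180.0 / (n_lat - 1)
    lon_step = 360.0 / n_lon
    lats = [90.0 - i * lat_step for i in range(n_lat)]
    lons = [j * lon_step for j in range(n_lon)]
    return lats, lons


def output_shape(batch_size, n_samples):
    """Return the (batch, samples, variable, latitude, longitude) shape."""
    return (batch_size, n_samples, N_OUTPUT_VARS, *OUTPUT_GRID)


lats, lons = regular_grid(*OUTPUT_GRID)
```

Because the input is daily and the output is hourly, one day of input covers 24 output time steps, one per value of the hour argument.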
Software Integration
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Runtime Engine(s):
- PyTorch >= 2.4.0
- PhysicsNeMo >= 1.2.0
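The minimum versions above can be checked before loading the model. The following is a minimal sketch using only the standard library; the dictionary keys and the helper names `parse_version` and `meets_minimum` are hypothetical labels chosen for illustration, and the parser only looks at the leading numeric part of a version string.

```python
import re

# Minimum versions as listed in this card; the keys are illustrative labels.
MIN_VERSIONS = {"torch": (2, 4, 0), "physicsnemo": (1, 2, 0)}


def parse_version(text):
    """Parse the leading 'major.minor[.patch]' part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(package, installed_version):
    """Return True if the installed version satisfies the listed minimum."""
    return parse_version(installed_version) >= MIN_VERSIONS[package]
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "2.10.0" as older than "2.4.0". In a real environment the installed version string could come from `importlib.metadata.version("torch")`.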
Supported Hardware Microarchitecture Compatibility:
- NVIDIA Ampere
- NVIDIA Blackwell
- NVIDIA Hopper
- NVIDIA Turing
Supported Operating System(s):
- Linux
Model Version(s)
Model version: v1
Training, Testing, and Evaluation Datasets:
The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
Training Dataset
Link: CMIP6
Data Collection Method by dataset
- Automatic/Sensors
Labeling Method by dataset
- Automatic/Sensors
Properties (Quantity, Dataset Descriptions, Sensor(s)):
CMIP6 data for the year ranges 1981-1989, 1991-1999, 2001-2009, and 2011-2016. CMIP6 is a
climate dataset with global coverage of the Earth's atmosphere, ocean, and land.
Link: ERA5
Data Collection Method by dataset
- Automatic/Sensors
Labeling Method by dataset
- Automatic/Sensors
Properties (Quantity, Dataset Descriptions, Sensor(s)):
ERA5 data for the year ranges 1981-1989, 1991-1999, 2001-2009, and 2011-2016.
ERA5 is a global hourly reanalysis that blends historical observations with a
consistent modern weather model to produce gridded estimates of past
atmospheric conditions.
Testing Dataset
Link: CMIP6
Data Collection Method by dataset
- Automatic/Sensors
Labeling Method by dataset
- Automatic/Sensors
Properties (Quantity, Dataset Descriptions, Sensor(s)):
CMIP6 data for the years 1980, 1990, and 2000. CMIP6 is a
climate dataset with global coverage of the Earth's atmosphere, ocean, and
land.
Link: ERA5
Data Collection Method by dataset
- Automatic/Sensors
Labeling Method by dataset
- Automatic/Sensors
Properties (Quantity, Dataset Descriptions, Sensor(s)):
ERA5 data for the years 1980, 1990, and 2000. ERA5 is a global
hourly reanalysis that blends historical observations with a consistent modern
weather model to produce gridded estimates of past atmospheric conditions.
Evaluation Dataset
Link: CMIP6
Data Collection Method by dataset
- Automatic/Sensors
Labeling Method by dataset
- Automatic/Sensors
Properties (Quantity, Dataset Descriptions, Sensor(s)):
CMIP6 data for the year 2010. CMIP6 is a
climate dataset with global coverage of the Earth's atmosphere, ocean, and
land.
Link: ERA5
Data Collection Method by dataset
- Automatic/Sensors
Labeling Method by dataset
- Automatic/Sensors
Properties (Quantity, Dataset Descriptions, Sensor(s)):
ERA5 data for the year 2010. ERA5 is a global
hourly reanalysis that blends historical observations with a consistent modern
weather model to produce gridded estimates of past atmospheric conditions.
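The year-based split described in the dataset sections can be written down and sanity-checked in a few lines. This is a sketch of the split as stated in this card, not the training pipeline's own code; the names `expand` and `split_for_year` are hypothetical.

```python
# Year ranges and years as listed in the dataset sections (ranges are inclusive).
TRAIN_RANGES = [(1981, 1989), (1991, 1999), (2001, 2009), (2011, 2016)]
TEST_YEARS = {1980, 1990, 2000}
EVAL_YEARS = {2010}


def expand(ranges):
    """Expand inclusive (start, end) year ranges into a set of years."""
    years = set()
    for start, end in ranges:
        years.update(range(start, end + 1))
    return years


TRAIN_YEARS = expand(TRAIN_RANGES)


def split_for_year(year):
    """Return which split a given year belongs to, or None if unused."""
    if year in TRAIN_YEARS:
        return "train"
    if year in TEST_YEARS:
        return "test"
    if year in EVAL_YEARS:
        return "eval"
    return None
```

Together the three splits cover every year from 1980 through 2016 without overlap: 33 training years, 3 testing years, and 1 evaluation year.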
Inference:
Engine: PyTorch
Test Hardware:
- A100
- H100
- L40S
- RTX6000
Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.