Earth-2 Checkpoints: CorrDiff-CMIP6-ERA5

Description

The Corrector Diffusion (CorrDiff) CMIP6-ERA5 model performs spatio-temporal downscaling of global climate data, comprising several surface, atmospheric, land ice, and sea ice variables, from the Coupled Model Intercomparison Project Phase 6 (CMIP6) to the ECMWF Reanalysis v5 (ERA5). The CMIP6 source data consists of daily variables on multiple regular and curvilinear grids, which are interpolated onto a common global climate grid at 300-km resolution. The model downscales this daily input to hourly data at 25-km resolution. CorrDiff CMIP6-ERA5 predicts high-fidelity stochastic climate phenomena over the globe from low-fidelity input data, a task that would otherwise require expensive global numerical simulations.

CorrDiff CMIP6-ERA5 is a generative spatio-temporal downscaling model trained over the globe. For details on the CMIP6 grids, refer to the CMIP6 documentation.

This model is ready for commercial/non-commercial use.

License/Terms of Use:

Governing Terms: Use of this model is governed by the NVIDIA Open Model License.

Deployment Geography:

Global.

Use Case:

Climate scientists accelerating climate prediction with AI, financial institutions and insurance companies for climate risk management, utilities companies for energy planning, and public policy makers for decision-making.

Reference(s)

Codebase

Model Architecture

Architecture Type: U-Net
Network Architecture: Corrector Diffusion U-Net with 158M parameters

Computational Load

Cumulative Compute: 2.3E15 FLOP
Estimated Energy and Emissions for Model Training: 4.48266 tCO2e

Input

Input Type(s):

  • Tensor (74 Surface, Atmospheric, and Oceanic Variables from the previous, current, and next day + land-sea mask + elevation + solar zenith angle + distance to the ocean coastline + sine and cosine of the latitude and longitude)
  • Input data hour of the day in 24-hour format

Input Format(s): PyTorch Tensor
Input Parameters:

  • Four Dimensional (4D) (batch, variable, latitude, longitude)
  • Integer (Hour of the day in 24-hour format)
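The pairing of one daily input with an integer hour can be sketched as follows. This is a minimal illustration, not the model's actual interface: the helper names `neighbouring_days` and `hourly_targets` are hypothetical, and the hour range 0 to 23 is an assumption based on the "24-hour format" wording.

```python
from datetime import datetime, timedelta


def neighbouring_days(day: datetime) -> tuple[datetime, datetime, datetime]:
    """Return the previous, current, and next day that feed one input sample."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    one_day = timedelta(days=1)
    return (start - one_day, start, start + one_day)


def hourly_targets(day: datetime) -> list[tuple[int, datetime]]:
    """Return (hour, valid_time) pairs for the hourly outputs of one input day.

    Assumption: hours run from 0 to 23 inclusive.
    """
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return [(hour, start + timedelta(hours=hour)) for hour in range(24)]


# One daily input sample yields 24 (hour, valid_time) requests.
targets = hourly_targets(datetime(2010, 1, 1))
```

Under these assumptions, each daily input would be reused 24 times, once per requested hour.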

Other Properties Related to Input:

  • 2.8 degree latitude-longitude grid over the globe
  • Input spatial resolution: [64, 128]
  • Input temporal resolution: 24 hours
  • Latitude Coordinates: [90, 87.2, 84.4, ..., -84.4, -87.2, -90]
  • Longitude Coordinates: [0, 2.8, 5.6, ..., 354.4, 357.2, 360]
  • Input weather variables: va10, vas, prc, ua10, ta850, rls, tasmin, wap850, hursmax, ua850, ua50, va850, q10, rlut, va1000, pr, zg1000, sfcWindmax, hurs, ta50, rsus, sfcWind, wap10, ta500, ua100, hus1000, zg500, hus250, ua500, ua1000, hursmin, ta700, va250, hus700, hus100, ua700, wap100, zg100, ta100, va500, tas, ua250, wap1000, zg700, va100, rlds, tasmax, va700, clt, rsds, zg100, ta1000, zg850, uas, wap700, snc, zg50, wap50, zg250, psl, hus50, hus850, hus500, siconc, ts
    For variable name information, review the Lexicon at Earth2Studio.
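The input grid and the sine and cosine positional inputs listed above could be built roughly as below. This is a sketch under stated assumptions: both end points of each axis are taken as included (as the coordinate lists suggest), and the ordering of the four positional values is hypothetical. With 64 and 128 evenly spaced points, the spacing comes out close to, but not exactly, 2.8 degrees.

```python
import math

N_LAT, N_LON = 64, 128  # input spatial resolution stated in the card


def linspace(start: float, stop: float, num: int) -> list[float]:
    """Evenly spaced values from start to stop, both end points included."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# Assumption: end points included on both axes, north to south and 0 to 360.
lats = linspace(90.0, -90.0, N_LAT)
lons = linspace(0.0, 360.0, N_LON)


def positional_features(lat_deg: float, lon_deg: float) -> tuple[float, float, float, float]:
    """Return (sin lat, cos lat, sin lon, cos lon); this ordering is an assumption."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.sin(lat), math.cos(lat), math.sin(lon), math.cos(lon))


# features[i][j] holds the four positional values for grid cell (i, j).
features = [[positional_features(la, lo) for lo in lons] for la in lats]
```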

Output

Output Type(s): Tensor (75 Surface, Atmospheric, and Oceanic Variables)
Output Format: PyTorch Tensor
Output Parameters: 5D (batch, samples, variable, latitude, longitude)
Other Properties Related to Output:

  • 0.25 degree latitude-longitude grid over the globe
  • Output spatial resolution: [721, 1440]
  • Output temporal resolution: 1 hour
  • Output weather variables: u10m, v10m, u100m, v100m, t2m, sp, msl, tcwv, u50, u100, u150, u200, u250, u300, u400, u500, u600, u700, u850, u925, u1000, v50, v100, v150, v200, v250, v300, v400, v500, v600, v700, v850, v925, v1000, z50, z100, z150, z200, z250, z300, z400, z500, z600, z700, z850, z925, z1000, t50, t100, t150, t200, t250, t300, t400, t500, t600, t700, t850, t925, t1000, q50, q100, q150, q200, q250, q300, q400, q500, q600, q700, q850, q925, q1000, sst, d2m
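The output variable list follows a regular pattern: eight single-level fields, five pressure-level fields on 13 levels, and two trailing fields. The sketch below regenerates the list in the order given above and derives the expected output shape. The helper names are hypothetical, and the grid-spacing arithmetic assumes latitudes include both poles while longitudes stop before 360.

```python
SURFACE = ["u10m", "v10m", "u100m", "v100m", "t2m", "sp", "msl", "tcwv"]
LEVELS = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]
PREFIXES = ["u", "v", "z", "t", "q"]  # one block of 13 levels per prefix
TRAILING = ["sst", "d2m"]

N_LAT, N_LON = 721, 1440  # output spatial resolution stated in the card


def output_variables() -> list[str]:
    """Rebuild the output variable names in the order listed in the card."""
    pressure = [f"{prefix}{level}" for prefix in PREFIXES for level in LEVELS]
    return SURFACE + pressure + TRAILING


def output_shape(batch: int, samples: int) -> tuple[int, int, int, int, int]:
    """Expected 5D shape: (batch, samples, variable, latitude, longitude)."""
    return (batch, samples, len(output_variables()), N_LAT, N_LON)


variables = output_variables()

# Assumption: poles included in latitude, longitude periodic without 360.
lat_step = 180.0 / (N_LAT - 1)
lon_step = 360.0 / N_LON
```

Both `lat_step` and `lon_step` evaluate to 0.25 degrees under these assumptions, and the list has 75 entries, matching the stated variable count.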

Software Integration

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

Runtime Engine(s):

  • PyTorch >= 2.4.0
  • PhysicsNeMo >= 1.2.0
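A quick way to check an environment against these minimum versions is sketched below using only the standard library. The parsing is deliberately simple (it keeps the leading numeric components and ignores pre-release tags), and the distribution name used for PhysicsNeMo is an assumption that may differ from the actual package name.

```python
import re
from importlib import metadata

# Minimum versions from the card. The PhysicsNeMo distribution name is assumed.
REQUIREMENTS = {"torch": "2.4.0", "nvidia-physicsnemo": "1.2.0"}


def parse_version(text: str) -> tuple[int, ...]:
    """Keep leading numeric components, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 3:  # pad so that '2.4' compares equal to '2.4.0'
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check(dist_name: str, minimum: str):
    """Return True or False for an installed package, None if it is missing."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return meets_minimum(installed, minimum)


report = {name: check(name, minimum) for name, minimum in REQUIREMENTS.items()}
```

A value of `None` in `report` means the package was not found under the assumed name.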

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Blackwell
  • NVIDIA Hopper
  • NVIDIA Turing

Supported Operating System(s):

  • Linux

Model Version(s)

Model version: v1

Training, Testing, and Evaluation Datasets:

The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.

Training Dataset

Link: CMIP6

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):

CMIP6 data for the ranges of 1981-1989, 1991-1999, 2001-2009, 2011-2016. The CMIP6 is a climate dataset with global coverage of the Earth's atmosphere, ocean, and land.

Link: ERA5

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):

ERA5 data for the date range of 1981-1989, 1991-1999, 2001-2009, 2011-2016. The ERA5 is a global hourly reanalysis that blends historical observations with a consistent modern weather model to produce gridded estimates of past atmospheric conditions.

Testing Dataset

Link: CMIP6

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):

CMIP6 data for the years 1980, 1990, 2000. The CMIP6 is a climate dataset with global coverage of the Earth's atmosphere, ocean, and land.

Link: ERA5

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):

ERA5 data for the years 1980, 1990, 2000. The ERA5 is a global hourly reanalysis that blends historical observations with a consistent modern weather model to produce gridded estimates of past atmospheric conditions.

Evaluation Dataset

Link: CMIP6

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):

CMIP6 data for the year 2010. The CMIP6 is a climate dataset with global coverage of the Earth's atmosphere, ocean, and land.

Link: ERA5

Data Collection Method by dataset

  • Automatic/Sensors

Labeling Method by dataset

  • Automatic/Sensors

Properties (Quantity, Dataset Descriptions, Sensor(s)):

ERA5 data for the year 2010. The ERA5 is a global hourly reanalysis that blends historical observations with a consistent modern weather model to produce gridded estimates of past atmospheric conditions.
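The training, testing, and evaluation year ranges described above can be written down and checked for overlap as follows. The helper names are hypothetical; the years themselves are taken from the dataset descriptions.

```python
def years(*ranges: tuple[int, int]) -> set[int]:
    """Expand inclusive (first, last) year ranges into a set of years."""
    result: set[int] = set()
    for first, last in ranges:
        result.update(range(first, last + 1))
    return result


TRAIN = years((1981, 1989), (1991, 1999), (2001, 2009), (2011, 2016))
TEST = {1980, 1990, 2000}
EVAL = {2010}


def split_for(year: int) -> str:
    """Return the split a year belongs to, or 'unused' if it is in none."""
    if year in TRAIN:
        return "training"
    if year in TEST:
        return "testing"
    if year in EVAL:
        return "evaluation"
    return "unused"
```

Together the three splits cover every year from 1980 through 2016 exactly once, with the first year of each decade held out of training.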

Inference:

Engine: PyTorch
Test Hardware:

  • A100
  • H100
  • L40S
  • RTX6000

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
