PhysicsNeMo Checkpoints: Atlas

Description:

Atlas is a medium-range generative AI weather model that autoregressively predicts ERA5 variables on a global 0.25-degree latitude-longitude grid.

The model can be sampled multiple times given a single input to produce multiple ensemble members, enabling rapid generation of skillful medium-range ensemble forecasts.
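
As a rough illustration of the ensemble idea above, the sketch below draws several samples from one input by giving each member its own random seed, then summarizes them with a mean and spread. The names `sample_ensemble` and `toy_sampler` are hypothetical, and the toy sampler (input plus Gaussian noise) only stands in for a stochastic model sample; it is not the Atlas or Earth2Studio API.

```python
import random
import statistics


def sample_ensemble(sample_fn, initial_condition, n_members, base_seed=0):
    """Draw n_members samples from the same input, one random seed per member."""
    members = []
    for member in range(n_members):
        rng = random.Random(base_seed + member)  # separate noise stream per member
        members.append(sample_fn(initial_condition, rng))
    return members


def toy_sampler(initial_condition, rng):
    # Hypothetical stand-in for one stochastic model sample:
    # the input value plus unit-variance Gaussian noise.
    return initial_condition + rng.gauss(0.0, 1.0)


members = sample_ensemble(toy_sampler, initial_condition=280.0, n_members=8)
ensemble_mean = statistics.fmean(members)
ensemble_spread = statistics.stdev(members)
print(f"mean={ensemble_mean:.2f} spread={ensemble_spread:.2f}")
```

Seeding each member separately keeps the ensemble reproducible while still giving every member different noise.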

For inference, see NVIDIA Earth2Studio.

This model is for research and development only.

License/Terms of Use:

Governing Terms: Use of this model is governed by the NVIDIA Open Model License.

Deployment Geography:

Global

Use Case:

Global medium-range ensemble weather forecasting

Release Date:

Hugging Face: 01/26/2026 via https://huggingface.co/nvidia/atlas-era5

Model Architecture

Architecture Type: Atlas uses a latent diffusion transformer (DiT) architecture and generates samples using stochastic interpolants.
Network Architecture: Latent diffusion transformer (DiT) with 2.5B parameters, plus a decoder DiT with 1.8B parameters that uses 2D neighborhood attention
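
For readers unfamiliar with stochastic interpolants, the general form used in the literature is shown below as background. This card does not state which schedules or endpoints Atlas uses, so the symbols here (x_0 and x_1 for the two endpoint samples, z for Gaussian noise, and the schedules alpha, beta, gamma) are generic and should not be read as the model's actual configuration.

```latex
x_t = \alpha(t)\,x_0 + \beta(t)\,x_1 + \gamma(t)\,z,
\qquad z \sim \mathcal{N}(0, I), \quad t \in [0, 1],

\alpha(0) = \beta(1) = 1, \qquad
\alpha(1) = \beta(0) = 0, \qquad
\gamma(0) = \gamma(1) = 0 .
```

The boundary conditions make x_t equal x_0 at t = 0 and x_1 at t = 1, so a model trained on this path can transport samples from one endpoint distribution to the other.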

Input:

Input Type(s):

  • Tensor (75 state variables from ERA5)
  • DateTime (NumPy Array)

Input Format(s): PyTorch Tensor / NumPy array
Input Parameters:

  • Five Dimensional (5D) (batch, lead time, variable, latitude, longitude)
  • Input DateTime (1D)

Other Properties Related to Input:

  • Input grid (latitude/longitude) is a global 721x1440 equiangular grid.
  • Input lead time is of size 2, including the current time step and the previous time step 6 hours in the past
  • Input state ERA5 variables: u10m, v10m, u100m, v100m, t2m, sp, msl, tcwv, u50, u100, u150, u200, u250, u300, u400, u500, u600, u700, u850, u925, u1000, v50, v100, v150, v200, v250, v300, v400, v500, v600, v700, v850, v925, v1000, z50, z100, z150, z200, z250, z300, z400, z500, z600, z700, z850, z925, z1000, t50, t100, t150, t200, t250, t300, t400, t500, t600, t700, t850, t925, t1000, q50, q100, q150, q200, q250, q300, q400, q500, q600, q700, q850, q925, q1000, sst, tp
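
The input layout described above can be sketched with plain Python as follows. The grid orientation (latitudes from 90 to -90, longitudes from 0 eastward), the ordering of the two input time steps (older first), and float32 storage are assumptions made for illustration; the names `input_shape` and `input_times` are hypothetical.

```python
from datetime import datetime, timedelta

N_LAT, N_LON = 721, 1440       # global equiangular grid
N_LEAD, N_VAR = 2, 75          # two input time steps, 75 variables
SPACING = 180.0 / (N_LAT - 1)  # 0.25 degrees

# Assumed orientation: north to south, longitudes starting at 0 degrees.
latitudes = [90.0 - i * SPACING for i in range(N_LAT)]
longitudes = [j * SPACING for j in range(N_LON)]


def input_shape(batch_size):
    """Shape of the 5D input: (batch, lead time, variable, latitude, longitude)."""
    return (batch_size, N_LEAD, N_VAR, N_LAT, N_LON)


def input_times(current_time):
    """Valid times of the two input steps (assumed order: 6 hours ago, then now)."""
    return [current_time - timedelta(hours=6), current_time]


shape = input_shape(batch_size=1)
n_values = 1
for size in shape:
    n_values *= size
float32_bytes = 4 * n_values   # assumes 4-byte float32 values

print(shape, f"{float32_bytes / 2**30:.2f} GiB")
print(input_times(datetime(2020, 1, 1, 6)))
```

Under these assumptions, a single batch element holds about 156 million values, which is useful to know when sizing GPU memory for larger batches or ensembles.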

For variable naming information, review the Earth2Studio lexicon.
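
The 75 variable names follow a regular pattern: eight single-level fields, five pressure-level fields on 13 levels each, then `sst` and `tp`. A minimal sketch that rebuilds the list in the order given above is shown below; whether the model's channel order matches this listing order is an assumption, and the constant and function names are hypothetical.

```python
# Pressure levels (hPa) that appear in the variable list.
PRESSURE_LEVELS = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]
SURFACE_VARIABLES = ["u10m", "v10m", "u100m", "v100m", "t2m", "sp", "msl", "tcwv"]
PRESSURE_LEVEL_PREFIXES = ["u", "v", "z", "t", "q"]
TRAILING_VARIABLES = ["sst", "tp"]


def build_variable_names():
    """Rebuild the variable list in the order it is written in this card."""
    names = list(SURFACE_VARIABLES)
    for prefix in PRESSURE_LEVEL_PREFIXES:
        names.extend(f"{prefix}{level}" for level in PRESSURE_LEVELS)
    names.extend(TRAILING_VARIABLES)
    return names


VARIABLES = build_variable_names()
print(len(VARIABLES))  # 8 + 5 * 13 + 2 = 75
```

A lookup such as `VARIABLES.index("z500")` then gives the position of a field along the variable dimension, under the ordering assumption stated above.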

Output:

Output Type(s): Tensor (75 state variables from ERA5)
Output Format: PyTorch Tensor
Output Parameters: Five Dimensional (5D) (batch, lead time, variable, latitude, longitude)
Other Properties Related to Output:

  • Output grid (latitude/longitude) is a global 721x1440 equiangular grid.
  • Output lead time is of size 1, predicting 6 hours in the future.
  • Output state ERA5 variables: u10m, v10m, u100m, v100m, t2m, sp, msl, tcwv, u50, u100, u150, u200, u250, u300, u400, u500, u600, u700, u850, u925, u1000, v50, v100, v150, v200, v250, v300, v400, v500, v600, v700, v850, v925, v1000, z50, z100, z150, z200, z250, z300, z400, z500, z600, z700, z850, z925, z1000, t50, t100, t150, t200, t250, t300, t400, t500, t600, t700, t850, t925, t1000, q50, q100, q150, q200, q250, q300, q400, q500, q600, q700, q850, q925, q1000, sst, tp
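
Because the model consumes two time steps and emits one step 6 hours ahead, a longer forecast is produced by feeding each prediction back in as the newest input. The loop below is a minimal sketch of that autoregressive pattern. `rollout` and `toy_step` are hypothetical names, and `toy_step` is a placeholder for one model step, not the Atlas network or an Earth2Studio call.

```python
from datetime import datetime, timedelta

STEP = timedelta(hours=6)  # forecast step stated in the output properties


def rollout(step_fn, prev_state, curr_state, start_time, n_steps):
    """Autoregressive loop: each prediction becomes the newest input state.

    step_fn is a hypothetical callable (prev_state, curr_state) -> next_state
    standing in for one 6-hour model step.
    """
    forecasts = []
    time = start_time
    for _ in range(n_steps):
        next_state = step_fn(prev_state, curr_state)
        time = time + STEP
        forecasts.append((time, next_state))
        # Slide the two-step input window forward by 6 hours.
        prev_state, curr_state = curr_state, next_state
    return forecasts


def toy_step(prev_state, curr_state):
    # Placeholder dynamics (linear extrapolation), used only to exercise the loop.
    return 2 * curr_state - prev_state


forecasts = rollout(toy_step, prev_state=0.0, curr_state=1.0,
                    start_time=datetime(2020, 1, 1, 0), n_steps=4)
valid_times = [time for time, _ in forecasts]
values = [value for _, value in forecasts]
print(valid_times[-1], values)
```

Four steps of 6 hours cover one day, so a medium-range forecast of N days needs 4 × N calls to the step function per ensemble member.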

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

Software Integration

Runtime Engine(s): Not Applicable
Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Blackwell
  • NVIDIA Hopper

Supported Operating System(s):

  • Linux

The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.

Model Version(s):

Model Version: v1

Training, Testing, and Evaluation Datasets:

Note: The initial model development used the years 1980-2016 as a train set, with 2017-2019 as validation data. The final released model was then trained on data from 1980-2019 and evaluated on the year 2020.
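
The two-phase split in the note above can be written down as a small lookup, which helps avoid leaking evaluation years into training when reproducing the setup. This is a sketch only; the function name `assign_split` and the phase labels `"development"` and `"release"` are hypothetical.

```python
def assign_split(year, phase):
    """Return the split an ERA5 year belongs to, or None if it is unused.

    phase "development": train on 1980-2016, validate on 2017-2019.
    phase "release":     train on 1980-2019, evaluate on 2020.
    """
    if phase == "development":
        if 1980 <= year <= 2016:
            return "train"
        if 2017 <= year <= 2019:
            return "validation"
    elif phase == "release":
        if 1980 <= year <= 2019:
            return "train"
        if year == 2020:
            return "evaluation"
    else:
        raise ValueError(f"unknown phase: {phase!r}")
    return None


for year in (2016, 2018, 2020):
    print(year, assign_split(year, "development"), assign_split(year, "release"))
```

Note that the years 2017-2019 change role between phases: they are held out during development but are part of the training data for the released model.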

Training Dataset:

Link: ERA5

Data Collection Method by dataset:

  • Automatic/Sensors

Labeling Method by dataset:

  • Automatic/Sensors

Data Modality:

  • Gridded geophysical time series

Data Size:

  • 16 TB subset used for model training

Properties: ERA5 data for the period January 1980 - December 2019. ERA5 provides hourly estimates of various atmospheric, land, and oceanic climate variables. The data covers the Earth on a 30 km grid and resolves the atmosphere at 137 levels.

Testing Dataset:

Link: ERA5

Data Collection Method by dataset:

  • Automatic/Sensors

Labeling Method by dataset:

  • Automatic/Sensors

Properties: ERA5 data for the period January 2017 - December 2019. ERA5 provides hourly estimates of various atmospheric, land, and oceanic climate variables. The data covers the Earth on a 30 km grid and resolves the atmosphere at 137 levels.

Evaluation Dataset:

Link: ERA5

Data Collection Method by dataset:

  • Automatic/Sensors

Labeling Method by dataset:

  • Automatic/Sensors

Properties: ERA5 data for the period January 2020 - December 2020. ERA5 provides hourly estimates of various atmospheric, land, and oceanic climate variables. The data covers the Earth on a 30 km grid and resolves the atmosphere at 137 levels.

Inference:

Engine: PyTorch
Test Hardware:

  • A100
  • H100
  • L40S

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
