new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Mar 10

New Adaptive Numerical Methods Based on Dual Formulation of Hyperbolic Conservation Laws

In this paper, we propose an adaptive high-order method for hyperbolic systems of conservation laws. The proposed method is based on a dual formulation approach: Two numerical solutions, corresponding to conservative and nonconservative formulations of the same system, are evolved simultaneously. Since nonconservative schemes are known to produce nonphysical weak solutions near discontinuities, we exploit the difference between these two solutions to construct a smoothness indicator (SI). In smooth regions, the difference between the conservative and nonconservative solutions is of the same order as the truncation error of the underlying discretization, whereas in nonsmooth regions, it is {cal O}(1). We apply this idea to the Euler equations of gas dynamics and define the SI using differences in the momentum and pressure variables. This choice allows us to further distinguish neighborhoods of contact discontinuities from other nonsmooth parts of the computed solution. The resulting classification is used to adaptively select numerical discretizations. In the vicinities of contact discontinuities, we employ the low-dissipation central-upwind numerical flux and a second-order piecewise linear reconstruction with the slopes computed using an overcompressive SBM limiter. Elsewhere, we use an alternative weighted essentially non-oscillatory (A-WENO) framework with the central-upwind finite-volume numerical fluxes and either unlimited (in smooth regions) or Ai-WENO-Z (in the nonsmooth regions away from contact discontinuities) fifth-order interpolation. Numerical results for the one- and two-dimensional compressible Euler equations show that the proposed adaptive method improves both the computational efficiency and resolution of complex flow features compared with the non-adaptive fifth-order A-WENO scheme.

  • 4 authors
·
Jan 27

Machine Learning Parameterization of the Multi-scale Kain-Fritsch (MSKF) Convection Scheme

Warm-sector heavy rainfall often occurs along the coast of South China, and it is usually localized and long-lasting, making it challenging to predict. High-resolution numerical weather prediction (NWP) models are increasingly used to better resolve topographic features and forecast such high-impact weather events. However, when the grid spacing becomes comparable to the length scales of convection, known as the gray zone, the turbulent eddies in the atmospheric boundary layer are only partially resolved and parameterized to some extent. Whether using a convection parameterization (CP) scheme in the gray zone remains controversial. Scale-aware CP schemes are developed to enhance the representation of convective transport within the gray zone. The multi-scale Kain-Fritsch (MSKF) scheme includes modifications that allow for its effective implementation at a grid resolution as high as 2 km. In recent years, there has been an increasing application of machine learning (ML) models to various domains of atmospheric sciences, including the replacement of physical parameterizations with ML models. This work proposes a multi-output bidirectional long short-term memory (Bi-LSTM) model as a replace the scale-aware MSKF CP scheme. The Weather Research and Forecast (WRF) model is used to generate training and testing data over South China at a horizontal resolution of 5 km. Furthermore, the WRF model is coupled with the ML based CP scheme and compared with WRF simulations with original MSKF scheme. The results demonstrate that the Bi-LSTM model can achieve high accuracy, indicating the potential use of ML models to substitute the MSKF scheme in the gray zone.

  • 3 authors
·
Nov 6, 2023

A Neural PDE Solver with Temporal Stencil Modeling

Numerical simulation of non-linear partial differential equations plays a crucial role in modeling physical science and engineering phenomena, such as weather, climate, and aerodynamics. Recent Machine Learning (ML) models trained on low-resolution spatio-temporal signals have shown new promises in capturing important dynamics in high-resolution signals, under the condition that the models can effectively recover the missing details. However, this study shows that significant information is often lost in the low-resolution down-sampled features. To address such issues, we propose a new approach, namely Temporal Stencil Modeling (TSM), which combines the strengths of advanced time-series sequence modeling (with the HiPPO features) and state-of-the-art neural PDE solvers (with learnable stencil modeling). TSM aims to recover the lost information from the PDE trajectories and can be regarded as a temporal generalization of classic finite volume methods such as WENO. Our experimental results show that TSM achieves the new state-of-the-art simulation accuracy for 2-D incompressible Navier-Stokes turbulent flows: it significantly outperforms the previously reported best results by 19.9% in terms of the highly-correlated duration time and reduces the inference latency into 80%. We also show a strong generalization ability of the proposed method to various out-of-distribution turbulent flow settings. Our code is available at "https://github.com/Edward-Sun/TSM-PDE".

  • 3 authors
·
Feb 16, 2023

Space and Time Continuous Physics Simulation From Partial Observations

Modern techniques for physical simulations rely on numerical schemes and mesh-refinement methods to address trade-offs between precision and complexity, but these handcrafted solutions are tedious and require high computational power. Data-driven methods based on large-scale machine learning promise high adaptivity by integrating long-range dependencies more directly and efficiently. In this work, we focus on fluid dynamics and address the shortcomings of a large part of the literature, which are based on fixed support for computations and predictions in the form of regular or irregular grids. We propose a novel setup to perform predictions in a continuous spatial and temporal domain while being trained on sparse observations. We formulate the task as a double observation problem and propose a solution with two interlinked dynamical systems defined on, respectively, the sparse positions and the continuous domain, which allows to forecast and interpolate a solution from the initial condition. Our practical implementation involves recurrent GNNs and a spatio-temporal attention observer capable of interpolating the solution at arbitrary locations. Our model not only generalizes to new initial conditions (as standard auto-regressive models do) but also performs evaluation at arbitrary space and time locations. We evaluate on three standard datasets in fluid dynamics and compare to strong baselines, which are outperformed both in classical settings and in the extended new task requiring continuous predictions.

  • 4 authors
·
Jan 17, 2024

Solving Navier-Stokes Equations Using Data-free Physics-Informed Neural Networks With Hard Boundary Conditions

In recent years, Physics-Informed Neural Networks (PINNs) have emerged as a powerful and robust framework for solving nonlinear differential equations across a wide range of scientific and engineering disciplines, including biology, geophysics, astrophysics and fluid dynamics. In the PINN framework, the governing partial differential equations, along with initial and boundary conditions, are encoded directly into the loss function, enabling the network to learn solutions that are consistent with the underlying physics. In this work, we employ the PINN framework to solve the dimensionless Navier-Stokes equations for three two-dimensional incompressible, steady, laminar flow problems without using any labeled data. The boundary and initial conditions are enforced in a hard manner, ensuring they are satisfied exactly rather than penalized during training. We validate the PINN predicted velocity profiles, drag coefficients and pressure profiles against the conventional computational fluid dynamics (CFD) simulations for moderate to high values of Reynolds number (Re). It is observed that the PINN predictions show good agreement with the CFD results at lower Re. We also extend our analysis to a transient condition and find that our method is equally capable of simulating complex time-dependent flow dynamics. To quantitatively assess the accuracy, we compute the L_2 normalized error, which lies in the range O(10^{-4}) - O(10^{-1}) for our chosen case studies.

  • 4 authors
·
Nov 18, 2025

Atmospheric Transport Modeling of CO_2 with Neural Networks

Accurately describing the distribution of CO_2 in the atmosphere with atmospheric tracer transport models is essential for greenhouse gas monitoring and verification support systems to aid implementation of international climate agreements. Large deep neural networks are poised to revolutionize weather prediction, which requires 3D modeling of the atmosphere. While similar in this regard, atmospheric transport modeling is subject to new challenges. Both, stable predictions for longer time horizons and mass conservation throughout need to be achieved, while IO plays a larger role compared to computational costs. In this study we explore four different deep neural networks (UNet, GraphCast, Spherical Fourier Neural Operator and SwinTransformer) which have proven as state-of-the-art in weather prediction to assess their usefulness for atmospheric tracer transport modeling. For this, we assemble the CarbonBench dataset, a systematic benchmark tailored for machine learning emulators of Eulerian atmospheric transport. Through architectural adjustments, we decouple the performance of our emulators from the distribution shift caused by a steady rise in atmospheric CO_2. More specifically, we center CO_2 input fields to zero mean and then use an explicit flux scheme and a mass fixer to assure mass balance. This design enables stable and mass conserving transport for over 6 months with all four neural network architectures. In our study, the SwinTransformer displays particularly strong emulation skill (90-day R^2 > 0.99), with physically plausible emulation even for forward runs of multiple years. This work paves the way forward towards high resolution forward and inverse modeling of inert trace gases with neural networks.

  • 6 authors
·
Aug 20, 2024

Towards scalable surrogate models based on Neural Fields for large scale aerodynamic simulations

This paper introduces a novel surrogate modeling framework for aerodynamic applications based on Neural Fields. The proposed approach, MARIO (Modulated Aerodynamic Resolution Invariant Operator), addresses non parametric geometric variability through an efficient shape encoding mechanism and exploits the discretization-invariant nature of Neural Fields. It enables training on significantly downsampled meshes, while maintaining consistent accuracy during full-resolution inference. These properties allow for efficient modeling of diverse flow conditions, while reducing computational cost and memory requirements compared to traditional CFD solvers and existing surrogate methods. The framework is validated on two complementary datasets that reflect industrial constraints. First, the AirfRANS dataset consists in a two-dimensional airfoil benchmark with non-parametric shape variations. Performance evaluation of MARIO on this case demonstrates an order of magnitude improvement in prediction accuracy over existing methods across velocity, pressure, and turbulent viscosity fields, while accurately capturing boundary layer phenomena and aerodynamic coefficients. Second, the NASA Common Research Model features three-dimensional pressure distributions on a full aircraft surface mesh, with parametric control surface deflections. This configuration confirms MARIO's accuracy and scalability. Benchmarking against state-of-the-art methods demonstrates that Neural Field surrogates can provide rapid and accurate aerodynamic predictions under the computational and data limitations characteristic of industrial applications.

  • 6 authors
·
May 14, 2025

AB-UPT: Scaling Neural CFD Surrogates for High-Fidelity Automotive Aerodynamics Simulations via Anchored-Branched Universal Physics Transformers

Recent advances in neural surrogate modeling offer the potential for transformative innovations in applications such as automotive aerodynamics. Yet, industrial-scale problems often involve volumetric meshes with cell counts reaching the 100 millions, presenting major scalability challenges. Complex geometries further complicate modeling through intricate surface-volume interactions, while quantities such as vorticity are highly nonlinear and must satisfy strict divergence-free constraints. To address these requirements, we introduce AB-UPT as a novel modeling scheme for building neural surrogates for CFD simulations. AB-UPT is designed to: (i) decouple geometry encoding and prediction tasks via multi-branch operators; (ii) enable scalability to high-resolution outputs via neural simulation in a low-dimensional latent space, coupled with anchored neural field decoders to predict high-fidelity outputs; (iii) enforce physics consistency by a novel divergence-free formulation. We show that AB-UPT yields state-of-the-art predictive accuracy of surface and volume fields on automotive CFD simulations ranging from 33 thousand up to 150 million mesh cells. Furthermore, our anchored neural field architecture enables the enforcement of hard physical constraints on the physics predictions without degradation in performance, exemplified by modeling divergence-free vorticity fields. Notably, the proposed models can be trained on a single GPU in less than a day and predict industry-standard surface and volume fields within seconds. Additionally, we show that the flexible design of our method enables neural simulation from a CAD geometry alone, omitting the need for costly CFD meshing procedures.

  • 7 authors
·
Feb 13, 2025

LESnets (Large-Eddy Simulation nets): Physics-informed neural operator for large-eddy simulation of turbulence

Acquisition of large datasets for three-dimensional (3D) partial differential equations are usually very expensive. Physics-informed neural operator (PINO) eliminates the high costs associated with generation of training datasets, and shows great potential in a variety of partial differential equations. In this work, we employ physics-informed neural operator, encoding the large-eddy simulation (LES) equations directly into the neural operator for simulating three-dimensional incompressible turbulent flows. We develop the LESnets (Large-Eddy Simulation nets) by adding large-eddy simulation equations to two different data-driven models, including Fourier neural operator (FNO) and implicit Fourier neural operator (IFNO) without using label data. Notably, by leveraging only PDE constraints to learn the spatio-temporal dynamics problem, LESnets retains the computational efficiency of data-driven approaches while obviating the necessity for data. Meanwhile, using large-eddy simulation equations as PDE constraints makes it possible to efficiently predict complex turbulence at coarse grids. We investigate the performance of the LESnets with two standard three-dimensional turbulent flows: decaying homogeneous isotropic turbulence and temporally evolving turbulent mixing layer. In the numerical experiments, the LESnets model shows a similar or even better accuracy as compared to traditional large-eddy simulation and data-driven models of FNO and IFNO. Moreover, the well-trained LESnets is significantly faster than traditional LES, and has a similar efficiency as the data-driven FNO and IFNO models. Thus, physics-informed neural operators have a strong potential for 3D nonlinear engineering applications.

  • 6 authors
·
Nov 7, 2024

Going with the Speed of Sound: Pushing Neural Surrogates into Highly-turbulent Transonic Regimes

The widespread use of neural surrogates in automotive aerodynamics, enabled by datasets such as DrivAerML and DrivAerNet++, has primarily focused on bluff-body flows with large wakes. Extending these methods to aerospace, particularly in the transonic regime, remains challenging due to the high level of non-linearity of compressible flows and 3D effects such as wingtip vortices. Existing aerospace datasets predominantly focus on 2D airfoils, neglecting these critical 3D phenomena. To address this gap, we present a new dataset of CFD simulations for 3D wings in the transonic regime. The dataset comprises volumetric and surface-level fields for around 30,000 samples with unique geometry and inflow conditions. This allows computation of lift and drag coefficients, providing a foundation for data-driven aerodynamic optimization of the drag-lift Pareto front. We evaluate several state-of-the-art neural surrogates on our dataset, including Transolver and AB-UPT, focusing on their out-of-distribution (OOD) generalization over geometry and inflow variations. AB-UPT demonstrates strong performance for transonic flowfields and reproduces physically consistent drag-lift Pareto fronts even for unseen wing configurations. Our results demonstrate that AB-UPT can approximate drag-lift Pareto fronts for unseen geometries, highlighting its potential as an efficient and effective tool for rapid aerodynamic design exploration. To facilitate future research, we open-source our dataset at https://huggingface.co/datasets/EmmiAI/Emmi-Wing.

  • 8 authors
·
Nov 26, 2025

Eulerian-Lagrangian particle-based model for diffusional growth for the better parameterization of ISM clouds: A road map for improving climate model through small-scale model using observations

The quantitative prediction of the intensity of rainfall events (light or heavy) has remained a challenge in Numerical Weather Prediction (NWP) models. For the first time the mean coefficient of diffusional growth rates are calculated using an Eulerian-Lagrangian particle-based small-scale model on in situ airborne measurement data of Cloud Aerosol Interaction and Precipitation Enhancement Experiment (CAIPEEX) during monsoon over Indian sub-continent. The results show that diffusional growth rates varies in the range of 0.00025 - 0.0015(cm/s). The generic problem of the overestimation of light rain in NWP models might be related with the choice of cm in the model. It is also shown from DNS experiment using Eulerian-Lagrangian particle-based small-scale model that the relative dispersion is constrained with average values in the range of ~ 0.2 - 0.37 (~ 0.1- 0.26) in less humid (more humid) conditions. This is in agreement with in situ airborne observation (dispersion ~ 0.36) and previous study over Indian sub-continent. The linear relationship between relative dispersion and cloud droplet number concentration (NC) is obtained from this study using CAIPEEX observation over Indian subcontinent. The dispersion based autoconversion-scheme for Indian region must be useful for the Indian summer monsoon precipitation calculation in the general circulation model. The present study also provide valuable guidance for the parameterization of effective radius, important for radiation scheme.

  • 4 authors
·
Mar 2, 2023

Open-source Flux Transport (OFT). I. HipFT -- High-performance Flux Transport

Global solar photospheric magnetic maps play a critical role in solar and heliospheric physics research. Routine magnetograph measurements of the field occur only along the Sun-Earth line, leaving the far-side of the Sun unobserved. Surface Flux Transport (SFT) models attempt to mitigate this by modeling the surface evolution of the field. While such models have long been established in the community (with several releasing public full-Sun maps), none are open source. The Open Source Flux Transport (OFT) model seeks to fill this gap by providing an open and user-extensible SFT model that also builds on the knowledge of previous models with updated numerical and data acquisition/assimilation methods along with additional user-defined features. In this first of a series of papers on OFT, we introduce its computational core: the High-performance Flux Transport (HipFT) code (github.com/predsci/hipft). HipFT implements advection, diffusion, and data assimilation in a modular design that supports a variety of flow models and options. It can compute multiple realizations in a single run across model parameters to create ensembles of maps for uncertainty quantification and is high-performance through the use of multi-CPU and multi-GPU parallelism. HipFT is designed to enable users to easily write extensions, enhancing its flexibility and adaptability. We describe HipFT's model features, validations of its numerical methods, performance of its parallel and GPU-accelerated code implementation, analysis/post-processing options, and example use cases.

  • 8 authors
·
Jan 10, 2025

ClimaX: A foundation model for weather and climate

Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere. These approaches aim to model the non-linear dynamics and complex interactions between multiple variables, which are challenging to approximate. Additionally, many such numerical models are computationally intensive, especially when modeling the atmospheric phenomenon at a fine-grained spatial and temporal resolution. Recent data-driven approaches based on machine learning instead aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of numerical models. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatio-temporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute while maintaining general utility. ClimaX is pre-trained with a self-supervised learning objective on climate datasets derived from CMIP6. The pre-trained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatio-temporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections, even when pretrained at lower resolutions and compute budgets.

  • 5 authors
·
Jan 24, 2023

Geometry aware inference of steady state PDEs using Equivariant Neural Fields representations

Recent advances in Neural Fields have enabled powerful, discretization-invariant methods for learning neural operators that approximate solutions of Partial Differential Equations (PDEs) on general geometries. Building on these developments, we introduce enf2enf, an encoder--decoder methodology for predicting steady-state Partial Differential Equations with non-parameterized geometric variability, based on recently proposed Equivariant Neural Field architectures. In enf2enf, input geometries are encoded into latent point cloud embeddings that inherently preserve geometric grounding and capture local phenomena. The resulting representations are then combined with global parameters and directly decoded into continuous output fields, thus efficiently modeling the coupling between geometry and physics. By leveraging the inductive biases of locality and translation invariance, our approach is able to capture fine-scale physical features as well as complex shape variations, thereby enhancing generalization and physical compliance. Extensive experiments on a high-fidelity aerodynamic dataset, a hyper-elastic material benchmark, and multi-element airfoil geometries, demonstrate that the proposed model achieves superior or competitive performance compared to state-of-the-art graph based, operator learning, and neural field methods. Notably, our method supports real time inference and zero-shot super-resolution, enabling efficient training on low-resolution meshes while maintaining high accuracy on full-scale discretizations.

  • 5 authors
·
Apr 24, 2025

Coherent Structures Governing Transport at Turbulent Interfaces

In an experiment on a turbulent jet, we detect interfacial turbulent layers in a frame that moves, on average, along with the \tnti. This significantly prolongs the observation time of scalar and velocity structures and enables the measurement of two types of Lagrangian coherent structures. One structure, the finite-time Lyapunov field (FTLE), quantifies advective transport barriers of fluid parcels while the other structure highlights barriers of diffusive momentum transport. These two complementary structures depend on large-scale and small-scale motion and are therefore associated with the growth of the turbulent region through engulfment or nibbling, respectively. We detect the \tnti\ from cluster analysis, where we divide the measured scalar field into four clusters. Not only the \tnti\ can be found this way, but also the next, internal, turbulent-turbulent interface. Conditional averages show that these interfaces are correlated with barriers of advective and diffusive transport when the Lagrangian integration time is smaller than the integral time scale. Diffusive structures decorrelate faster since they have a smaller timescale. Conditional averages of these structures at internal turbulent-turbulent interfaces show the same pattern with a more pronounced jump at the interface indicative of a shear layer. This is quite an unexpected outcome, as the internal interface is now defined not by the presence or absence of vorticity, but by conditional vorticity corresponding to two uniform concentration zones. The long-time diffusive momentum flux along Lagrangian paths represents the growth of the turbulent flow into the irrotational domain, a direct demonstration of nibbling. The diffusive flux parallel to the \tnti\ appears to be concentrated in a diffusive superlayer whose width is comparable with the Taylor microscale, which is relatively invariant in time.

  • 5 authors
·
Dec 17, 2024

Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling

Storm-scale convection-allowing models (CAMs) are an important tool for predicting the evolution of thunderstorms and mesoscale convective systems that result in damaging extreme weather. By explicitly resolving convective dynamics within the atmosphere they afford meteorologists the nuance needed to provide outlook on hazard. Deep learning models have thus far not proven skilful at km-scale atmospheric simulation, despite being competitive at coarser resolution with state-of-the-art global, medium-range weather forecasting. We present a generative diffusion model called StormCast, which emulates the high-resolution rapid refresh (HRRR) model-NOAA's state-of-the-art 3km operational CAM. StormCast autoregressively predicts 99 state variables at km scale using a 1-hour time step, with dense vertical resolution in the atmospheric boundary layer, conditioned on 26 synoptic variables. We present evidence of successfully learnt km-scale dynamics including competitive 1-6 hour forecast skill for composite radar reflectivity alongside physically realistic convective cluster evolution, moist updrafts, and cold pool morphology. StormCast predictions maintain realistic power spectra for multiple predicted variables across multi-hour forecasts. Together, these results establish the potential for autoregressive ML to emulate CAMs -- opening up new km-scale frontiers for regional ML weather prediction and future climate hazard dynamical downscaling.

  • 11 authors
·
Aug 20, 2024

Observational signatures of mixing-induced cooling in the Kelvin-Helmholtz instability

Cool (approx 10^4K), dense material permeates the hot (approx 10^6K), tenuous solar corona in form of coronal condensations, for example prominences and coronal rain. As the solar atmosphere evolves, turbulence can drive mixing between the condensations and the surrounding corona, with the mixing layer exhibiting an enhancement in emission from intermediate temperature (approx10^5K) spectral lines, which is often attributed to turbulent heating within the mixing layer. However, radiative cooling is highly efficient at intermediate temperatures and numerical simulations have shown that radiative cooling can far exceed turbulent heating in prominence-corona mixing scenarios. As such the mixing layer can have a net loss of thermal energy, i.e., the mixing layer is cooling rather than heating. Here, we investigate the observational signatures of cooling processes in Kelvin-Helmholtz mixing between a prominence thread and the surrounding solar corona through 2D numerical simulations. Optically thin emission is synthesised for Si IV, along with optically thick emission for Halpha, Ca II K and Mg II h using Lightweaver The Mg II h probes the turbulent mixing layer, whereas Halpha and Ca II K form within the thread and along its boundary respectively. As the mixing evolves, intermediate temperatures form leading to an increase in Si IV emission, which coincides with increased radiative losses. The simulation is dominated by cooling in the mixing layer, rather than turbulent heating, and yet enhanced emission in warm lines is produced. As such, an observational signature of decreased emission in cooler lines and increased emission in hotter lines may be a signature of mixing, rather than an implication of heating.

  • 3 authors
·
Jan 20, 2025

GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations

Nuclear fusion plays a pivotal role in the quest for reliable and sustainable energy production. A major roadblock to viable fusion power is understanding plasma turbulence, which significantly impairs plasma confinement, and is vital for next-generation reactor design. Plasma turbulence is governed by the nonlinear gyrokinetic equation, which evolves a 5D distribution function over time. Due to its high computational cost, reduced-order models are often employed in practice to approximate turbulent transport of energy. However, they omit nonlinear effects unique to the full 5D dynamics. To tackle this, we introduce GyroSwin, the first scalable 5D neural surrogate that can model 5D nonlinear gyrokinetic simulations, thereby capturing the physical phenomena neglected by reduced models, while providing accurate estimates of turbulent heat transport.GyroSwin (i) extends hierarchical Vision Transformers to 5D, (ii) introduces cross-attention and integration modules for latent 3Dleftrightarrow5D interactions between electrostatic potential fields and the distribution function, and (iii) performs channelwise mode separation inspired by nonlinear physics. We demonstrate that GyroSwin outperforms widely used reduced numerics on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. GyroSwin shows promising scaling laws, tested up to one billion parameters, paving the way for scalable neural surrogates for gyrokinetic simulations of plasma turbulence.

Interpretable structural model error discovery from sparse assimilation increments using spectral bias-reduced neural networks: A quasi-geostrophic turbulence test case

Earth system models suffer from various structural and parametric errors in their representation of nonlinear, multi-scale processes, leading to uncertainties in their long-term projections. The effects of many of these errors (particularly those due to fast physics) can be quantified in short-term simulations, e.g., as differences between the predicted and observed states (analysis increments). With the increase in the availability of high-quality observations and simulations, learning nudging from these increments to correct model errors has become an active research area. However, most studies focus on using neural networks, which while powerful, are hard to interpret, are data-hungry, and poorly generalize out-of-distribution. Here, we show the capabilities of Model Error Discovery with Interpretability and Data Assimilation (MEDIDA), a general, data-efficient framework that uses sparsity-promoting equation-discovery techniques to learn model errors from analysis increments. Using two-layer quasi-geostrophic turbulence as the test case, MEDIDA is shown to successfully discover various linear and nonlinear structural/parametric errors when full observations are available. Discovery from spatially sparse observations is found to require highly accurate interpolation schemes. While NNs have shown success as interpolators in recent studies, here, they are found inadequate due to their inability to accurately represent small scales, a phenomenon known as spectral bias. We show that a general remedy, adding a random Fourier feature layer to the NN, resolves this issue enabling MEDIDA to successfully discover model errors from sparse observations. These promising results suggest that with further development, MEDIDA could be scaled up to models of the Earth system and real observations.

  • 3 authors
·
Sep 22, 2023

PhysicsFormer: An Efficient and Fast Attention-Based Physics Informed Neural Network for Solving Incompressible Navier Stokes Equations

Traditional experimental and numerical approaches for fluid dynamics problems often suffer from high computational cost, mesh sensitivity, and limited capability in capturing complex physical behaviors. Moreover, conventional physics-informed neural networks (PINNs) frequently struggle in chaotic and highly unsteady flow regimes. In this work, we propose PhysicsFormer, a fast and efficient transformer-based physics-informed framework that incorporates multi-head encoder-decoder cross-attention. Unlike multilayer perceptron-based PINNs, PhysicsFormer operates on sequential representations constructed from spatio-temporal data, enabling effective learning of long-range temporal dependencies and improved propagation of initial condition information. A data-embedding strategy is employed to convert spatio-temporal points into pseudo-sequences, while a dynamics-weighted loss function replaces the standard PINNs formulation. Owing to its parallel learning structure, PhysicsFormer demonstrates superior computational efficiency compared to existing transformer-based approaches. The framework is validated on Burgers' equation and flow reconstruction governed by the Navier-Stokes equations, achieving mean squared errors on the order of 10^{-6}. In addition, an inverse problem involving parameter identification in the two-dimensional incompressible Navier-Stokes equations is investigated. For clean data, PhysicsFormer achieves zero identification error for both λ_1 and λ_2; under 1% Gaussian noise, the errors are 0.07% for λ_1 and 0% for λ_2. These results demonstrate that PhysicsFormer provides a reliable and computationally efficient surrogate modeling framework for time-dependent fluid flow problems.

  • 3 authors
·
Jan 7

Fully Compressible Magnetohydrodynamic Simulations of Solar Convection Zones with CHORUS++

The objective of this study is to develop a fully compressible magnetohydrodynamic solver for fast simulations of the global dynamo of the Sun using unstructured grids and GPUs. Accurate modeling of the Sun's convective layers is vital to predicting the Sun's behavior, including the solar dynamo and sunspot cycles. Currently, there are many efficient codes capable of conducting these large simulations; however, many assume an anealastic density distribution. The anelastic assumption is capable of producing accurate results for low mach numbers; however, it fails in regions with a higher mach number and a fully compressible flow must be considered. To avoid these issues, Wang et al. [1] created a Compressible High-ORder Unstructured Spectral difference (CHORUS) code for simulating fluid dynamics inside stars and planets. CHORUS++ augmented the CHORUS code to adopt a higher degree of polynomials by using cubed-sphere meshing and transfinite mapping to perform simulations on unstructured grids [2]. Recently, CHORUS++ was further developed for parallel magnetohydrodynamic (MHD) solutions on GPUs at Clarkson University. In this study the solar benchmark problems presented by Chen et al. [2] are extended to unsteady solar dynamo problems, with two different density scale heights. The CHORUS-MHD code is further accelerated by multiple GPUs and used to successfully solve these solar dynamo benchmark problems. [1] Wang, J., Liang, C., and Miesch, M. S., "A Compressible High-Order Unstructured Spectral Difference Code for Stratified Convection in Rotating Spherical Shells," Journal of Computational Physics, Vol. 290, 2015, pp. 90-111. [2] Chen, K., Liang, C., and Wan, M., "Arbitrarily high-order accurate simulations of compressible rotationally constrained convection using a transfinite mapping on cubed-sphere grids," Physics of Fluids, Vol. 35, 2023, p. 086120.

  • 2 authors
·
Feb 24, 2025

Respecting causality is all you need for training physics-informed neural networks

While the popularity of physics-informed neural networks (PINNs) is steadily rising, to this date PINNs have not been successful in simulating dynamical systems whose solution exhibits multi-scale, chaotic or turbulent behavior. In this work we attribute this shortcoming to the inability of existing PINNs formulations to respect the spatio-temporal causal structure that is inherent to the evolution of physical systems. We argue that this is a fundamental limitation and a key source of error that can ultimately steer PINN models to converge towards erroneous solutions. We address this pathology by proposing a simple re-formulation of PINNs loss functions that can explicitly account for physical causality during model training. We demonstrate that this simple modification alone is enough to introduce significant accuracy improvements, as well as a practical quantitative mechanism for assessing the convergence of a PINNs model. We provide state-of-the-art numerical results across a series of benchmarks for which existing PINNs formulations fail, including the chaotic Lorenz system, the Kuramoto-Sivashinsky equation in the chaotic regime, and the Navier-Stokes equations in the turbulent regime. To the best of our knowledge, this is the first time that PINNs have been successful in simulating such systems, introducing new opportunities for their applicability to problems of industrial complexity.

  • 3 authors
·
Mar 14, 2022

A Deep Conjugate Direction Method for Iteratively Solving Linear Systems

We present a novel deep learning approach to approximate the solution of large, sparse, symmetric, positive-definite linear systems of equations. These systems arise from many problems in applied science, e.g., in numerical methods for partial differential equations. Algorithms for approximating the solution to these systems are often the bottleneck in problems that require their solution, particularly for modern applications that require many millions of unknowns. Indeed, numerical linear algebra techniques have been investigated for many decades to alleviate this computational burden. Recently, data-driven techniques have also shown promise for these problems. Motivated by the conjugate gradients algorithm that iteratively selects search directions for minimizing the matrix norm of the approximation error, we design an approach that utilizes a deep neural network to accelerate convergence via data-driven improvement of the search directions. Our method leverages a carefully chosen convolutional network to approximate the action of the inverse of the linear operator up to an arbitrary constant. We train the network using unsupervised learning with a loss function equal to the L^2 difference between an input and the system matrix times the network evaluation, where the unspecified constant in the approximate inverse is accounted for. We demonstrate the efficacy of our approach on spatially discretized Poisson equations with millions of degrees of freedom arising in computational fluid dynamics applications. Unlike state-of-the-art learning approaches, our algorithm is capable of reducing the linear system residual to a given tolerance in a small number of iterations, independent of the problem size. Moreover, our method generalizes effectively to various systems beyond those encountered during training.

  • 6 authors
·
May 22, 2022

FengWu-GHR: Learning the Kilometer-scale Medium-range Global Weather Forecasting

Kilometer-scale modeling of global atmosphere dynamics enables fine-grained weather forecasting and decreases the risk of disastrous weather and climate activity. Therefore, building a kilometer-scale global forecast model is a persistent pursuit in the meteorology domain. Active international efforts have been made in past decades to improve the spatial resolution of numerical weather models. Nonetheless, developing the higher resolution numerical model remains a long-standing challenge due to the substantial consumption of computational resources. Recent advances in data-driven global weather forecasting models utilize reanalysis data for model training and have demonstrated comparable or even higher forecasting skills than numerical models. However, they are all limited by the resolution of reanalysis data and incapable of generating higher-resolution forecasts. This work presents FengWu-GHR, the first data-driven global weather forecasting model running at the 0.09^{circ} horizontal resolution. FengWu-GHR introduces a novel approach that opens the door for operating ML-based high-resolution forecasts by inheriting prior knowledge from a pretrained low-resolution model. The hindcast of weather prediction in 2022 indicates that FengWu-GHR is superior to the IFS-HRES. Furthermore, evaluations on station observations and case studies of extreme events support the competitive operational forecasting skill of FengWu-GHR at the high resolution.

  • 10 authors
·
Jan 28, 2024

Uncertainty Quantification for Multi-fidelity Simulations

The work focuses on gathering high-fidelity and low-fidelity numerical simulations data using Nektar++ (Solver based on Applied Mathematics) and XFOIL respectively. The utilization of the higher polynomial distribution in calculating the Coefficient of lift and drag has demonstrated superior accuracy and precision. Further, Co-kriging Data fusion and Adaptive sampling technique has been used to obtain the precise data predictions for the lift and drag within the confined domain without conducting the costly simulations on HPC clusters. This creates a methodology to quantifying uncertainty in computational fluid dynamics by minimizing the required number of samples. To minimize the reliability on high-fidelity numerical simulations in Uncertainty Quantification, a multi-fidelity strategy has been adopted. The effectiveness of the multi-fidelity deep neural network model has been validated through the approximation of benchmark functions across 1-, 32-, and 100-dimensional, encompassing both linear and nonlinear correlations. The surrogate modelling results showed that multi-fidelity deep neural network model has shown excellent approximation capabilities for the test functions and multi-fidelity deep neural network method has outperformed Co-kriging in effectiveness. In addition to that, multi-fidelity deep neural network model is utilized for the simulation of aleatory uncertainty propagation in 1-, 32-, and 100 dimensional function test, considering both uniform and Gaussian distributions for input uncertainties. The results have shown that multi-fidelity deep neural network model has efficiently predicted the probability density distributions of quantities of interest as well as the statistical moments with precision and accuracy. The Co-Kriging model has exhibited limitations when addressing 32-Dimension problems due to the limitation of memory capacity for storage and manipulation.

  • 1 authors
·
Mar 11, 2025

OneForecast: A Universal Framework for Global and Regional Weather Forecasting

Accurate weather forecasts are important for disaster prevention, agricultural planning, etc. Traditional numerical weather prediction (NWP) methods offer physically interpretable high-accuracy predictions but are computationally expensive and fail to fully leverage rapidly growing historical data. In recent years, deep learning models have made significant progress in weather forecasting, but challenges remain, such as balancing global and regional high-resolution forecasts, excessive smoothing in extreme event predictions, and insufficient dynamic system modeling. To address these issues, this paper proposes a global-regional nested weather forecasting framework (OneForecast) based on graph neural networks. By combining a dynamic system perspective with multi-grid theory, we construct a multi-scale graph structure and densify the target region to capture local high-frequency features. We introduce an adaptive messaging mechanism, using dynamic gating units to deeply integrate node and edge features for more accurate extreme event forecasting. For high-resolution regional forecasts, we propose a neural nested grid method to mitigate boundary information loss. Experimental results show that OneForecast performs excellently across global to regional scales and short-term to long-term forecasts, especially in extreme event predictions. Codes link https://github.com/YuanGao-YG/OneForecast.

  • 14 authors
·
Feb 1, 2025

Deep Learning and Foundation Models for Weather Prediction: A Survey

Physics-based numerical models have been the bedrock of atmospheric sciences for decades, offering robust solutions but often at the cost of significant computational resources. Deep learning (DL) models have emerged as powerful tools in meteorology, capable of analyzing complex weather and climate data by learning intricate dependencies and providing rapid predictions once trained. While these models demonstrate promising performance in weather prediction, often surpassing traditional physics-based methods, they still face critical challenges. This paper presents a comprehensive survey of recent deep learning and foundation models for weather prediction. We propose a taxonomy to classify existing models based on their training paradigms: deterministic predictive learning, probabilistic generative learning, and pre-training and fine-tuning. For each paradigm, we delve into the underlying model architectures, address major challenges, offer key insights, and propose targeted directions for future research. Furthermore, we explore real-world applications of these methods and provide a curated summary of open-source code repositories and widely used datasets, aiming to bridge research advancements with practical implementations while fostering open and trustworthy scientific practices in adopting cutting-edge artificial intelligence for weather prediction. The related sources are available at https://github.com/JimengShi/ DL-Foundation-Models-Weather.

  • 13 authors
·
Jan 12, 2025

What Determines the Brightness of the Magnetically Open Solar Corona?: Insights from Three-dimensional Radiative Magnetohydrodynamic Simulations and Observations

We investigate the relationship between solar coronal holes and open-field regions using three-dimensional radiative magnetohydrodynamic (MHD) simulations combined with remote-sensing observations from the Solar Dynamics Observatory (SDO). Our numerical simulations reveal that magnetically open regions in the corona can exhibit brightness comparable to quiet regions, challenging the conventional view that open-field regions are inherently dark coronal holes. We find that the coronal brightness is primarily determined by the total energy input from photospheric magnetic activities, such as the small-scale dynamo, rather than differences in dissipative processes within the corona. Using synthesized EUV intensity maps, we show that brightness thresholds commonly used to identify coronal holes may overlook open-field regions, especially at lower spatial resolutions. Observational analysis utilizing SDO/HMI and AIA synoptic maps supports our simulation results, demonstrating that magnetic field extrapolation techniques, such as the Potential Field Source Surface (PFSS) model, are sensitive to the chosen parameters, including the source surface height. We suggest that discrepancies in estimates of open magnetic flux (the ``open flux problem'') arise both from the modeling assumptions in coronal magnetic field extrapolation and systematic biases in solar surface magnetic field observations. Our findings indicate the need for reconsidering criteria used to identify coronal holes as indicators of open-field regions to better characterize the solar open magnetic flux.

  • 1 authors
·
Apr 18, 2025

CFDBench: A Large-Scale Benchmark for Machine Learning Methods in Fluid Dynamics

In recent years, applying deep learning to solve physics problems has attracted much attention. Data-driven deep learning methods produce fast numerical operators that can learn approximate solutions to the whole system of partial differential equations (i.e., surrogate modeling). Although these neural networks may have lower accuracy than traditional numerical methods, they, once trained, are orders of magnitude faster at inference. Hence, one crucial feature is that these operators can generalize to unseen PDE parameters without expensive re-training.In this paper, we construct CFDBench, a benchmark tailored for evaluating the generalization ability of neural operators after training in computational fluid dynamics (CFD) problems. It features four classic CFD problems: lid-driven cavity flow, laminar boundary layer flow in circular tubes, dam flows through the steps, and periodic Karman vortex street. The data contains a total of 302K frames of velocity and pressure fields, involving 739 cases with different operating condition parameters, generated with numerical methods. We evaluate the effectiveness of popular neural operators including feed-forward networks, DeepONet, FNO, U-Net, etc. on CFDBnech by predicting flows with non-periodic boundary conditions, fluid properties, and flow domain shapes that are not seen during training. Appropriate modifications were made to apply popular deep neural networks to CFDBench and enable the accommodation of more changing inputs. Empirical results on CFDBench show many baseline models have errors as high as 300% in some problems, and severe error accumulation when performing autoregressive inference. CFDBench facilitates a more comprehensive comparison between different neural operators for CFD compared to existing benchmarks.

  • 3 authors
·
Sep 13, 2023

Community Research Earth Digital Intelligence Twin (CREDIT)

Recent advancements in artificial intelligence (AI) for numerical weather prediction (NWP) have significantly transformed atmospheric modeling. AI NWP models outperform traditional physics-based systems, such as the Integrated Forecast System (IFS), across several global metrics while requiring fewer computational resources. However, existing AI NWP models face limitations related to training datasets and timestep choices, often resulting in artifacts that reduce model performance. To address these challenges, we introduce the Community Research Earth Digital Intelligence Twin (CREDIT) framework, developed at NSF NCAR. CREDIT provides a flexible, scalable, and user-friendly platform for training and deploying AI-based atmospheric models on high-performance computing systems. It offers an end-to-end pipeline for data preprocessing, model training, and evaluation, democratizing access to advanced AI NWP capabilities. We demonstrate CREDIT's potential through WXFormer, a novel deterministic vision transformer designed to predict atmospheric states autoregressively, addressing common AI NWP issues like compounding error growth with techniques such as spectral normalization, padding, and multi-step training. Additionally, to illustrate CREDIT's flexibility and state-of-the-art model comparisons, we train the FUXI architecture within this framework. Our findings show that both FUXI and WXFormer, trained on six-hourly ERA5 hybrid sigma-pressure levels, generally outperform IFS HRES in 10-day forecasts, offering potential improvements in efficiency and forecast accuracy. CREDIT's modular design enables researchers to explore various models, datasets, and training configurations, fostering innovation within the scientific community.

  • 10 authors
·
Nov 8, 2024

FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling

Similar to conventional video generation, current deep learning-based weather prediction frameworks often lack explicit physical constraints, leading to unphysical outputs that limit their reliability for operational forecasting. Among various physical processes requiring proper representation, radiation plays a fundamental role as it drives Earth's weather and climate systems. However, accurate simulation of radiative transfer processes remains challenging for traditional numerical weather prediction (NWP) models due to their inherent complexity and high computational costs. Here, we propose FuXi-RTM, a hybrid physics-guided deep learning framework designed to enhance weather forecast accuracy while enforcing physical consistency. FuXi-RTM integrates a primary forecasting model (FuXi) with a fixed deep learning-based radiative transfer model (DLRTM) surrogate that efficiently replaces conventional radiation parameterization schemes. This represents the first deep learning-based weather forecasting framework to explicitly incorporate physical process modeling. Evaluated over a comprehensive 5-year dataset, FuXi-RTM outperforms its unconstrained counterpart in 88.51% of 3320 variable and lead time combinations, with improvements in radiative flux predictions. By incorporating additional physical processes, FuXi-RTM paves the way for next-generation weather forecasting systems that are both accurate and physically consistent.

  • 5 authors
·
Mar 25, 2025

Implicit factorized transformer approach to fast prediction of turbulent channel flows

Transformer neural operators have recently become an effective approach for surrogate modeling of systems governed by partial differential equations (PDEs). In this paper, we introduce a modified implicit factorized transformer (IFactFormer-m) model which replaces the original chained factorized attention with parallel factorized attention. The IFactFormer-m model successfully performs long-term predictions for turbulent channel flow, whereas the original IFactFormer (IFactFormer-o), Fourier neural operator (FNO), and implicit Fourier neural operator (IFNO) exhibit a poor performance. Turbulent channel flows are simulated by direct numerical simulation using fine grids at friction Reynolds numbers Re_{tau}approx 180,395,590, and filtered to coarse grids for training neural operator. The neural operator takes the current flow field as input and predicts the flow field at the next time step, and long-term prediction is achieved in the posterior through an autoregressive approach. The results show that IFactFormer-m, compared to other neural operators and the traditional large eddy simulation (LES) methods including dynamic Smagorinsky model (DSM) and the wall-adapted local eddy-viscosity (WALE) model, reduces prediction errors in the short term, and achieves stable and accurate long-term prediction of various statistical properties and flow structures, including the energy spectrum, mean streamwise velocity, root mean square (rms) values of fluctuating velocities, Reynolds shear stress, and spatial structures of instantaneous velocity. Moreover, the trained IFactFormer-m is much faster than traditional LES methods. By analyzing the attention kernels, we elucidate the reasons why IFactFormer-m converges faster and achieves a stable and accurate long-term prediction compared to IFactFormer-o. Code and data are available at: https://github.com/huiyu-2002/IFactFormer-m.

  • 3 authors
·
Dec 25, 2024

A Diagnostic Kit for Optical Emission Lines Shaped by Accretion Disc Winds

Blueshifted absorption is the classic spectroscopic signature of an accretion disc wind in X-ray binaries and cataclysmic variables (CVs). However, outflows can also create pure emission lines, especially at optical wavelengths. Therefore, developing other outflow diagnostics for these types of lines is worthwhile. With this in mind, we construct a systematic grid of 3645 synthetic wind-formed H-alpha line profiles for CVs with the radiative transfer code SIROCCO. Our grid yields a variety of line shapes: symmetric, asymmetric, single- to quadruple-peaked, and even P-Cygni profiles. About 20% of these lines -- our `Gold' sample -- have strengths and widths consistent with observations. We use this grid to test a recently proposed method for identifying wind-formed emission lines based on deviations in the wing profile shape: the `excess equivalent width diagnostic diagram'. We find that our `Gold' sample can preferentially populate the suggested `wind regions' of this diagram. However, the method is highly sensitive to the adopted definition of the line profile `wing'. Hence, we propose a refined definition based on the full-width at half maximum to improve the interpretability of the diagnostic diagram. Furthermore, we define an approximate scaling relation for the strengths of wind-formed CV emission lines in terms of the outflow parameters. This relation provides a fast way to assess whether -- and what kind of -- outflow can produce an observed emission line. All our wind-based models are open-source and we provide an easy-to-use web-based tool to browse our full set of H-alpha spectral profiles.

  • 5 authors
·
Sep 2, 2025

Multi-fidelity climate model parameterization for better generalization and extrapolation

Machine-learning-based parameterizations (i.e. representation of sub-grid processes) of global climate models or turbulent simulations have recently been proposed as a powerful alternative to physical, but empirical, representations, offering a lower computational cost and higher accuracy. Yet, those approaches still suffer from a lack of generalization and extrapolation beyond the training data, which is however critical to projecting climate change or unobserved regimes of turbulence. Here we show that a multi-fidelity approach, which integrates datasets of different accuracy and abundance, can provide the best of both worlds: the capacity to extrapolate leveraging the physically-based parameterization and a higher accuracy using the machine-learning-based parameterizations. In an application to climate modeling, the multi-fidelity framework yields more accurate climate projections without requiring major increase in computational resources. Our multi-fidelity randomized prior networks (MF-RPNs) combine physical parameterization data as low-fidelity and storm-resolving historical run's data as high-fidelity. To extrapolate beyond the training data, the MF-RPNs are tested on high-fidelity warming scenarios, +4K, data. We show the MF-RPN's capacity to return much more skillful predictions compared to either low- or high-fidelity (historical data) simulations trained only on one regime while providing trustworthy uncertainty quantification across a wide range of scenarios. Our approach paves the way for the use of machine-learning based methods that can optimally leverage historical observations or high-fidelity simulations and extrapolate to unseen regimes such as climate change.

  • 4 authors
·
Sep 18, 2023

Implicit Neural Spatial Representations for Time-dependent PDEs

Implicit Neural Spatial Representation (INSR) has emerged as an effective representation of spatially-dependent vector fields. This work explores solving time-dependent PDEs with INSR. Classical PDE solvers introduce both temporal and spatial discretizations. Common spatial discretizations include meshes and meshless point clouds, where each degree-of-freedom corresponds to a location in space. While these explicit spatial correspondences are intuitive to model and understand, these representations are not necessarily optimal for accuracy, memory usage, or adaptivity. Keeping the classical temporal discretization unchanged (e.g., explicit/implicit Euler), we explore INSR as an alternative spatial discretization, where spatial information is implicitly stored in the neural network weights. The network weights then evolve over time via time integration. Our approach does not require any training data generated by existing solvers because our approach is the solver itself. We validate our approach on various PDEs with examples involving large elastic deformations, turbulent fluids, and multi-scale phenomena. While slower to compute than traditional representations, our approach exhibits higher accuracy and lower memory consumption. Whereas classical solvers can dynamically adapt their spatial representation only by resorting to complex remeshing algorithms, our INSR approach is intrinsically adaptive. By tapping into the rich literature of classic time integrators, e.g., operator-splitting schemes, our method enables challenging simulations in contact mechanics and turbulent flows where previous neural-physics approaches struggle. Videos and codes are available on the project page: http://www.cs.columbia.edu/cg/INSR-PDE/

  • 5 authors
·
Sep 30, 2022

Localized Heating and Dynamics of the Solar Corona due to a Symbiosis of Waves and Reconnection

The Sun's outer atmosphere, the corona, is maintained at mega-Kelvin temperatures and fills the heliosphere with a supersonic outflowing wind. The dissipation of magnetic waves and direct electric currents are likely to be the most significant processes for heating the corona, but a lively debate exists on their relative roles. Here, we suggest that the two are often intrinsically linked, since magnetic waves may trigger current dissipation, and impulsive reconnection can launch magnetic waves. We present a study of the first of these processes by using a 2D physics-based numerical simulation using the Adaptive Mesh Refined (AMR) Versatile Advection Code (VAC). Magnetic waves such as fast magnetoacoustic waves are often observed to propagate in the large-scale corona and interact with local magnetic structures. The present numerical simulations show how the propagation of magnetic disturbances towards a null point or separator can lead to the accumulation of the electric currents. Lorentz forces can laterally push and vertically stretch the magnetic fields, forming a current sheet with a strong magnetic-field gradient. The magnetic field lines then break and reconnect, and so contribute towards coronal heating. Numerical results are presented that support these ideas and support the concept of a symbiosis between waves and reconnection in heating the solar corona.

  • 9 authors
·
Mar 20, 2025

Lagrangian Coherent Track Initialisation (LCTI)

Advances in time-resolved Particle Tracking Velocimetry (4D-PTV) techniques have been consistently revealed more accurate Lagrangian particle motions. A novel track initialisation technique as a complementary part of 4D-PTV, based on local temporal and spatial coherency of neighbour trajectories, is proposed. The proposed Lagrangian Coherent Track Initialisation (LCTI) applies physics-based Finite Time Lyapunov Exponent (FTLE) to build four frame coherent tracks. We locally determine the boundaries (i.e., ridges) of Lagrangian Coherent Structures (LCS) among neighbour trajectories by using FTLE to distinguish clusters of coherent motions. To evaluate the proposed technique, we created an open-access synthetic Lagrangian and Eulerian dataset of the wake downstream of a smooth cylinder at a Reynolds number equal to 3900 obtained from 3D Direct Numerical Simulation (DNS). The dataset is available to the public. Performance of the proposed method based on three characteristic parameters, temporal scale, particle concentration (i.e., density), and noise ratio, showed robust behaviour in finding true tracks compared to the recent initialisation algorithms. Sensitivity of LCTI to the number of untracked and wrong tracks are also discussed. We address the capability of using the proposed method as a function of a 4D-PTV scheme in the Lagrangian Particle Tracking challenge for a flow with high particle densities. Finally, the LCTI behaviour was assessed in a real jet impingement experiment. LCTI was found to be a reliable tracking tool in complex flow motions, with a strength revealed for flows with high particle concentrations.

  • 4 authors
·
Jun 21, 2021

Lagrangian PINNs: A causality-conforming solution to failure modes of physics-informed neural networks

Physics-informed neural networks (PINNs) leverage neural-networks to find the solutions of partial differential equation (PDE)-constrained optimization problems with initial conditions and boundary conditions as soft constraints. These soft constraints are often considered to be the sources of the complexity in the training phase of PINNs. Here, we demonstrate that the challenge of training (i) persists even when the boundary conditions are strictly enforced, and (ii) is closely related to the Kolmogorov n-width associated with problems demonstrating transport, convection, traveling waves, or moving fronts. Given this realization, we describe the mechanism underlying the training schemes such as those used in eXtended PINNs (XPINN), curriculum regularization, and sequence-to-sequence learning. For an important category of PDEs, i.e., governed by non-linear convection-diffusion equation, we propose reformulating PINNs on a Lagrangian frame of reference, i.e., LPINNs, as a PDE-informed solution. A parallel architecture with two branches is proposed. One branch solves for the state variables on the characteristics, and the second branch solves for the low-dimensional characteristics curves. The proposed architecture conforms to the causality innate to the convection, and leverages the direction of travel of the information in the domain. Finally, we demonstrate that the loss landscapes of LPINNs are less sensitive to the so-called "complexity" of the problems, compared to those in the traditional PINNs in the Eulerian framework.

  • 3 authors
·
May 5, 2022

Extreme Event Prediction with Multi-agent Reinforcement Learning-based Parametrization of Atmospheric and Oceanic Turbulence

Global climate models (GCMs) are the main tools for understanding and predicting climate change. However, due to limited numerical resolutions, these models suffer from major structural uncertainties; e.g., they cannot resolve critical processes such as small-scale eddies in atmospheric and oceanic turbulence. Thus, such small-scale processes have to be represented as a function of the resolved scales via closures (parametrization). The accuracy of these closures is particularly important for capturing climate extremes. Traditionally, such closures are based on heuristics and simplifying assumptions about the unresolved physics. Recently, supervised-learned closures, trained offline on high-fidelity data, have been shown to outperform the classical physics-based closures. However, this approach requires a significant amount of high-fidelity training data and can also lead to instabilities. Reinforcement learning is emerging as a potent alternative for developing such closures as it requires only low-order statistics and leads to stable closures. In Scientific Multi-Agent Reinforcement Learning (SMARL) computational elements serve a dual role of discretization points and learning agents. We leverage SMARL and fundamentals of turbulence physics to learn closures for prototypes of atmospheric and oceanic turbulence. The policy is trained using only the enstrophy spectrum, which is nearly invariant and can be estimated from a few high-fidelity samples (these few samples are far from enough for supervised/offline learning). We show that these closures lead to stable low-resolution simulations that, at a fraction of the cost, can reproduce the high-fidelity simulations' statistics, including the tails of the probability density functions. The results demonstrate the high potential of SMARL for closure modeling for GCMs, especially in the regime of scarce data and indirect observations.

  • 5 authors
·
Dec 1, 2023

Fusion-DeepONet: A Data-Efficient Neural Operator for Geometry-Dependent Hypersonic and Supersonic Flows

Shape optimization is essential in aerospace vehicle design, including reentry systems, and propulsion system components, as it directly influences aerodynamic efficiency, structural integrity, and overall mission success. Rapid and accurate prediction of external and internal flows accelerates design iterations. To this end, we develop a new variant of DeepONet, called Fusion-DeepONet as a fast surrogate model for geometry-dependent hypersonic and supersonic flow fields. We evaluated Fusion-DeepONet in learning two external hypersonic flows and a supersonic shape-dependent internal flow problem. First, we compare the performance of Fusion-DeepONet with state-of-the-art neural operators to learn inviscid hypersonic flow around semi-elliptic blunt bodies for two grid types: uniform Cartesian and irregular grids. Fusion-DeepONet provides comparable accuracy to parameter-conditioned U-Net on uniform grids while outperforming MeshGraphNet and Vanilla-DeepONet on irregular grids. Fusion-DeepONet requires significantly fewer trainable parameters than U-Net, MeshGraphNet, and FNO. For the second hypersonic problem, we set up Fusion-DeepONet to map from geometry and free stream Mach number to the temperature field around a reentry capsule traveling at hypersonic speed. This fast surrogate is then improved to predict the spatial derivative of the temperature, resulting in an accurate prediction of heat flux at the surfaces of the capsule. To enhance the accuracy of spatial derivative prediction, we introduce a derivative-enhanced loss term with the least computation overhead. For the third problem, we show that Fusion-DeepONet outperforms MeshGraphNet in learning geometry-dependent supersonic flow in a converging-diverging nozzle configuration. For all the problems, we used high-fidelity simulations with a high-order entropy-stable DGSEM solver to generate training datasets with limited samples.

  • 3 authors
·
Jan 3, 2025

HiRO-ACE: Fast and skillful AI emulation and downscaling trained on a 3 km global storm-resolving model

Kilometer-scale simulations of the atmosphere are an important tool for assessing local weather extremes and climate impacts, but computational expense limits their use to small regions, short periods, and limited ensembles. Machine learning offers a pathway to efficiently emulate these high-resolution simulations. Here we introduce HiRO-ACE, a two-stage AI modeling framework combining a stochastic version of the Ai2 Climate Emulator (ACE2S) with diffusion-based downscaling (HiRO) to generate 3 km precipitation fields over arbitrary regions of the globe. Both components are trained on data derived from a decade of atmospheric simulation by X-SHiELD, a 3 km global storm-resolving model. HiRO performs a 32x downscaling--generating 3 km 6-hourly precipitation from coarse 100 km inputs by training on paired high-resolution and coarsened X-SHiELD outputs. ACE2S is a 1^circ times 1^circ (sim100 km) stochastic autoregressive global atmosphere emulator that maintains grid-scale precipitation variability consistent with coarsened X-SHiELD, enabling its outputs to be ingested by HiRO without additional tuning. HiRO-ACE reproduces the distribution of extreme precipitation rates through the 99.99th percentile, with time-mean precipitation biases below 10% almost everywhere. The framework generates plausible tropical cyclones, fronts, and convective events from poorly resolved coarse inputs. Its computational efficiency allows generation of 6-hourly high-resolution regional precipitation for decades of simulated climate within a single day using one H100 GPU, while the probabilistic design enables ensemble generation for quantifying uncertainty. This establishes an AI-enabled pathway for affordably leveraging the realism of expensive km-scale simulations to support local climate adaptation planning and extreme event risk assessment.

  • 8 authors
·
Dec 20, 2025

Prithvi WxC: Foundation Model for Weather and Climate

Triggered by the realization that AI emulators can rival the performance of traditional numerical weather prediction models running on HPC systems, there is now an increasing number of large AI models that address use cases such as forecasting, downscaling, or nowcasting. While the parallel developments in the AI literature focus on foundation models -- models that can be effectively tuned to address multiple, different use cases -- the developments on the weather and climate side largely focus on single-use cases with particular emphasis on mid-range forecasting. We close this gap by introducing Prithvi WxC, a 2.3 billion parameter foundation model developed using 160 variables from the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture, incorporating concepts from various recent transformer models to effectively capture both regional and global dependencies in the input data. The model has been designed to accommodate large token counts to model weather phenomena in different topologies at fine resolutions. Furthermore, it is trained with a mixed objective that combines the paradigms of masked reconstruction with forecasting. We test the model on a set of challenging downstream tasks namely: Autoregressive rollout forecasting, Downscaling, Gravity wave flux parameterization, and Extreme events estimation. The pretrained model with 2.3 billion parameters, along with the associated fine-tuning workflows, has been publicly released as an open-source contribution via Hugging Face.

  • 29 authors
·
Sep 20, 2024 4

Towards Universal Mesh Movement Networks

Solving complex Partial Differential Equations (PDEs) accurately and efficiently is an essential and challenging problem in all scientific and engineering disciplines. Mesh movement methods provide the capability to improve the accuracy of the numerical solution without increasing the overall mesh degree of freedom count. Conventional sophisticated mesh movement methods are extremely expensive and struggle to handle scenarios with complex boundary geometries. However, existing learning-based methods require re-training from scratch given a different PDE type or boundary geometry, which limits their applicability, and also often suffer from robustness issues in the form of inverted elements. In this paper, we introduce the Universal Mesh Movement Network (UM2N), which -- once trained -- can be applied in a non-intrusive, zero-shot manner to move meshes with different size distributions and structures, for solvers applicable to different PDE types and boundary geometries. UM2N consists of a Graph Transformer (GT) encoder for extracting features and a Graph Attention Network (GAT) based decoder for moving the mesh. We evaluate our method on advection and Navier-Stokes based examples, as well as a real-world tsunami simulation case. Our method outperforms existing learning-based mesh movement methods in terms of the benchmarks described above. In comparison to the conventional sophisticated Monge-Amp\`ere PDE-solver based method, our approach not only significantly accelerates mesh movement, but also proves effective in scenarios where the conventional method fails. Our project page is at https://erizmr.github.io/UM2N/.

  • 8 authors
·
Jun 29, 2024

Generative Nowcasting of Marine Fog Visibility in the Grand Banks area and Sable Island in Canada

This study presents the application of generative deep learning techniques to evaluate marine fog visibility nowcasting using the FATIMA (Fog and turbulence interactions in the marine atmosphere) campaign observations collected during July 2022 in the North Atlantic in the Grand Banks area and vicinity of Sable Island (SI), northeast of Canada. The measurements were collected using the Vaisala Forward Scatter Sensor model FD70 and Weather Transmitter model WXT50, and Gill R3A ultrasonic anemometer mounted on the Research Vessel Atlantic Condor. To perform nowcasting, the time series of fog visibility (Vis), wind speed, dew point depression, and relative humidity with respect to water were preprocessed to have lagged time step features. Generative nowcasting of Vis time series for lead times of 30 and 60 minutes were performed using conditional generative adversarial networks (cGAN) regression at visibility thresholds of Vis < 1 km and < 10 km. Extreme gradient boosting (XGBoost) was used as a baseline method for comparison against cGAN. At the 30 min lead time, Vis was best predicted with cGAN at Vis < 1 km (RMSE = 0.151 km) and with XGBoost at Vis < 10 km (RMSE = 2.821 km). At the 60 min lead time, Vis was best predicted with XGBoost at Vis < 1 km (RMSE = 0.167 km) and Vis < 10 km (RMSE = 3.508 km), but the cGAN RMSE was similar to XGBoost. Despite nowcasting Vis at 30 min being quite difficult, the ability of the cGAN model to track the variation in Vis at 1 km suggests that there is potential for generative analysis of marine fog visibility using observational meteorological parameters.

  • 7 authors
·
Feb 9, 2024

Spectral-Refiner: Fine-Tuning of Accurate Spatiotemporal Neural Operator for Turbulent Flows

Recent advancements in operator-type neural networks have shown promising results in approximating the solutions of spatiotemporal Partial Differential Equations (PDEs). However, these neural networks often entail considerable training expenses, and may not always achieve the desired accuracy required in many scientific and engineering disciplines. In this paper, we propose a new Spatiotemporal Fourier Neural Operator (SFNO) that learns maps between Bochner spaces, and a new learning framework to address these issues. This new paradigm leverages wisdom from traditional numerical PDE theory and techniques to refine the pipeline of commonly adopted end-to-end neural operator training and evaluations. Specifically, in the learning problems for the turbulent flow modeling by the Navier-Stokes Equations (NSE), the proposed architecture initiates the training with a few epochs for SFNO, concluding with the freezing of most model parameters. Then, the last linear spectral convolution layer is fine-tuned without the frequency truncation. The optimization uses a negative Sobolev norm for the first time as the loss in operator learning, defined through a reliable functional-type a posteriori error estimator whose evaluation is almost exact thanks to the Parseval identity. This design allows the neural operators to effectively tackle low-frequency errors while the relief of the de-aliasing filter addresses high-frequency errors. Numerical experiments on commonly used benchmarks for the 2D NSE demonstrate significant improvements in both computational efficiency and accuracy, compared to end-to-end evaluation and traditional numerical PDE solvers.

  • 4 authors
·
May 27, 2024

DRIFT-Net: A Spectral--Coupled Neural Operator for PDEs Learning

Learning PDE dynamics with neural solvers can significantly improve wall-clock efficiency and accuracy compared with classical numerical solvers. In recent years, foundation models for PDEs have largely adopted multi-scale windowed self-attention, with the scOT backbone in Poseidon serving as a representative example. However, because of their locality, truly globally consistent spectral coupling can only be propagated gradually through deep stacking and window shifting. This weakens global coupling and leads to error accumulation and drift during closed-loop rollouts. To address this, we propose DRIFT-Net. It employs a dual-branch design comprising a spectral branch and an image branch. The spectral branch is responsible for capturing global, large-scale low-frequency information, whereas the image branch focuses on local details and nonstationary structures. Specifically, we first perform controlled, lightweight mixing within the low-frequency range. Then we fuse the spectral and image paths at each layer via bandwise weighting, which avoids the width inflation and training instability caused by naive concatenation. The fused result is transformed back into the spatial domain and added to the image branch, thereby preserving both global structure and high-frequency details across scales. Compared with strong attention-based baselines, DRIFT-Net achieves lower error and higher throughput with fewer parameters under identical training settings and budget. On Navier--Stokes benchmarks, the relative L_{1} error is reduced by 7\%--54\%, the parameter count decreases by about 15\%, and the throughput remains higher than scOT. Ablation studies and theoretical analyses further demonstrate the stability and effectiveness of this design. The code is available at https://github.com/cruiseresearchgroup/DRIFT-Net.

Fuxi-DA: A Generalized Deep Learning Data Assimilation Framework for Assimilating Satellite Observations

Data assimilation (DA), as an indispensable component within contemporary Numerical Weather Prediction (NWP) systems, plays a crucial role in generating the analysis that significantly impacts forecast performance. Nevertheless, the development of an efficient DA system poses significant challenges, particularly in establishing intricate relationships between the background data and the vast amount of multi-source observation data within limited time windows in operational settings. To address these challenges, researchers design complex pre-processing methods for each observation type, leveraging approximate modeling and the power of super-computing clusters to expedite solutions. The emergence of deep learning (DL) models has been a game-changer, offering unified multi-modal modeling, enhanced nonlinear representation capabilities, and superior parallelization. These advantages have spurred efforts to integrate DL models into various domains of weather modeling. Remarkably, DL models have shown promise in matching, even surpassing, the forecast accuracy of leading operational NWP models worldwide. This success motivates the exploration of DL-based DA frameworks tailored for weather forecasting models. In this study, we introduces FuxiDA, a generalized DL-based DA framework for assimilating satellite observations. By assimilating data from Advanced Geosynchronous Radiation Imager (AGRI) aboard Fengyun-4B, FuXi-DA consistently mitigates analysis errors and significantly improves forecast performance. Furthermore, through a series of single-observation experiments, Fuxi-DA has been validated against established atmospheric physics, demonstrating its consistency and reliability.

  • 6 authors
·
Apr 12, 2024

Evaluating Uncertainty Quantification approaches for Neural PDEs in scientific applications

The accessibility of spatially distributed data, enabled by affordable sensors, field, and numerical experiments, has facilitated the development of data-driven solutions for scientific problems, including climate change, weather prediction, and urban planning. Neural Partial Differential Equations (Neural PDEs), which combine deep learning (DL) techniques with domain expertise (e.g., governing equations) for parameterization, have proven to be effective in capturing valuable correlations within spatiotemporal datasets. However, sparse and noisy measurements coupled with modeling approximation introduce aleatoric and epistemic uncertainties. Therefore, quantifying uncertainties propagated from model inputs to outputs remains a challenge and an essential goal for establishing the trustworthiness of Neural PDEs. This work evaluates various Uncertainty Quantification (UQ) approaches for both Forward and Inverse Problems in scientific applications. Specifically, we investigate the effectiveness of Bayesian methods, such as Hamiltonian Monte Carlo (HMC) and Monte-Carlo Dropout (MCD), and a more conventional approach, Deep Ensembles (DE). To illustrate their performance, we take two canonical PDEs: Burger's equation and the Navier-Stokes equation. Our results indicate that Neural PDEs can effectively reconstruct flow systems and predict the associated unknown parameters. However, it is noteworthy that the results derived from Bayesian methods, based on our observations, tend to display a higher degree of certainty in their predictions as compared to those obtained using the DE. This elevated certainty in predictions suggests that Bayesian techniques might underestimate the true underlying uncertainty, thereby appearing more confident in their predictions than the DE approach.

FlamePINN-1D: Physics-informed neural networks to solve forward and inverse problems of 1D laminar flames

Given the existence of various forward and inverse problems in combustion studies and applications that necessitate distinct methods for resolution, a framework to solve them in a unified way is critically needed. A promising approach is the integration of machine learning methods with governing equations of combustion systems, which exhibits superior generality and few-shot learning ability compared to purely data-driven methods. In this work, the FlamePINN-1D framework is proposed to solve the forward and inverse problems of 1D laminar flames based on physics-informed neural networks. Three cases with increasing complexity have been tested: Case 1 are freely-propagating premixed (FPP) flames with simplified physical models, while Case 2 and Case 3 are FPP and counterflow premixed (CFP) flames with detailed models, respectively. For forward problems, FlamePINN-1D aims to solve the flame fields and infer the unknown eigenvalues (such as laminar flame speeds) under the constraints of governing equations and boundary conditions. For inverse problems, FlamePINN-1D aims to reconstruct the continuous fields and infer the unknown parameters (such as transport and chemical kinetics parameters) from noisy sparse observations of the flame. Our results strongly validate these capabilities of FlamePINN-1D across various flames and working conditions. Compared to traditional methods, FlamePINN-1D is differentiable and mesh-free, exhibits no discretization errors, and is easier to implement for inverse problems. The inverse problem results also indicate the possibility of optimizing chemical mechanisms from measurements of laboratory 1D flames. Furthermore, some proposed strategies, such as hard constraints and thin-layer normalization, are proven to be essential for the robust learning of FlamePINN-1D. The code for this paper is partially available at https://github.com/CAME-THU/FlamePINN-1D.

  • 6 authors
·
Jun 7, 2024

A Space-Time Transformer for Precipitation Forecasting

Meteorological agencies around the world rely on real-time flood guidance to issue live-saving advisories and warnings. For decades traditional numerical weather prediction (NWP) models have been state-of-the-art for precipitation forecasting. However, physically-parameterized models suffer from a few core limitations: first, solving PDEs to resolve atmospheric dynamics is computationally demanding, and second, these methods degrade in performance at nowcasting timescales (i.e., 0-4 hour lead-times). Motivated by these shortcomings, recent work proposes AI-weather prediction (AI-WP) alternatives that learn to emulate analysis data with neural networks. While these data-driven approaches have enjoyed enormous success across diverse spatial and temporal resolutions, applications of video-understanding architectures for weather forecasting remain underexplored. To address these gaps, we propose SaTformer: a video transformer built on full space-time attention that skillfully forecasts extreme precipitation from satellite radiances. Along with our novel architecture, we introduce techniques to tame long-tailed precipitation datasets. Namely, we reformulate precipitation regression into a classification problem, and employ a class-weighted loss to address label imbalances. Our model scored first place on the NeurIPS Weather4Cast 2025 Cumulative Rainfall challenge. Code and model weights are available: https://github.com/leharris3/satformer

  • 2 authors
·
Nov 14, 2025

Stochastic acceleration in arbitrary astrophysical environments

Turbulent magnetic fields are to some extent a universal feature in astrophysical phenomena. Charged particles that encounter these turbulence get on average accelerated according to the so-called second-order Fermi process. However, in most astrophysical environments there are additional competing processes, such as different kinds of first-order energy changes and particle escape, that effect the resulting momentum distribution of the particles. In this work we provide to our knowledge the first semi-analytical solution of the isotropic steady-state momentum diffusion equation including continuous and catastrophic momentum changes that can be applied to any arbitrary astrophysical system of interest. Here, we adopt that the assigned magnetic turbulence is constrained on a finite range and the particle flux vanishes beyond these boundaries. Consequently, we show that the so-called pile-up bump -- that has for some special cases long been established -- is a universal feature of stochastic acceleration that emerges around the momentum chi_{rm eq} where acceleration and continuous loss are in equilibrium if the particle's residence time in the system is sufficient at chi_{rm eq}. In general, the impact of continuous and catastrophic momentum changes plays a crucial role in the shape of the steady-state momentum distribution of the accelerated particles, where simplified unbroken power-law approximations are often not adequate.

  • 2 authors
·
Nov 22, 2024

PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers

Time-dependent partial differential equations (PDEs) are ubiquitous in science and engineering. Recently, mostly due to the high computational cost of traditional solution techniques, deep neural network based surrogates have gained increased interest. The practical utility of such neural PDE solvers relies on their ability to provide accurate, stable predictions over long time horizons, which is a notoriously hard problem. In this work, we present a large-scale analysis of common temporal rollout strategies, identifying the neglect of non-dominant spatial frequency information, often associated with high frequencies in PDE solutions, as the primary pitfall limiting stable, accurate rollout performance. Based on these insights, we draw inspiration from recent advances in diffusion models to introduce PDE-Refiner; a novel model class that enables more accurate modeling of all frequency components via a multistep refinement process. We validate PDE-Refiner on challenging benchmarks of complex fluid dynamics, demonstrating stable and accurate rollouts that consistently outperform state-of-the-art models, including neural, numerical, and hybrid neural-numerical architectures. We further demonstrate that PDE-Refiner greatly enhances data efficiency, since the denoising objective implicitly induces a novel form of spectral data augmentation. Finally, PDE-Refiner's connection to diffusion models enables an accurate and efficient assessment of the model's predictive uncertainty, allowing us to estimate when the surrogate becomes inaccurate.

  • 5 authors
·
Aug 10, 2023

AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning

The atmosphere affects humans in a multitude of ways, from loss of life due to adverse weather effects to long-term social and economic impacts on societies. Computer simulations of atmospheric dynamics are, therefore, of great importance for the well-being of our and future generations. Here, we propose AtmoRep, a novel, task-independent stochastic computer model of atmospheric dynamics that can provide skillful results for a wide range of applications. AtmoRep uses large-scale representation learning from artificial intelligence to determine a general description of the highly complex, stochastic dynamics of the atmosphere from the best available estimate of the system's historical trajectory as constrained by observations. This is enabled by a novel self-supervised learning objective and a unique ensemble that samples from the stochastic model with a variability informed by the one in the historical record. The task-independent nature of AtmoRep enables skillful results for a diverse set of applications without specifically training for them and we demonstrate this for nowcasting, temporal interpolation, model correction, and counterfactuals. We also show that AtmoRep can be improved with additional data, for example radar observations, and that it can be extended to tasks such as downscaling. Our work establishes that large-scale neural networks can provide skillful, task-independent models of atmospheric dynamics. With this, they provide a novel means to make the large record of atmospheric observations accessible for applications and for scientific inquiry, complementing existing simulations based on first principles.

  • 6 authors
·
Aug 25, 2023

HealDA: Highlighting the importance of initial errors in end-to-end AI weather forecasts

AI weather models now rival leading numerical weather prediction (NWP) systems in medium-range skill. However, almost all still rely on NWP data assimilation (DA) to provide initial conditions, tying them to expensive infrastructure and limiting the practical speed and accuracy gains of ML. More recently, ML-based DA systems have been proposed, which are often trained and evaluated end-to-end with a forecast model, making it difficult to assess the quality of their analysis fields. We introduce HealDA, a global ML-based DA system that maps a short window of satellite and conventional observations directly to a 1° atmospheric state on the HEALPix grid, using a smaller sensor suite than operational NWP and no background forecast at runtime. We treat HealDA strictly as a DA module: its analyses are used to initialize off-the-shelf ML forecast models without any fine-tuning of either. For a variety of off-the-shelf ML forecast models, including FourCastNet3 (FCN3), Aurora, and FengWu, HealDA-initialized forecasts lose less than one day of effective lead time when scored against ERA5. HealDA-initialized FCN3 ensembles similarly trail those of the ECMWF IFS ENS system by < 24 h. We find that forecast error growth in these models i unchanged from HealDA initialization, and the skill gap primarily arises from the larger initial error of the HealDA analysis. Spectral analysis reveals that this stems from overfitting to the large scales and upper-tropospheric fields. We also demonstrate that small changes in the verification setup can shift apparent skill by 12--24h, underscoring the need for consistent scoring. Taken together, these results clarify the current performance of ML-based DA systems and show that a relatively simple, background-free network can already provide initial conditions that are usable by state-of-the-art ML forecast models with only modest loss in medium-range skill.

  • 9 authors
·
Jan 24

Aardvark weather: end-to-end data-driven weather forecasting

Weather forecasting is critical for a range of human activities including transportation, agriculture, industry, as well as the safety of the general public. Machine learning models have the potential to transform the complex weather prediction pipeline, but current approaches still rely on numerical weather prediction (NWP) systems, limiting forecast speed and accuracy. Here we demonstrate that a machine learning model can replace the entire operational NWP pipeline. Aardvark Weather, an end-to-end data-driven weather prediction system, ingests raw observations and outputs global gridded forecasts and local station forecasts. Further, it can be optimised end-to-end to maximise performance over quantities of interest. Global forecasts outperform an operational NWP baseline for multiple variables and lead times. Local station forecasts are skillful up to ten days lead time and achieve comparable and often lower errors than a post-processed global NWP baseline and a state-of-the-art end-to-end forecasting system with input from human forecasters. These forecasts are produced with a remarkably simple neural process model using just 8% of the input data and three orders of magnitude less compute than existing NWP and hybrid AI-NWP methods. We anticipate that Aardvark Weather will be the starting point for a new generation of end-to-end machine learning models for medium-range forecasting that will reduce computational costs by orders of magnitude and enable the rapid and cheap creation of bespoke models for users in a variety of fields, including for the developing world where state-of-the-art local models are not currently available.

  • 11 authors
·
Mar 30, 2024

Turbulence modulation in liquid-liquid two-phase Taylor-Couette turbulence

We investigate the coupling effects of the two-phase interface, viscosity ratio, and density ratio of the dispersed phase to the continuous phase on the flow statistics in two-phase Taylor-Couette turbulence at a system Reynolds number of 6000 and a system Weber number of 10 using interface-resolved three-dimensional direct numerical simulations with the volume-of-fluid method. Our study focuses on four different scenarios: neutral droplets, low-viscosity droplets, light droplets, and low-viscosity light droplets. We find that neutral droplets and low-viscosity droplets primarily contribute to drag enhancement through the two-phase interface, while light droplets reduce the system's drag by explicitly reducing Reynolds stress due to the density dependence of Reynolds stress. Additionally, low-viscosity light droplets contribute to greater drag reduction by further reducing momentum transport near the inner cylinder and implicitly reducing Reynolds stress. While interfacial tension enhances turbulent kinetic energy (TKE) transport, drag enhancement is not strongly correlated with TKE transport for both neutral droplets and low-viscosity droplets. Light droplets primarily reduce the production term by diminishing Reynolds stress, whereas the density contrast between the phases boosts TKE transport near the inner wall. Therefore, the reduction in the dissipation rate is predominantly attributed to decreased turbulence production, causing drag reduction. For low-viscosity light droplets, the production term diminishes further, primarily due to their greater reduction in Reynolds stress, while reduced viscosity weakens the density difference's contribution to TKE transport near the inner cylinder, resulting in a more pronounced reduction in the dissipation rate and consequently stronger drag reduction. Our findings provide new insights into the turbulence modulation in two-phase flow.

  • 6 authors
·
Jul 1, 2024

Radiation-magnetohydrodynamics with MPI-AMRVAC using flux-limited diffusion

Context. Radiation plays a significant role in solar and astrophysical environments as it may constitute a sizeable fraction of the energy density, momentum flux, and the total pressure. Modelling the dynamic interaction between radiation and magnetized plasmas in such environments is an intricate and computationally costly task. Aims. The goal of this work is to demonstrate the capabilities of the open-source parallel, block-adaptive computational framework MPI-AMRVAC, in solving equations of radiation-magnetohydrodynamics (RMHD), and to present benchmark test cases relevant for radiation-dominated magnetized plasmas. Methods. The existing magnetohydrodynamics (MHD) and flux-limited diffusion (FLD) radiative-hydrodynamics physics modules are combined to solve the equations of radiation-magnetohydrodynamics (RMHD) on block-adaptive finite volume Cartesian meshes in any dimensionality. Results. We introduce and validate several benchmark test cases such as steady radiative MHD shocks, radiation-damped linear MHD waves, radiation-modified Riemann problems and a multi-dimensional radiative magnetoconvection case. We recall the basic governing Rankine-Hugoniot relations for shocks and the dispersion relation for linear MHD waves in the presence of optically thick radiation fields where the diffusion limit is reached. The RMHD system allows for 8 linear wave types, where the classical 7-wave MHD picture (entropy and three wave pairs for slow, Alfven and fast) is augmented with a radiative diffusion mode. Conclusions. The MPI-AMRVAC code now has the capability to perform multidimensional RMHD simulations with mesh adaptation making it well-suited for larger scientific applications to study magnetized matter-radiation interactions in solar and stellar interiors and atmospheres.

  • 5 authors
·
Mar 4, 2025

Training Transformers for Mesh-Based Simulations

Simulating physics using Graph Neural Networks (GNNs) is predominantly driven by message-passing architectures, which face challenges in scaling and efficiency, particularly in handling large, complex meshes. These architectures have inspired numerous enhancements, including multigrid approaches and K-hop aggregation (using neighbours of distance K), yet they often introduce significant complexity and suffer from limited in-depth investigations. In response to these challenges, we propose a novel Graph Transformer architecture that leverages the adjacency matrix as an attention mask. The proposed approach incorporates innovative augmentations, including Dilated Sliding Windows and Global Attention, to extend receptive fields without sacrificing computational efficiency. Through extensive experimentation, we evaluate model size, adjacency matrix augmentations, positional encoding and K-hop configurations using challenging 3D computational fluid dynamics (CFD) datasets. We also train over 60 models to find a scaling law between training FLOPs and parameters. The introduced models demonstrate remarkable scalability, performing on meshes with up to 300k nodes and 3 million edges. Notably, the smallest model achieves parity with MeshGraphNet while being 7times faster and 6times smaller. The largest model surpasses the previous state-of-the-art by 38.8\% on average and outperforms MeshGraphNet by 52\% on the all-rollout RMSE, while having a similar training speed. Code and datasets are available at https://github.com/DonsetPG/graph-physics.

  • 4 authors
·
Aug 25, 2025

An Old-Fashioned Framework for Machine Learning in Turbulence Modeling

The objective is to provide clear and well-motivated guidance to Machine Learning (ML) teams, founded on our experience in empirical turbulence modeling. Guidance is also needed for modeling outside ML. ML is not yet successful in turbulence modeling, and many papers have produced unusable proposals either due to errors in math or physics, or to severe overfitting. We believe that "Turbulence Culture" (TC) takes years to learn and is difficult to convey especially considering the modern lack of time for careful study; important facts which are self-evident after a career in turbulence research and modeling and extensive reading are easy to miss. In addition, many of them are not absolute facts, a consequence of the gaps in our understanding of turbulence and the weak connection of models to first principles. Some of the mathematical facts are rigorous, but the physical aspects often are not. Turbulence models are surprisingly arbitrary. Disagreement between experts confuses the new entrants. In addition, several key properties of the models are ascertained through non-trivial analytical properties of the differential equations, which puts them out of reach of purely data-driven ML-type approaches. The best example is the crucial behavior of the model at the edge of the turbulent region (ETR). The knowledge we wish to put out here may be divided into "Mission" and "Requirements," each combining physics and mathematics. Clear lists of "Hard" and "Soft" constraints are presented. A concrete example of how DNS data could be used, possibly allied with ML, is first carried through and illustrates the large number of decisions needed. Our focus is on creating effective products which will empower CFD, rather than on publications.

  • 1 authors
·
Aug 1, 2023

Climate Modelling in Low-Precision: Effects of both Deterministic & Stochastic Rounding

Motivated by recent advances in operational weather forecasting, we study the efficacy of low-precision arithmetic for climate simulations. We develop a framework to measure rounding error in a climate model which provides a stress-test for a low-precision version of the model, and we apply our method to a variety of models including the Lorenz system; a shallow water approximation for flow over a ridge; and a coarse resolution global atmospheric model with simplified parameterisations (SPEEDY). Although double precision (52 significant bits) is standard across operational climate models, in our experiments we find that single precision (23 sbits) is more than enough and that as low as half precision (10 sbits) is often sufficient. For example, SPEEDY can be run with 12 sbits across the entire code with negligible rounding error and this can be lowered to 10 sbits if very minor errors are accepted, amounting to less than 0.1 mm/6hr for the average grid-point precipitation, for example. Our test is based on the Wasserstein metric and this provides stringent non-parametric bounds on rounding error accounting for annual means as well as extreme weather events. In addition, by testing models using both round-to-nearest (RN) and stochastic rounding (SR) we find that SR can mitigate rounding error across a range of applications. Thus our results also provide evidence that SR could be relevant to next-generation climate models. While many studies have shown that low-precision arithmetic can be suitable on short-term weather forecasting timescales, our results give the first evidence that a similar low precision level can be suitable for climate.

  • 5 authors
·
Apr 30, 2021

Balancing Computational Efficiency and Forecast Error in Machine Learning-based Time-Series Forecasting: Insights from Live Experiments on Meteorological Nowcasting

Machine learning for time-series forecasting remains a key area of research. Despite successful application of many machine learning techniques, relating computational efficiency to forecast error remains an under-explored domain. This paper addresses this topic through a series of real-time experiments to quantify the relationship between computational cost and forecast error using meteorological nowcasting as an example use-case. We employ a variety of popular regression techniques (XGBoost, FC-MLP, Transformer, and LSTM) for multi-horizon, short-term forecasting of three variables (temperature, wind speed, and cloud cover) for multiple locations. During a 5-day live experiment, 4000 data sources were streamed for training and inferencing 144 models per hour. These models were parameterized to explore forecast error for two computational cost minimization methods: a novel auto-adaptive data reduction technique (Variance Horizon) and a performance-based concept drift-detection mechanism. Forecast error of all model variations were benchmarked in real-time against a state-of-the-art numerical weather prediction model. Performance was assessed using classical and novel evaluation metrics. Results indicate that using the Variance Horizon reduced computational usage by more than 50\%, while increasing between 0-15\% in error. Meanwhile, performance-based retraining reduced computational usage by up to 90\% while also improving forecast error by up to 10\%. Finally, the combination of both the Variance Horizon and performance-based retraining outperformed other model configurations by up to 99.7\% when considering error normalized to computational usage.

  • 5 authors
·
Sep 26, 2023

FISC: A Fluid-Inspired Framework for Decentralized and Scalable Swarm Control

Achieving scalable coordination in large robotic swarms is often constrained by reliance on inter-agent communication, which introduces latency, bandwidth limitations, and vulnerability to failure. To address this gap, a decentralized approach for outer-loop control of large multi-agent systems based on the paradigm of how a fluid moves through a volume is proposed and evaluated. A relationship between fundamental fluidic element properties and individual robotic agent states is developed such that the corresponding swarm "flows" through a space, akin to a fluid when forced via a pressure boundary condition. By ascribing fluid-like properties to subsets of agents, the swarm evolves collectively while maintaining desirable structure and coherence without explicit communication of agent states within or outside of the swarm. The approach is evaluated using simulations involving O(10^3) quadcopter agents and compared against Computational Fluid Dynamics (CFD) solutions for a converging-diverging domain. Quantitative agreement between swarm-derived and CFD fields is assessed using Root-Mean-Square Error (RMSE), yielding normalized errors of 0.15-0.9 for velocity, 0.61-0.98 for density, 0-0.937 for pressure. These results demonstrate the feasibility of treating large robotic swarms as continuum systems that retain the macroscopic structure derived from first principles, providing a basis for scalable and decentralized control.

  • 3 authors
·
Jan 30