arxiv:2509.10872

Reactive Chemistry at Unrestricted Coupled Cluster Level: High-throughput Calculations for Training Machine Learning Potentials

Published on Sep 13, 2025

Abstract

Unrestricted coupled cluster calculations enabled the creation of a large dataset for training machine learning interatomic potentials, demonstrating improved force accuracy and activation energy reproduction compared to traditional DFT-based approaches.

Accurately modeling chemical reactions at the atomistic level requires high-level electronic structure theory because unpaired electrons are present and the energetics of bond breaking and formation must be described properly. Commonly used approaches such as Density Functional Theory (DFT) frequently fail at this task because of well-recognized deficiencies. For high-fidelity approaches, however, creating large datasets of energies and forces for reactive processes to train machine learning interatomic potentials or force fields is daunting. For example, the unrestricted coupled cluster level of theory has previously been seen as infeasible because of its high computational cost, the lack of analytical gradients in many computational codes, and additional challenges such as constructing suitable basis set corrections for forces. In this work, we develop new methods and workflows to overcome the challenges inherent in automating unrestricted coupled cluster calculations. Using these advancements, we create a dataset of gas-phase reactions containing energies and forces for 3119 distinct configurations of organic molecules, calculated at the gold-standard level of unrestricted CCSD(T) (coupled cluster with singles, doubles, and perturbative triples). With this dataset, we analyze the differences between the density functional and unrestricted CCSD(T) descriptions. We develop a transferable machine learning interatomic potential for gas-phase reactions, trained on unrestricted CCSD(T) data, and demonstrate the advantages of transitioning away from DFT data. Switching the training data from DFT to UCCSD(T) yields an improvement of more than 0.1 eV/Å in force accuracy and over 0.1 eV in activation energy reproduction.
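The abstract does not say how force accuracy and activation energy reproduction are measured. As a minimal sketch, assuming a mean absolute error over force components and a barrier defined as the transition-state energy minus the reactant energy, the two quantities could be computed as below. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def force_mae(pred, ref):
    """Mean absolute error over all Cartesian force components (eV/Å).

    pred and ref are lists of per-atom [fx, fy, fz] lists.
    Assumption: the paper may use a different metric (e.g. RMSE).
    """
    diffs = [
        abs(p - r)
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    ]
    return sum(diffs) / len(diffs)


def activation_energy(e_reactant, e_transition_state):
    """Barrier height in eV: transition-state energy minus reactant energy."""
    return e_transition_state - e_reactant


# Made-up reference (e.g. UCCSD(T)) and model-predicted forces for two atoms.
ref_forces = [[0.10, -0.20, 0.00], [-0.10, 0.20, 0.00]]
pred_forces = [[0.13, -0.18, 0.01], [-0.12, 0.23, -0.01]]
mae = force_mae(pred_forces, ref_forces)

# Made-up reactant and transition-state energies (eV).
barrier_ref = activation_energy(-100.00, -99.20)
barrier_pred = activation_energy(-100.05, -99.15)
barrier_error = abs(barrier_pred - barrier_ref)

print(f"force MAE: {mae:.3f} eV/Å")
print(f"activation energy error: {barrier_error:.3f} eV")
```

Under these assumptions, an improvement of 0.1 eV/Å in force accuracy would mean the value returned by `force_mae` drops by that amount when the model is trained on UCCSD(T) instead of DFT data and evaluated against the same reference.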
