---
license: mit
pipeline_tag: graph-ml
---
<h1 align="center" style="font-size: 24px;">EquiformerV3:<br>Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers</h1>
<p align="center">
<a href="https://github.com/atomicarchitects/equiformer_v3" style="color: #1a73e8; font-weight: bold; font-size: 20px;">Code</a> |
<a href="https://arxiv.org/abs/2604.09130" style="color: #1a73e8; font-weight: bold; font-size: 20px;">Paper</a>
</p>
This repository contains the checkpoints for **EquiformerV3**, the third generation of the SE(3)-equivariant graph attention Transformer.
EquiformerV3 is designed to advance efficiency, expressivity, and generality in 3D atomistic modeling.
Building on [EquiformerV2](https://arxiv.org/abs/2306.12059), this version introduces (1) software optimizations,
(2) simple yet effective architectural modifications, such as equivariant merged layer normalization and attention with a smooth cutoff, and
(3) SwiGLU-S^2 activations, which incorporate many-body interactions while preserving strict equivariance.
EquiformerV3 achieves state-of-the-art results on benchmarks including OC20, OMat24, and Matbench Discovery.
Please refer to the [official GitHub repository](https://github.com/atomicarchitects/equiformer_v3) for detailed instructions on environment setup and usage.
<p align="center">
<img width="50%" height="50%" src="https://cdn-uploads.huggingface.co/production/uploads/64948a4a8d5ff0dd776655fe/03TPndezDyUw4FcfTBk4n.png">
</p>
## Checkpoints
### MPtrj
<table>
<tr style="background-color: #f0f0f0;">
<td><strong>Model</strong></td>
<td><strong>Training data</strong></td>
<td><strong>Checkpoint</strong></td>
</tr>
<tr>
<td>EquiformerV3</td>
<td>MPtrj</td>
<td>
<a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/mptrj_gradient.pt">
mptrj_gradient.pt
</a>
</td>
</tr>
</table>
### OMat24 → MPtrj and sAlex
Training consists of (1) direct pre-training on OMat24, (2) gradient fine-tuning on OMat24 initialized from (1), and (3) gradient fine-tuning on MPtrj and sAlex initialized from (2).
<table>
<tr style="background-color: #f0f0f0;">
<td><strong>Model</strong></td>
<td><strong>Training data</strong></td>
<td><strong>Config</strong></td>
<td><strong>Checkpoint</strong></td>
</tr>
<tr>
<td>EquiformerV3 (direct pre-training)</td>
<td>OMat24</td>
<td>
<a href="https://github.com/atomicarchitects/equiformer_v3/blob/main/experimental/configs/omat24/omat24/experiments/direct/equiformer_v3_N%407_L%404_attn-hidden%4032_rbf%4064_max-neighbors%40300_attn-grid%4014-8_ffn-grid%4014_use-gate-force-head_merge-layer-norm_epochs%404-bs%40512-wd%401e-3-beta2%400.98-eps%401e-6_dens-p%400.5-std%400.025-r%400.5-0.75-w%401-no-stress-max-f%402.5_no-amp.yml">
omat24_direct.yml
</a>
</td>
<td>
<a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/omat24_direct.pt">
omat24_direct.pt
</a>
</td>
</tr>
<tr>
<td>EquiformerV3 (gradient fine-tuning)</td>
<td>OMat24</td>
<td>
<a href="https://github.com/atomicarchitects/equiformer_v3/blob/main/experimental/configs/omat24/omat24/experiments/gradient/equiformer_v3_grad-finetune_N%407_L%404_attn-hidden%4032_rbf%4064_max-neighbors%40300_attn-grid%4014-8_ffn-grid%4014_merge-layer-norm_lr%400-1e-4-epochs%402-bs%40512-wd%401e-3-beta2%400.98-eps%401e-6_pt-reg-dens-ft-no-reg.yml">
omat24_gradient.yml
</a>
</td>
<td>
<a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/omat24_gradient.pt">
omat24_gradient.pt
</a>
</td>
</tr>
<tr>
<td>EquiformerV3 (gradient fine-tuning)</td>
<td>MPtrj and sAlex</td>
<td>
<a href="https://github.com/atomicarchitects/equiformer_v3/blob/main/experimental/configs/omat24/salex_mptrj/experiments/gradient/equiformer_v3_grad-finetune_N%407_L%404_attn-hidden%4032_rbf%4064_max-neighbors%40300_attn-grid%4014-8_ffn-grid%4014_attn-eps%401e-8_lr%400-5e-5-warmup%400.1-epochs%402-mptrj-salex-ratio%408-bs%40256-wd%401e-3-beta2%400.98-eps%401e-6_pt-reg-dens-ft-no-reg-lr%401e-4.yml">
mptrj-salex_gradient.yml
</a>
</td>
<td>
<a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/omat24-mptrj-salex_gradient.pt">
omat24-mptrj-salex_gradient.pt
</a>
</td>
</tr>
</table>
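The checkpoint files listed above can be fetched programmatically. A minimal sketch, assuming `huggingface_hub` is installed; the repo id and file paths come from the links in the tables, while the helper name and the short keys are illustrative:

```python
# Sketch: download an EquiformerV3 checkpoint from this Hugging Face repo.
# The matching model definition lives in the GitHub repository linked above.

REPO_ID = "yilunliao/equiformer_v3"

# Checkpoint files from the tables above, keyed by a short illustrative name.
CHECKPOINTS = {
    "mptrj": "checkpoint/mptrj_gradient.pt",
    "omat24_direct": "checkpoint/omat24_direct.pt",
    "omat24_gradient": "checkpoint/omat24_gradient.pt",
    "mptrj_salex_gradient": "checkpoint/omat24-mptrj-salex_gradient.pt",
}

def download_checkpoint(name: str) -> str:
    """Download one of the checkpoints above and return its local path."""
    # Imported lazily so the mapping above is usable without the library.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    return hf_hub_download(repo_id=REPO_ID, filename=CHECKPOINTS[name])
```

For example, `download_checkpoint("mptrj")` caches `mptrj_gradient.pt` locally and returns its path; the resulting state dict can then be loaded with `torch.load(path, map_location="cpu")`.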
## Citation
If you find this work helpful, please consider citing:
```bibtex
@article{equiformer_v3,
    title={EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers},
    author={Yi-Lun Liao and Alexander J. Hoffman and Sabrina C. Shen and Alexandre Duval and Sam Walton Norwood and Tess Smidt},
    journal={arXiv preprint arXiv:2604.09130},
    year={2026}
}
```