---
license: mit
pipeline_tag: graph-ml
---

<h1 align="center" style="font-size: 24px;">EquiformerV3:<br>Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers</h1> 

<p align="center">
<a href="https://github.com/atomicarchitects/equiformer_v3" style="color: #1a73e8; font-weight: bold; font-size: 20px;">Code</a> | 
<a href="https://arxiv.org/abs/2604.09130" style="color: #1a73e8; font-weight: bold; font-size: 20px;">Paper</a>
</p>

This repository contains the checkpoints for **EquiformerV3**, the third generation of the SE(3)-equivariant graph attention Transformer. 
EquiformerV3 is designed to advance efficiency, expressivity, and generality in 3D atomistic modeling.
Building on [EquiformerV2](https://arxiv.org/abs/2306.12059), this version introduces (1) software optimizations, 
(2) simple and effective modifications like equivariant merged layer normalization and attention with smooth cutoff, and 
(3) SwiGLU-S^2 activations, which incorporate many-body interactions and preserve strict equivariance. 
EquiformerV3 achieves state-of-the-art results on benchmarks including OC20, OMat24, and Matbench Discovery.

Please refer to the [official GitHub repository](https://github.com/atomicarchitects/equiformer_v3) for detailed instructions on environment setup and usage.

<p align="center">
  <img width="50%" height="50%" src="https://cdn-uploads.huggingface.co/production/uploads/64948a4a8d5ff0dd776655fe/03TPndezDyUw4FcfTBk4n.png">
</p>

## Checkpoints

### MPtrj
<table>
  <tr style="background-color: #f0f0f0;">
   <td><strong>Model</strong></td>
   <td><strong>Training data</strong></td>
   <td><strong>Checkpoint</strong></td>
  </tr>
  <tr>
   <td>EquiformerV3</td>
   <td>MPtrj</td>
   <td>
     <a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/mptrj_gradient.pt">
       mptrj_gradient.pt
     </a>
   </td>
  </tr>
</table>

### OMat24 → MPtrj and sAlex
Training consists of (1) direct pre-training on OMat24, (2) gradient fine-tuning on OMat24 initialized from (1), and (3) gradient fine-tuning on MPtrj and sAlex initialized from (2).

<table>
  <tr style="background-color: #f0f0f0;">
   <td><strong>Model</strong></td>
   <td><strong>Training data</strong></td>
   <td><strong>Config</strong></td>
   <td><strong>Checkpoint</strong></td>
  </tr>
  <tr>
   <td>EquiformerV3 (direct pre-training)</td>
   <td>OMat24</td>
   <td>
     <a href="https://github.com/atomicarchitects/equiformer_v3/blob/main/experimental/configs/omat24/omat24/experiments/direct/equiformer_v3_N%407_L%404_attn-hidden%4032_rbf%4064_max-neighbors%40300_attn-grid%4014-8_ffn-grid%4014_use-gate-force-head_merge-layer-norm_epochs%404-bs%40512-wd%401e-3-beta2%400.98-eps%401e-6_dens-p%400.5-std%400.025-r%400.5-0.75-w%401-no-stress-max-f%402.5_no-amp.yml">
       omat24_direct.yml
     </a>
   </td>
   <td>
     <a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/omat24_direct.pt">
       omat24_direct.pt
     </a>
   </td>
  </tr>
  <tr>
   <td>EquiformerV3 (gradient fine-tuning)</td>
   <td>OMat24</td>
   <td>
     <a href="https://github.com/atomicarchitects/equiformer_v3/blob/main/experimental/configs/omat24/omat24/experiments/gradient/equiformer_v3_grad-finetune_N%407_L%404_attn-hidden%4032_rbf%4064_max-neighbors%40300_attn-grid%4014-8_ffn-grid%4014_merge-layer-norm_lr%400-1e-4-epochs%402-bs%40512-wd%401e-3-beta2%400.98-eps%401e-6_pt-reg-dens-ft-no-reg.yml">
       omat24_gradient.yml
     </a>
   </td>
   <td>
     <a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/omat24_gradient.pt">
       omat24_gradient.pt
     </a>
   </td>
  </tr>
  <tr>
   <td>EquiformerV3 (gradient fine-tuning)</td>
   <td>MPtrj and sAlex</td>
   <td>
     <a href="https://github.com/atomicarchitects/equiformer_v3/blob/main/experimental/configs/omat24/salex_mptrj/experiments/gradient/equiformer_v3_grad-finetune_N%407_L%404_attn-hidden%4032_rbf%4064_max-neighbors%40300_attn-grid%4014-8_ffn-grid%4014_attn-eps%401e-8_lr%400-5e-5-warmup%400.1-epochs%402-mptrj-salex-ratio%408-bs%40256-wd%401e-3-beta2%400.98-eps%401e-6_pt-reg-dens-ft-no-reg-lr%401e-4.yml">
       mptrj-salex_gradient.yml
     </a>
   </td>
   <td>
     <a href="https://huggingface.co/yilunliao/equiformer_v3/blob/main/checkpoint/omat24-mptrj-salex_gradient.pt">
       omat24-mptrj-salex_gradient.pt
     </a>
   </td>
  </tr>
</table>
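As a convenience, the checkpoint files linked in the tables above can be resolved programmatically. The sketch below (not part of the official repository; the file list is copied from this card) uses `huggingface_hub.hf_hub_url` to build direct download URLs; swapping in `hf_hub_download` with the same `repo_id`/`filename` arguments would fetch a file to the local cache instead.

```python
# Sketch: resolve the checkpoints listed above to direct download URLs.
# The repo_id and filenames come from the links in this model card;
# use hf_hub_download(...) with the same arguments to download a file.
from huggingface_hub import hf_hub_url

REPO_ID = "yilunliao/equiformer_v3"

CHECKPOINTS = [
    "checkpoint/mptrj_gradient.pt",
    "checkpoint/omat24_direct.pt",
    "checkpoint/omat24_gradient.pt",
    "checkpoint/omat24-mptrj-salex_gradient.pt",
]

for filename in CHECKPOINTS:
    print(hf_hub_url(repo_id=REPO_ID, filename=filename))
```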

## Citation

If you find this work helpful, please consider citing it:

```bibtex
@article{equiformer_v3,
    title={EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers}, 
    author={Yi-Lun Liao and Alexander J. Hoffman and Sabrina C. Shen and Alexandre Duval and Sam Walton Norwood and Tess Smidt},
    journal={arXiv preprint arXiv:2604.09130},
    year={2026}
}
```