---
license: bsd-3-clause
language:
- en
tags:
- pytorch
- materials-science
- crystallography
- x-ray-diffraction
- pxrd
- convnext
- arxiv:2603.23367
datasets:
- materials-project
metrics:
- accuracy
- mae
pipeline_tag: other
---
# Open AlphaDiffract
[arXiv](https://arxiv.org/abs/2603.23367) | [GitHub](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract)
**Automated crystallographic analysis of powder X-ray diffraction data.**
AlphaDiffract is a multi-task 1D ConvNeXt model that takes a powder X-ray diffraction (PXRD) pattern and simultaneously predicts:
| Output | Description |
|---|---|
| **Crystal system** | 7-class classification (Triclinic → Cubic) |
| **Space group** | 230-class classification |
| **Lattice parameters** | 6 values: a, b, c (Å), α, β, γ (°) |
This release contains a **single model** trained exclusively on
[Materials Project](https://next-gen.materialsproject.org/) structures
(publicly available data). It is *not* the 10-model ensemble reported in
the paper; see [Performance](#performance) for details.
## Quick Start
```bash
pip install torch safetensors
huggingface-cli download linked-liszt/OpenAlphaDiffract --local-dir OpenAlphaDiffract
```
```python
import torch
import numpy as np
from OpenAlphaDiffract.model import AlphaDiffract
model = AlphaDiffract.from_pretrained("OpenAlphaDiffract", device="cpu")
# 8192-point intensity pattern, normalized to [0, 100]
pattern = np.load("my_pattern.npy").astype(np.float32)
pattern = (pattern - pattern.min()) / (pattern.max() - pattern.min() + 1e-10) * 100.0
x = torch.from_numpy(pattern).unsqueeze(0)
with torch.no_grad():
    out = model(x)
cs_probs = torch.softmax(out["cs_logits"], dim=-1)
sg_probs = torch.softmax(out["sg_logits"], dim=-1)
lp = out["lp"] # [a, b, c, alpha, beta, gamma]
print("Crystal system:", AlphaDiffract.CRYSTAL_SYSTEMS[cs_probs.argmax().item()])
print("Space group: #", sg_probs.argmax().item() + 1)
print("Lattice params:", lp[0].tolist())
```
See `example_inference.py` for a complete runnable example.
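A single argmax hides how close the runner-up space groups are. The sketch below ranks the top-k space groups from a logit vector and labels each with the crystal system implied by its International Tables number, which can be compared against the crystal-system head. It assumes, as the Quick Start does, that class index `i` corresponds to space group `i + 1`. The helper names `crystal_system_of` and `top_k_space_groups` are hypothetical and not part of `model.py`.

```python
import numpy as np

# Upper bound of each crystal system's space-group range (International Tables numbering).
_SYSTEM_BOUNDS = [
    (2, "Triclinic"), (15, "Monoclinic"), (74, "Orthorhombic"),
    (142, "Tetragonal"), (167, "Trigonal"), (194, "Hexagonal"), (230, "Cubic"),
]

def crystal_system_of(space_group: int) -> str:
    """Return the crystal system that contains a space-group number (1..230)."""
    if not 1 <= space_group <= 230:
        raise ValueError(f"space group must be in 1..230, got {space_group}")
    for upper, name in _SYSTEM_BOUNDS:
        if space_group <= upper:
            return name

def top_k_space_groups(sg_logits, k=5):
    """Rank space groups by softmax probability.

    Returns a list of (space_group_number, probability, crystal_system),
    assuming class index i maps to space group i + 1.
    """
    logits = np.asarray(sg_logits, dtype=np.float64)
    shifted = np.exp(logits - logits.max())  # subtract the max for numerical stability
    probs = shifted / shifted.sum()
    order = np.argsort(probs)[::-1][:k]
    return [(int(i) + 1, float(probs[i]), crystal_system_of(int(i) + 1)) for i in order]
```

With the Quick Start output, a call could look like `top_k_space_groups(out["sg_logits"][0].numpy(), k=5)`. A top-ranked space group whose implied crystal system disagrees with the crystal-system head is a reasonable signal to inspect the prediction more closely.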
## Files
| File | Description |
|---|---|
| `model.safetensors` | Model weights (safetensors format, ~35 MB) |
| `model.py` | Standalone model definition (pure PyTorch, no Lightning) |
| `config.json` | Architecture and training hyperparameters |
| `maxsub.json` | Space-group subgroup graph (230×230, used as a registered buffer) |
| `example_inference.py` | End-to-end inference example |
| `LICENSE` | BSD 3-Clause |
## Input Format
- **Length:** 8192 equally-spaced intensity values
- **2θ range:** 5–20° (monochromatic, 20 keV)
- **Preprocessing:** floor negatives at zero, then rescale to [0, 100]
- **Shape:** `(batch, 8192)` or `(batch, 1, 8192)`
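Measured patterns rarely arrive on this exact grid, so a small preprocessing helper can be useful. The following is a minimal sketch, not the repository's pipeline. It assumes the 8192 points span 5° to 20° inclusive, that linear interpolation is acceptable, and that "rescale" means the min-max scaling used in the Quick Start. The optional wavelength argument remaps 2θ through Bragg's law so that data taken at another wavelength (for example Cu Kα1, 1.5406 Å) lands where the same d-spacings would appear at 20 keV. The function names are hypothetical.

```python
import numpy as np

N_POINTS = 8192
TWO_THETA_MIN, TWO_THETA_MAX = 5.0, 20.0  # degrees, from the list above
MODEL_ENERGY_KEV = 20.0
HC_KEV_ANGSTROM = 12.398419843            # h*c in keV*angstrom

def to_model_two_theta(two_theta_deg, source_wavelength_angstrom):
    """Map 2-theta measured at one wavelength to the 2-theta at 20 keV (same d-spacing)."""
    target_wavelength = HC_KEV_ANGSTROM / MODEL_ENERGY_KEV  # about 0.62 angstrom
    theta = np.radians(np.asarray(two_theta_deg, dtype=np.float64) / 2.0)
    sin_target = np.sin(theta) * target_wavelength / source_wavelength_angstrom
    if np.any(sin_target > 1.0):
        raise ValueError("some reflections are not reachable at 20 keV")
    return 2.0 * np.degrees(np.arcsin(sin_target))

def prepare_pattern(two_theta_deg, intensity, source_wavelength_angstrom=None):
    """Resample to the model grid, floor negatives at zero, rescale to [0, 100].

    two_theta_deg must be strictly increasing (required by np.interp).
    Points outside the measured range are filled with zero (an assumption).
    """
    two_theta = np.asarray(two_theta_deg, dtype=np.float64)
    y = np.asarray(intensity, dtype=np.float64)
    if source_wavelength_angstrom is not None:
        two_theta = to_model_two_theta(two_theta, source_wavelength_angstrom)
    grid = np.linspace(TWO_THETA_MIN, TWO_THETA_MAX, N_POINTS)
    y = np.interp(grid, two_theta, y, left=0.0, right=0.0)
    y = np.clip(y, 0.0, None)                                # floor negatives at zero
    y = (y - y.min()) / (y.max() - y.min() + 1e-10) * 100.0  # min-max to [0, 100]
    return y.astype(np.float32)
```

The returned array has shape `(8192,)`; adding a batch dimension with `torch.from_numpy(pattern).unsqueeze(0)` gives the `(batch, 8192)` shape listed above.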
## Architecture
1D ConvNeXt backbone adapted from [Liu et al. (2022)](https://arxiv.org/abs/2201.03545):
```
Input (8192) → [ConvNeXt Block × 3 with AvgPool] → Flatten (560-d)
├─ CS head: MLP 560→2300→1150→7 (crystal system)
├─ SG head: MLP 560→2300→1150→230 (space group)
└─ LP head: MLP 560→512→256→6 (lattice parameters, sigmoid-bounded)
```
- **Parameters:** 8,734,989
- **Activation:** GELU
- **Stochastic depth:** 0.3
- **Head dropout:** 0.5
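The layer widths in the diagram allow a rough check of where the parameters sit. The sketch below counts weights and biases for each head under the assumption that every arrow is a plain fully connected layer with a bias term and that the heads contain no other learnable layers (such as normalization). `config.json` and `model.py` remain the authoritative description.

```python
def mlp_param_count(widths):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))

# Layer widths copied from the architecture diagram.
HEADS = {
    "crystal system": [560, 2300, 1150, 7],
    "space group": [560, 2300, 1150, 230],
    "lattice parameters": [560, 512, 256, 6],
}
TOTAL_PARAMETERS = 8_734_989  # reported parameter count

head_counts = {name: mlp_param_count(widths) for name, widths in HEADS.items()}
heads_total = sum(head_counts.values())

for name, count in head_counts.items():
    print(f"{name:>20}: {count:,}")
print(f"{'all heads':>20}: {heads_total:,}")
print(f"{'remainder':>20}: {TOTAL_PARAMETERS - heads_total:,}")
```

Under these assumptions the three heads hold 8,565,789 parameters, roughly 98% of the total, which would leave 169,200 for the convolutional backbone.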
## Performance
This is a **single model** trained on Materials Project data only (no ICSD).
Metrics on the best validation checkpoint (epoch 11):
| Metric | Simulated Val | RRUFF (experimental) |
|---|---|---|
| Crystal system accuracy | 74.88% | 60.35% |
| Space group accuracy | 57.31% | 38.28% |
| Lattice parameter MAE | 2.71 | n/a |
The paper reports higher numbers from a 10-model ensemble trained on
Materials Project + ICSD combined data. This open-weights release covers
only publicly available training data.
## Training Details
| | |
|---|---|
| **Data** | ~146k Materials Project structures, 100 GSAS-II simulations each |
| **Augmentation** | Poisson + Gaussian noise, rescaled to [0, 100] |
| **Optimizer** | AdamW (lr=2e-4, weight_decay=0.01) |
| **Scheduler** | CyclicLR (triangular2, 6-epoch half-cycles) |
| **Loss** | CE (crystal system) + CE + GEMD (space group) + MSE (lattice params) |
| **Hardware** | 1× NVIDIA H100, float32 |
| **Batch size** | 64 |
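To visualize the scheduler row, the sketch below evaluates the standard `triangular2` cyclic policy: the learning rate ramps linearly from a base value to a peak and back, and the amplitude halves after every full cycle. It is written in epoch units with 6-epoch half-cycles as listed in the table. The table gives a single `lr=2e-4` without saying whether it is the base or the peak, so the `BASE_LR` and `MAX_LR` values here are placeholders, not the training configuration.

```python
import math

def triangular2_lr(step, step_size, base_lr, max_lr):
    """Learning rate of the triangular2 cyclic policy at a given step.

    step_size is the length of one half-cycle (ramp up or ramp down).
    """
    cycle = math.floor(1 + step / (2 * step_size))  # 1-based cycle index
    x = abs(step / step_size - 2 * cycle + 1)       # 1 at cycle edges, 0 at the peak
    scale = 1.0 / (2 ** (cycle - 1))                # amplitude halves each cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x) * scale

HALF_CYCLE_EPOCHS = 6  # from the table above
BASE_LR = 2e-5         # placeholder, not stated in the table
MAX_LR = 2e-4          # placeholder: assumes the listed lr is the peak

for epoch in range(0, 25, 3):
    lr = triangular2_lr(epoch, HALF_CYCLE_EPOCHS, BASE_LR, MAX_LR)
    print(f"epoch {epoch:2d}: lr = {lr:.2e}")
```

With these placeholder values the first peak falls at epoch 6, and the second peak at epoch 18 reaches only half the original amplitude above the base rate.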
## Citation
```bibtex
@article{andrejevic2026alphadiffract,
title = {AlphaDiffract: Automated Crystallographic Analysis of Powder X-ray Diffraction Data},
author = {Andrejevic, Nina and Du, Ming and Sharma, Hemant and Horwath, James P. and Luo, Aileen and Yin, Xiangyu and Prince, Michael and Toby, Brian H. and Cherukara, Mathew J.},
year = {2026},
eprint = {2603.23367},
archivePrefix = {arXiv},
primaryClass = {cond-mat.mtrl-sci},
doi = {10.48550/arXiv.2603.23367},
url = {https://arxiv.org/abs/2603.23367}
}
```
## License
BSD 3-Clause. Copyright 2026 UChicago Argonne, LLC.
## Links
- [arXiv: 2603.23367](https://arxiv.org/abs/2603.23367)
- [GitHub: OpenAlphaDiffract](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract)
- [GitHub: AlphaDiffract](https://github.com/AdvancedPhotonSource/AlphaDiffract)