| --- |
| license: bsd-3-clause |
| language: |
| - en |
| tags: |
| - pytorch |
| - materials-science |
| - crystallography |
| - x-ray-diffraction |
| - pxrd |
| - convnext |
| - arxiv:2603.23367 |
| datasets: |
| - materials-project |
| metrics: |
| - accuracy |
| - mae |
| pipeline_tag: other |
| --- |
| |
| # Open AlphaDiffract |
|
|
| [arXiv](https://arxiv.org/abs/2603.23367) | [GitHub](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract) |
|
|
| **Automated crystallographic analysis of powder X-ray diffraction data.** |
|
|
| AlphaDiffract is a multi-task 1D ConvNeXt model that takes a powder X-ray diffraction (PXRD) pattern and simultaneously predicts: |
|
|
| | Output | Description | |
| |---|---| |
| **Crystal system** | 7-class classification (Triclinic to Cubic) | |
| | **Space group** | 230-class classification | |
| **Lattice parameters** | 6 values: a, b, c (Å), α, β, γ (°) | |
|
|
| This release contains a **single model** trained exclusively on |
| [Materials Project](https://next-gen.materialsproject.org/) structures |
| (publicly available data). It is *not* the 10-model ensemble reported in |
| the paper; see [Performance](#performance) for details. |
|
|
| ## Quick Start |
|
|
| ```bash |
| pip install torch safetensors |
| huggingface-cli download linked-liszt/OpenAlphaDiffract --local-dir OpenAlphaDiffract |
| ``` |
|
|
| ```python |
| import torch |
| import numpy as np |
| from OpenAlphaDiffract.model import AlphaDiffract |
| |
| model = AlphaDiffract.from_pretrained("OpenAlphaDiffract", device="cpu") |
| |
| # 8192-point intensity pattern, normalized to [0, 100] |
| pattern = np.load("my_pattern.npy").astype(np.float32) |
| pattern = (pattern - pattern.min()) / (pattern.max() - pattern.min() + 1e-10) * 100.0 |
| x = torch.from_numpy(pattern).unsqueeze(0) |
| |
| with torch.no_grad(): |
|     out = model(x) |
| |
| cs_probs = torch.softmax(out["cs_logits"], dim=-1) |
| sg_probs = torch.softmax(out["sg_logits"], dim=-1) |
| lp = out["lp"] # [a, b, c, alpha, beta, gamma] |
| |
| print("Crystal system:", AlphaDiffract.CRYSTAL_SYSTEMS[cs_probs.argmax().item()]) |
| print("Space group: #", sg_probs.argmax().item() + 1) |
| print("Lattice params:", lp[0].tolist()) |
| ``` |
|
|
| See `example_inference.py` for a complete runnable example. |
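
Because the crystal system and space group come from separate heads, the two predictions can disagree. A minimal sketch of a consistency check follows, using the standard International Tables numbering ranges (1-2 triclinic, 3-15 monoclinic, 16-74 orthorhombic, 75-142 tetragonal, 143-167 trigonal, 168-194 hexagonal, 195-230 cubic). The helper names `crystal_system_of` and `heads_agree` are hypothetical and not part of this repository, and the sketch assumes the names in `AlphaDiffract.CRYSTAL_SYSTEMS` match these spellings up to letter case.

```python
from bisect import bisect_left

# Highest space-group number belonging to each crystal system
# (International Tables numbering, 1..230).
_UPPER_BOUNDS = [2, 15, 74, 142, 167, 194, 230]
_SYSTEMS = [
    "Triclinic", "Monoclinic", "Orthorhombic", "Tetragonal",
    "Trigonal", "Hexagonal", "Cubic",
]


def crystal_system_of(space_group: int) -> str:
    """Return the crystal system that contains a space-group number."""
    if not 1 <= space_group <= 230:
        raise ValueError(f"space group must be in 1..230, got {space_group}")
    return _SYSTEMS[bisect_left(_UPPER_BOUNDS, space_group)]


def heads_agree(cs_name: str, space_group: int) -> bool:
    """True if a predicted crystal-system name is consistent with a space group."""
    return cs_name.strip().lower() == crystal_system_of(space_group).lower()
```

With the Quick Start variables, `heads_agree(AlphaDiffract.CRYSTAL_SYSTEMS[cs_probs.argmax().item()], sg_probs.argmax().item() + 1)` would flag patterns where the two heads contradict each other.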
|
|
| ## Files |
|
|
| | File | Description | |
| |---|---| |
| | `model.safetensors` | Model weights (safetensors format, ~35 MB) | |
| | `model.py` | Standalone model definition (pure PyTorch, no Lightning) | |
| | `config.json` | Architecture and training hyperparameters | |
| `maxsub.json` | Space-group subgroup graph (230×230, used as a registered buffer) | |
| | `example_inference.py` | End-to-end inference example | |
| | `LICENSE` | BSD 3-Clause | |
|
|
|
|
| ## Input Format |
|
|
| - **Length:** 8192 equally-spaced intensity values |
| - **2θ range:** 5–20° (monochromatic, 20 keV) |
| - **Preprocessing:** floor negatives at zero, then rescale to [0, 100] |
| - **Shape:** `(batch, 8192)` or `(batch, 1, 8192)` |
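
The requirements above can be sketched as a small NumPy helper that resamples a measured pattern onto the model grid, floors negatives, and rescales. This is an illustration under stated assumptions, not the repository's own preprocessing: `prepare_pattern` is a hypothetical name, the grid is assumed to include both the 5° and 20° endpoints, the rescaling reuses the min-max formula from Quick Start, and the input is assumed to have been measured at 20 keV already (no wavelength conversion is attempted).

```python
import numpy as np

N_POINTS = 8192
TWO_THETA_MIN, TWO_THETA_MAX = 5.0, 20.0  # degrees


def prepare_pattern(two_theta, intensity):
    """Resample onto the 8192-point grid and rescale to [0, 100].

    Returns a float32 array of shape (1, 8192).
    """
    two_theta = np.asarray(two_theta, dtype=np.float64)
    intensity = np.asarray(intensity, dtype=np.float64)

    # np.interp requires increasing x values.
    order = np.argsort(two_theta)

    # Assumption: equally spaced grid that includes both endpoints.
    grid = np.linspace(TWO_THETA_MIN, TWO_THETA_MAX, N_POINTS)

    # Linear interpolation; outside the measured range np.interp
    # holds the first/last measured value.
    resampled = np.interp(grid, two_theta[order], intensity[order])

    # Floor negatives at zero, then min-max rescale to [0, 100].
    resampled = np.clip(resampled, 0.0, None)
    lo, hi = resampled.min(), resampled.max()
    scaled = (resampled - lo) / (hi - lo + 1e-10) * 100.0

    return scaled.astype(np.float32)[None, :]
```

The result can be passed to `torch.from_numpy` directly, since it already carries the batch dimension.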
|
|
| ## Architecture |
|
|
| 1D ConvNeXt backbone adapted from [Liu et al. (2022)](https://arxiv.org/abs/2201.03545): |
|
|
| ``` |
| Input (8192) → [ConvNeXt Block × 3 with AvgPool] → Flatten (560-d) |
| ├─ CS head: MLP 560→2300→1150→7 (crystal system) |
| ├─ SG head: MLP 560→2300→1150→230 (space group) |
| └─ LP head: MLP 560→512→256→6 (lattice parameters, sigmoid-bounded) |
| ``` |
|
|
| - **Parameters:** 8,734,989 |
| - **Activation:** GELU |
| - **Stochastic depth:** 0.3 |
| - **Head dropout:** 0.5 |
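
As a rough sanity check on the figures above, the head sizes can be tallied by hand. The sketch below assumes each head is a plain stack of fully connected layers with biases and no normalization parameters, which this card does not confirm; `mlp_params` is a hypothetical helper. Under that assumption the three heads account for most of the 8,734,989 parameters, and storing every parameter as float32 comes to about 34.9 MB, in line with the ~35 MB weight file.

```python
def mlp_params(widths):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


# Layer widths taken from the architecture diagram.
heads = {
    "cs": mlp_params([560, 2300, 1150, 7]),
    "sg": mlp_params([560, 2300, 1150, 230]),
    "lp": mlp_params([560, 512, 256, 6]),
}

total_params = 8_734_989            # reported parameter count
head_params = sum(heads.values())   # valid only under the assumption above
size_mb = total_params * 4 / 1e6    # 4 bytes per float32 parameter

print(heads)
print("heads:", head_params, "remainder:", total_params - head_params)
print(f"float32 size: {size_mb:.1f} MB")
```

The remainder printed here is only what would be left for the backbone if the heads are built exactly as assumed.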
|
|
| ## Performance |
|
|
| This is a **single model** trained on Materials Project data only (no ICSD). |
| Metrics on the best validation checkpoint (epoch 11): |
|
|
| | Metric | Simulated Val | RRUFF (experimental) | |
| |---|---|---| |
| | Crystal system accuracy | 74.88% | 60.35% | |
| | Space group accuracy | 57.31% | 38.28% | |
| Lattice parameter MAE | 2.71 | – | |
|
|
| The paper reports higher numbers from a 10-model ensemble trained on |
| Materials Project + ICSD combined data. This open-weights release covers |
| only publicly available training data. |
|
|
| ## Training Details |
|
|
| Setting | Value | |
| |---|---| |
| | **Data** | ~146k Materials Project structures, 100 GSAS-II simulations each | |
| | **Augmentation** | Poisson + Gaussian noise, rescaled to [0, 100] | |
| | **Optimizer** | AdamW (lr=2e-4, weight_decay=0.01) | |
| | **Scheduler** | CyclicLR (triangular2, 6-epoch half-cycles) | |
| | **Loss** | CE (crystal system) + CE + GEMD (space group) + MSE (lattice params) | |
| **Hardware** | 1× NVIDIA H100, float32 | |
| | **Batch size** | 64 | |
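
To make the scheduler row concrete, here is a minimal sketch of the standard `triangular2` cyclic schedule, in which the learning rate ramps linearly between a base and a peak value and the amplitude halves after every full cycle. The table gives only `lr=2e-4`; treating that as the peak and using a base of `2e-5` are assumptions for illustration, as is counting steps in whole epochs. `triangular2_lr` is a hypothetical helper, not code from this repository.

```python
import math


def triangular2_lr(step, base_lr, max_lr, step_size):
    """Learning rate at `step` for a triangular2 cyclic schedule.

    `step_size` is the length of one half-cycle (ramp up or ramp down).
    """
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    scale = 1.0 / (2 ** (cycle - 1))  # amplitude halves every full cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x) * scale


# Assumed values: peak 2e-4 (from the table), base 2e-5 (hypothetical),
# half-cycle of 6 epochs, evaluated once per epoch.
for epoch in range(0, 25, 6):
    print(epoch, triangular2_lr(epoch, base_lr=2e-5, max_lr=2e-4, step_size=6))
```

With 6-epoch half-cycles, the first peak falls at epoch 6 and the second, at half the amplitude, at epoch 18.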
| |
| ## Citation |
| |
| ```bibtex |
| @article{andrejevic2026alphadiffract, |
| title = {AlphaDiffract: Automated Crystallographic Analysis of Powder X-ray Diffraction Data}, |
| author = {Andrejevic, Nina and Du, Ming and Sharma, Hemant and Horwath, James P. and Luo, Aileen and Yin, Xiangyu and Prince, Michael and Toby, Brian H. and Cherukara, Mathew J.}, |
| year = {2026}, |
| eprint = {2603.23367}, |
| archivePrefix = {arXiv}, |
| primaryClass = {cond-mat.mtrl-sci}, |
| doi = {10.48550/arXiv.2603.23367}, |
| url = {https://arxiv.org/abs/2603.23367} |
| } |
| ``` |
| |
| ## License |
| |
| BSD 3-Clause. Copyright 2026 UChicago Argonne, LLC. |
| |
| ## Links |
| |
| - [arXiv: 2603.23367](https://arxiv.org/abs/2603.23367) |
| - [GitHub: OpenAlphaDiffract](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract) |
| - [GitHub: AlphaDiffract](https://github.com/AdvancedPhotonSource/AlphaDiffract) |
| |