---
license: bsd-3-clause
language:
- en
tags:
- pytorch
- materials-science
- crystallography
- x-ray-diffraction
- pxrd
- convnext
- arxiv:2603.23367
datasets:
- materials-project
metrics:
- accuracy
- mae
pipeline_tag: other
---

# Open AlphaDiffract

[arXiv](https://arxiv.org/abs/2603.23367) | [GitHub](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract)

**Automated crystallographic analysis of powder X-ray diffraction data.**

AlphaDiffract is a multi-task 1D ConvNeXt model that takes a powder X-ray diffraction (PXRD) pattern and simultaneously predicts:

| Output | Description |
|---|---|
| **Crystal system** | 7-class classification (Triclinic → Cubic) |
| **Space group** | 230-class classification |
| **Lattice parameters** | 6 values: a, b, c (Å), α, β, γ (°) |

This release contains a **single model** trained exclusively on [Materials Project](https://next-gen.materialsproject.org/) structures (publicly available data). It is *not* the 10-model ensemble reported in the paper; see [Performance](#performance) for details.

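The two classification outputs are linked by the standard International Tables numbering, in which every space-group number belongs to exactly one crystal system. The sketch below is not part of the released `model.py`: the helper name `crystal_system_from_space_group` and the label strings are illustrative assumptions, and the labels may be spelled or ordered differently in `AlphaDiffract.CRYSTAL_SYSTEMS`. It can serve as a sanity check that the predicted crystal system and the predicted space group agree.

```python
from bisect import bisect_left

# Inclusive upper bound of each crystal system's space-group number range
# (International Tables for Crystallography numbering, 1..230).
_UPPER_BOUNDS = [2, 15, 74, 142, 167, 194, 230]

# Assumed label strings; compare against AlphaDiffract.CRYSTAL_SYSTEMS before relying on them.
_SYSTEMS = [
    "Triclinic",
    "Monoclinic",
    "Orthorhombic",
    "Tetragonal",
    "Trigonal",
    "Hexagonal",
    "Cubic",
]


def crystal_system_from_space_group(number: int) -> str:
    """Return the crystal system that contains a space-group number (1-based)."""
    if not 1 <= number <= 230:
        raise ValueError(f"space-group number must be in 1..230, got {number}")
    # bisect_left finds the first range whose upper bound is >= number.
    return _SYSTEMS[bisect_left(_UPPER_BOUNDS, number)]
```

For example, space group 225 (Fm-3m) maps to Cubic. If the crystal-system head and the space-group head disagree under this mapping, the prediction is worth a closer look.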
## Quick Start

```bash
pip install torch numpy safetensors huggingface_hub
huggingface-cli download linked-liszt/OpenAlphaDiffract --local-dir OpenAlphaDiffract
```

```python
import torch
import numpy as np

from OpenAlphaDiffract.model import AlphaDiffract

# Load the weights from the directory created by the download step above.
model = AlphaDiffract.from_pretrained("OpenAlphaDiffract", device="cpu")

# 8192-point intensity pattern: floor negatives at zero, then rescale to [0, 100]
# (see Input Format).
pattern = np.load("my_pattern.npy").astype(np.float32)
pattern = np.clip(pattern, 0.0, None)
pattern = (pattern - pattern.min()) / (pattern.max() - pattern.min() + 1e-10) * 100.0
x = torch.from_numpy(pattern).unsqueeze(0)  # shape (1, 8192)

with torch.no_grad():
    out = model(x)

cs_probs = torch.softmax(out["cs_logits"], dim=-1)
sg_probs = torch.softmax(out["sg_logits"], dim=-1)
lp = out["lp"]  # [a, b, c, alpha, beta, gamma]

print("Crystal system:", AlphaDiffract.CRYSTAL_SYSTEMS[cs_probs.argmax().item()])
print("Space group: #", sg_probs.argmax().item() + 1)  # class index 0 is space group 1
print("Lattice params:", lp[0].tolist())
```

See `example_inference.py` for a complete runnable example.

## Files

| File | Description |
|---|---|
| `model.safetensors` | Model weights (safetensors format, ~35 MB) |
| `model.py` | Standalone model definition (pure PyTorch, no Lightning) |
| `config.json` | Architecture and training hyperparameters |
| `maxsub.json` | Space-group subgroup graph (230×230, used as a registered buffer) |
| `example_inference.py` | End-to-end inference example |
| `LICENSE` | BSD 3-Clause |

## Input Format

- **Length:** 8192 equally spaced intensity values
- **2θ range:** 5–20° (monochromatic, 20 keV)
- **Preprocessing:** floor negatives at zero, then rescale to [0, 100]
- **Shape:** `(batch, 8192)` or `(batch, 1, 8192)`

## Architecture

1D ConvNeXt backbone adapted from [Liu et al. (2022)](https://arxiv.org/abs/2201.03545):

```
Input (8192) → [ConvNeXt Block × 3 with AvgPool] → Flatten (560-d)
  ├─ CS head: MLP 560→2300→1150→7    (crystal system)
  ├─ SG head: MLP 560→2300→1150→230  (space group)
  └─ LP head: MLP 560→512→256→6      (lattice parameters, sigmoid-bounded)
```

- **Parameters:** 8,734,989
- **Activation:** GELU
- **Stochastic depth:** 0.3
- **Head dropout:** 0.5

## Performance

This is a **single model** trained on Materials Project data only (no ICSD). Metrics on the best validation checkpoint (epoch 11):

| Metric | Simulated Val | RRUFF (experimental) |
|---|---|---|
| Crystal system accuracy | 74.88% | 60.35% |
| Space group accuracy | 57.31% | 38.28% |
| Lattice parameter MAE | 2.71 | n/a |

The paper reports higher numbers from a 10-model ensemble trained on combined Materials Project + ICSD data. This open-weights release covers only publicly available training data.

## Training Details

| Setting | Value |
|---|---|
| **Data** | ~146k Materials Project structures, 100 GSAS-II simulations each |
| **Augmentation** | Poisson + Gaussian noise, rescaled to [0, 100] |
| **Optimizer** | AdamW (lr=2e-4, weight_decay=0.01) |
| **Scheduler** | CyclicLR (triangular2, 6-epoch half-cycles) |
| **Loss** | CE (crystal system) + CE + GEMD (space group) + MSE (lattice params) |
| **Hardware** | 1× NVIDIA H100, float32 |
| **Batch size** | 64 |

## Citation

```bibtex
@article{andrejevic2026alphadiffract,
  title         = {AlphaDiffract: Automated Crystallographic Analysis of Powder X-ray Diffraction Data},
  author        = {Andrejevic, Nina and Du, Ming and Sharma, Hemant and Horwath, James P. and Luo, Aileen and Yin, Xiangyu and Prince, Michael and Toby, Brian H. and Cherukara, Mathew J.},
  year          = {2026},
  eprint        = {2603.23367},
  archivePrefix = {arXiv},
  primaryClass  = {cond-mat.mtrl-sci},
  doi           = {10.48550/arXiv.2603.23367},
  url           = {https://arxiv.org/abs/2603.23367}
}
```

## License

BSD 3-Clause. Copyright 2026 UChicago Argonne, LLC.

## Links

- [arXiv: 2603.23367](https://arxiv.org/abs/2603.23367)
- [GitHub: OpenAlphaDiffract](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract)
- [GitHub: AlphaDiffract](https://github.com/AdvancedPhotonSource/AlphaDiffract)
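
## Preprocessing Sketch

The Input Format section specifies an 8192-point pattern over 5–20° 2θ at 20 keV, with negatives floored at zero and intensities rescaled to [0, 100]. The sketch below shows one way to apply that recipe and to recover the 2θ and d-spacing axes. It is not taken from `example_inference.py`: the function names are hypothetical, the min-max form of the rescaling is assumed from the Quick Start snippet, and including both endpoints of the 2θ grid is also an assumption.

```python
import numpy as np

N_POINTS = 8192                 # pattern length (Input Format)
TWO_THETA_RANGE = (5.0, 20.0)   # degrees (Input Format)
ENERGY_KEV = 20.0               # monochromatic beam energy (Input Format)
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def preprocess_pattern(intensity):
    """Floor negatives at zero, then rescale intensities to [0, 100]."""
    pattern = np.asarray(intensity, dtype=np.float32)
    if pattern.shape != (N_POINTS,):
        raise ValueError(f"expected shape ({N_POINTS},), got {pattern.shape}")
    pattern = np.clip(pattern, 0.0, None)
    lo, hi = pattern.min(), pattern.max()
    # Assumption: min-max rescaling, as in the Quick Start snippet.
    scaled = (pattern - lo) / (hi - lo + 1e-10) * 100.0
    return scaled.astype(np.float32)


def two_theta_axis():
    """2θ value in degrees for each point (assumes both endpoints are included)."""
    return np.linspace(TWO_THETA_RANGE[0], TWO_THETA_RANGE[1], N_POINTS)


def d_spacing(two_theta_deg, energy_kev=ENERGY_KEV):
    """Bragg's law, d = λ / (2 sin θ), with λ = hc / E in Å."""
    wavelength = HC_KEV_ANGSTROM / energy_kev
    theta = np.radians(np.asarray(two_theta_deg, dtype=np.float64) / 2.0)
    return wavelength / (2.0 * np.sin(theta))
```

At 20 keV the wavelength is about 0.62 Å, so the 5–20° window corresponds to d-spacings from roughly 7.1 Å down to roughly 1.8 Å. Patterns measured at a different wavelength would need to be mapped onto this grid before inference.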