
license: mit
tags:
  - crystal-generation
  - diffusion-model
  - materials-science
  - probe-gradient-guidance
library_name: pytorch

Crystalite 10K (Alex-MP-20)

Crystalite checkpoint trained for 10K steps on the full Alex-MP-20 dataset (540K structures, 97.9% metals). This is the diversity-optimized model used for the Pareto sweep experiments.

Architecture: 67.8M-parameter Diffusion Transformer with a subatomic tokenizer and a GEM attention bias (Crystalite, Hadzi Veljkovic et al.).

Key results with probe-gradient guidance

| Guidance weight | In-window (4-6 eV) | Uniqueness | Metal % |
|---|---|---|---|
| 0 (baseline) | 0.1% | 99.7% | 96.9% |
| 10 | 31.8% | 99.7% | 0.1% |
| 15 | 33.7% | 99.6% | 0.0% |
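
The "In-window" column can be read as the fraction of generated structures whose band gap lies in the 4-6 eV target range. The following is a minimal sketch of that metric, not the evaluation code behind the table. The function name, the inclusive bounds, and the example values are assumptions made for illustration; how the band gaps are obtained is not specified here.

```python
def in_window_fraction(band_gaps_ev, low=4.0, high=6.0):
    """Fraction of band gaps (in eV) inside [low, high].

    Inclusive bounds are an assumption; the source only says "4-6 eV".
    """
    if not band_gaps_ev:
        raise ValueError("need at least one band gap")
    hits = sum(1 for gap in band_gaps_ev if low <= gap <= high)
    return hits / len(band_gaps_ev)


# Hypothetical band gaps for illustration only, not model output.
example_gaps = [0.0, 0.3, 4.2, 5.9, 7.1]
fraction = in_window_fraction(example_gaps)
```

With these five toy values, two fall inside the window, so the fraction is 2/5.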

Every guidance weight Pareto-dominates the baseline. The sweep generated 18,432 structures in total: 6 guidance weights × 3 seeds × 1,024 structures per batch. No mode collapse was observed.
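
The structure count follows directly from the sweep layout. A short sketch of the arithmetic, using only the numbers stated above (the constant and function names are illustrative, not taken from the codebase):

```python
N_WEIGHTS = 6      # guidance weights in the sweep
N_SEEDS = 3        # random seeds per guidance weight
BATCH_SIZE = 1024  # structures generated per (weight, seed) batch


def sweep_size(n_weights, n_seeds, batch_size):
    """Total structures when each (weight, seed) pair yields one batch."""
    return n_weights * n_seeds * batch_size


per_weight = N_SEEDS * BATCH_SIZE                      # 3072 structures per weight
total = sweep_size(N_WEIGHTS, N_SEEDS, BATCH_SIZE)     # 18432 structures overall
```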

Band gap probe AUROC: 0.957 (256 parameters, trained on atom-mean hidden states).
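
To make the probe description concrete, the sketch below shows atom-mean pooling, a linear probe, and a pairwise AUROC computation in plain Python. It is an illustration under stated assumptions, not the training script: one plausible reading of "256 parameters" is a bias-free linear probe over a 256-dimensional hidden state, but the hidden size, the absence of a bias term, and the binary label being predicted are all assumptions here. The toy data uses 2 dimensions to stay readable.

```python
def atom_mean(hidden_states):
    """Average per-atom hidden vectors into one vector per structure."""
    n_atoms = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(atom[d] for atom in hidden_states) / n_atoms for d in range(dim)]


def probe_score(weights, pooled):
    """Bias-free linear probe (assumed form): one weight per hidden dimension."""
    return sum(w * x for w, x in zip(weights, pooled))


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. O(n_pos * n_neg), fine for a small sketch.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical per-atom hidden states (2-dim) for four structures.
structures = [
    [[1.0, 0.0], [3.0, 0.0]],
    [[0.0, 1.0]],
    [[4.0, 2.0], [0.0, 0.0]],
    [[0.0, 0.0], [2.0, -4.0]],
]
toy_weights = [1.0, -1.0]   # illustrative probe weights, not trained values
toy_labels = [0, 0, 1, 1]   # illustrative binary labels

toy_scores = [probe_score(toy_weights, atom_mean(s)) for s in structures]
toy_auroc = auroc(toy_scores, toy_labels)
```

Pooling before the probe keeps the parameter count independent of the number of atoms in a structure, which is consistent with a probe as small as the one reported.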

Usage

Requires the Crystalite codebase and probe-gradient-guidance scripts.

```python
# load_model is provided by the probe-gradient-guidance scripts.
from scripts.train_probe import load_model

model = load_model("final.pt", device="cuda")
```

Used In

This checkpoint was used as an upstream generation asset in the open-world environment pipeline for Training Scientific Judgment with Verified Environments for Autonomous Science.

Scientific judgment blog post: Training Scientific Judgment
Public repo: Dynamical-Systems-Research/training-scientific-judgment
Paper PDF: Training Scientific Judgment with Verified Environments for Autonomous Science
