Chreode — pretrained backbone

Pretrained weights for Chreode, a one-step cell world model published at NeurIPS 2026.

Paper: arXiv:2605.28111 · Code: github.com/mufanq/Chreode

Files

| File | Stage | Architecture | Size |
|---|---|---|---|
| `vae.pt` | Stage 1 | scVI encoder–decoder; latent 128; hidden 512; 3 enc + 3 dec layers; Normal likelihood | 647 MB |
| `dynamics_dit.pt` | Stage 2 | Waddington-DiT (Small: hidden 384, depth 12, 6 heads, 4 register tokens); experiment `g2a_m10_wdit_time2vecu_lowfreqcurl_uncertainty_adamw` | 472 MB |
| `static_dit.pt` | Stage 2 control | Same architecture as `dynamics_dit.pt` but trained with reconstruction-only objective; used as the control arm for §5.3 (fate) and §5.4 (Norman) | 472 MB |

How to use

from huggingface_hub import snapshot_download
import torch

# Download every file in the model repo into the local Hugging Face cache.
ckpt_dir = snapshot_download(repo_id="WhenceFade/chreode-pretrained")

# weights_only=False unpickles arbitrary Python objects, so load only checkpoints you trust.
vae          = torch.load(f"{ckpt_dir}/vae.pt",          map_location="cpu", weights_only=False)
dynamics_dit = torch.load(f"{ckpt_dir}/dynamics_dit.pt", map_location="cpu", weights_only=False)
static_dit   = torch.load(f"{ckpt_dir}/static_dit.pt",   map_location="cpu", weights_only=False)

An end-to-end loader and a full latent → prediction example are in the companion GitHub repo; see reproduce/01_pretrain.md for the exact config and reproduce/00_setup.md for environment setup.
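This card does not say what kind of object each `.pt` file holds (a bare state dict, a wrapper dict, or a pickled module). The helper below is a minimal sketch for inspecting whatever `torch.load` returned before wiring it into a model. The name `summarize_checkpoint` and the `"state_dict"` key are assumptions for illustration, not part of the Chreode codebase. The function is duck-typed, so it does not import torch itself.

```python
def summarize_checkpoint(obj, max_rows=5):
    """Describe an object returned by torch.load without assuming its layout."""
    # Assumed convention: some checkpoints nest the weights under "state_dict".
    if isinstance(obj, dict) and isinstance(obj.get("state_dict"), dict):
        obj = obj["state_dict"]
    # A pickled module exposes state_dict(); use it when present.
    elif hasattr(obj, "state_dict") and callable(obj.state_dict):
        obj = obj.state_dict()

    if not isinstance(obj, dict):
        return {"type": type(obj).__name__, "n_tensors": 0, "n_params": 0, "preview": []}

    # Keep only tensor-like values (anything with .shape and .numel()).
    tensors = {k: v for k, v in obj.items() if hasattr(v, "shape") and hasattr(v, "numel")}
    n_params = sum(v.numel() for v in tensors.values())
    preview = [(k, tuple(v.shape)) for k, v in list(tensors.items())[:max_rows]]
    return {"type": "dict", "n_tensors": len(tensors), "n_params": n_params, "preview": preview}
```

Calling `summarize_checkpoint(vae)` after the loading snippet returns the number of tensor entries, the total parameter count, and the first few parameter names with their shapes. Comparing those against the architecture column in the Files table is a quick sanity check.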

Pretraining data

2,477,217 mouse embryonic cells from 7 public datasets, 10 leaf trajectories, 88 sampled timepoints (0 → 19 dpf).
Gene vocabulary: 16,520 mouse–human 1:1 orthologs (Ensembl BioMart, confidence=1).
Preprocessing: normalize_total(1e4) + log1p. Cached preprocessing artifacts: WhenceFade/chreode-phase0.
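The preprocessing names match scanpy's `pp.normalize_total` and `pp.log1p`, although this card does not state which library was used. The sketch below shows the same arithmetic in plain Python for a single cell, plus a hypothetical alignment step that reorders a new dataset's genes to a fixed vocabulary. The function names, the gene symbols in the usage note, and the choice to normalize after restricting to the vocabulary are all assumptions. Check the cached artifacts in WhenceFade/chreode-phase0 for the actual order of operations.

```python
import math

def align_to_vocab(counts, vocab):
    """Reorder one cell's {gene: count} mapping to vocabulary order; absent genes become 0."""
    return [float(counts.get(gene, 0.0)) for gene in vocab]

def normalize_log1p(row, target_sum=1e4):
    """Scale a cell so its counts sum to target_sum, then apply log(1 + x)."""
    total = sum(row)
    if total == 0:
        # An empty cell stays all-zero instead of dividing by zero.
        return [0.0] * len(row)
    scale = target_sum / total
    return [math.log1p(x * scale) for x in row]

def vocab_coverage(genes, vocab):
    """Fraction of vocabulary genes that are present in a dataset's gene list."""
    vocab_set = set(vocab)
    return len(vocab_set & set(genes)) / len(vocab_set)
```

For example, with a toy vocabulary `["GeneA", "GeneB", "GeneC"]`, the cell `{"GeneB": 3, "GeneA": 1, "GeneX": 5}` aligns to `[1.0, 3.0, 0.0]`. `GeneX` is dropped because it is outside the vocabulary. Low `vocab_coverage` on a new atlas is a warning sign that many inputs will be zero-filled.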

Training recipe

| | Stage 1 (VAE) | Stage 2 (W-DiT) |
|---|---|---|
| Steps | 1,678 (≈ 2 epochs) | 3,356 |
| Batch | 4,096 | 512 |
| Optimizer | Adam (scvi-tools defaults) | AdamW β=(0.9, 0.95), wd=0.01 |
| LR | scvi defaults | 3 × 10⁻⁴, 5% cosine warmup |
| Loss | ELBO (Normal) | MMD + Sinkhorn W₂ + drift + downhill (1 : 1 : 1 : 0.1) |
| Hardware | 1 × A100 | 1 × A100 |
| Wall-clock | ≈ 12 h | ≈ 18 h |
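As a rough illustration of the Stage 2 column, the sketch below combines the four loss terms with the stated 1 : 1 : 1 : 0.1 weights and computes a learning-rate schedule. The schedule shape is an assumption: "5% cosine warmup" is read here as a linear warmup over the first 5% of steps followed by cosine decay to zero, which may differ from the config in reproduce/01_pretrain.md. The function and key names are hypothetical.

```python
import math

PEAK_LR = 3e-4
TOTAL_STEPS = 3356
WARMUP_FRAC = 0.05
# Weights in the order given in the recipe: MMD, Sinkhorn W2, drift, downhill.
LOSS_WEIGHTS = {"mmd": 1.0, "sinkhorn_w2": 1.0, "drift": 1.0, "downhill": 0.1}

def lr_at(step, total=TOTAL_STEPS, peak=PEAK_LR, warmup_frac=WARMUP_FRAC):
    """Assumed schedule: linear warmup to peak, then cosine decay to zero."""
    warmup = max(1, int(total * warmup_frac))
    if step < warmup:
        return peak * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return 0.5 * peak * (1.0 + math.cos(math.pi * min(1.0, progress)))

def total_loss(terms, weights=LOSS_WEIGHTS):
    """Weighted sum of already-computed scalar loss terms."""
    return sum(weights[name] * terms[name] for name in weights)
```

Under this reading the warmup lasts `int(3356 * 0.05) = 167` steps, and the downhill term contributes one tenth as much per unit as the other three.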

Reported metrics (paper Tables 1–7)

When this backbone is plugged into the downstream evaluation in mufanq/Chreode:

| Task | Metric | Chreode | Best baseline |
|---|---|---|---|
| Weinreb d6 fine-tune | Sinkhorn W₂ ↓ | 1.688 ± 0.036 | PI-SDE 1.840 |
| Veres avg t1–t7 fine-tune | Sinkhorn W₂ ↓ | 2.617 | PI-SDE 2.830 |
| Weinreb fate zero-shot | Pearson r ↑ | 0.468 | scDiffEq 0.463 |
| Norman GEARS embedding replace | DE20 MSE ↓ | 0.18580 (−12.4%) | GEARS 0.21208 |
| Inference latency (A100 fp32 b1) | ms / NFE | 65 ms / 1 | PRESCIENT 194 / many |

Three downstream tasks include fine-tuning; the fate task is zero-shot.
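The −12.4% on the Norman row is consistent with a relative reduction against the GEARS baseline, (0.21208 − 0.18580) / 0.21208. The short sketch below applies the same formula to the other accuracy rows so the gaps can be compared on one scale. The helper name `relative_gain` is an illustration, and the paper may report improvements differently.

```python
def relative_gain(ours, baseline, lower_is_better=True):
    """Fractional improvement over the baseline; positive when ours is better."""
    diff = (baseline - ours) if lower_is_better else (ours - baseline)
    return diff / abs(baseline)

# (task, Chreode, best baseline, lower_is_better), copied from the table above.
rows = [
    ("Weinreb d6 W2", 1.688, 1.840, True),
    ("Veres avg W2", 2.617, 2.830, True),
    ("Weinreb fate r", 0.468, 0.463, False),
    ("Norman DE20 MSE", 0.18580, 0.21208, True),
]

for name, ours, base, lower in rows:
    print(f"{name}: {100 * relative_gain(ours, base, lower):+.1f}%")
```

Relative gains ignore variance. Only the Weinreb d6 row carries a reported spread (± 0.036), so small gaps on the other rows should be read with that in mind.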

Intended use

Predict population-level transitions $p(z_{t+\Delta} \mid z_t, \mathrm{do}(a))$ on single-cell transcriptomics, with a one-pass residual generator.
Use as a starting point for fine-tuning on new developmental or perturbation atlases that share the mouse–human 1:1 ortholog vocabulary.
Use as a gene-state embedding inside other perturbation predictors (e.g. GEARS).
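To make "one-pass" concrete, the toy sketch below contrasts a single-evaluation residual update with a multi-step Euler integration of a velocity field. It assumes that "residual generator" means the next latent is the current latent plus a predicted residual. This card does not spell that out, and the actual Waddington-DiT interface is in the GitHub repo. All function names here are hypothetical.

```python
import math

def one_step(z, delta, action, residual_fn):
    """One function evaluation: z_next = z + r(z, delta, action)."""
    residual = residual_fn(z, delta, action)
    return [zi + ri for zi, ri in zip(z, residual)]

def multi_step(z, delta, action, velocity_fn, n_steps):
    """Euler integration for contrast: n_steps function evaluations."""
    h = delta / n_steps
    for _ in range(n_steps):
        v = velocity_fn(z, action)
        z = [zi + h * vi for zi, vi in zip(z, v)]
    return z

# Toy dynamics dz/dt = -z, whose exact flow over delta is z * exp(-delta).
def toy_velocity(z, action):
    return [-zi for zi in z]

def toy_residual(z, delta, action):
    return [zi * (math.exp(-delta) - 1.0) for zi in z]
```

With these toy dynamics, `one_step([1.0, 2.0], 1.0, None, toy_residual)` reaches the exact endpoint in a single call, while `multi_step` approaches it only as `n_steps` grows. That is the trade-off the "ms / NFE" row in the metrics table refers to.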

Out-of-scope use

Not a general-purpose representation learner — for cell-type annotation, integration, or gene-network reasoning, prefer Geneformer / scGPT.
Trained only on mouse embryonic data. Cross-species transfer is mediated by 1:1 orthologs; adult-human tissues are out of distribution.
The fine-tuned Norman headline (DE20 MSE 0.18580) is a single-seed number; see reproduce/known_issues.md.

Bias, risks, and limitations

Training data is heavily biased toward early embryonic development; cell-state coverage in adult tissues is poor.
The model is a predictive generator, not a causal one, even though we condition on do(a) notationally. For mechanistic claims, treat predictions as hypotheses, not endpoints.
Atlas-level confounders in the training data (batch, lab, and donor heterogeneity) carry over into the latent space.

License

MIT — see the GitHub repository.

Citation

@inproceedings{qiu2026chreode,
  title     = {Chreode: A Cell World Model for One-Step Temporal Dynamics and Perturbation Prediction},
  author    = {Qiu, Mufan and Zheng, Genhui and Xu, Yinuo and Zhang, Ruichen and Ding, Ying and Long, Qi and Chen, Tianlong},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2026},
  eprint    = {2605.28111},
  archivePrefix = {arXiv},
  primaryClass  = {q-bio.QM}
}
