---
library_name: pytorch
license: mit
pipeline_tag: text-generation
tags:
- gdds
- discrete-diffusion
- language-modeling
- research
- pytorch
---

# GDDS Checkpoints

Official checkpoint bundle for the paper **Generalized Discrete Diffusion from Snapshots**.

Generalized Discrete Diffusion from Snapshots (GDDS) is a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Its training objective is based on snapshot latents rather than the entire noising path, which the paper presents as enabling efficient training and high-quality generation.

## Model Sources

- **Paper:** [Generalized Discrete Diffusion from Snapshots](https://huggingface.co/papers/2603.21342)
- **arXiv:** [2603.21342](https://arxiv.org/abs/2603.21342)
- **Code:** [GitHub - ozekri/gdds](https://github.com/ozekri/gdds)
- **Project Page:** [https://oussamazekri.fr/gdds](https://oussamazekri.fr/gdds)

## Included Checkpoints

| File | Method | Notes |
| --- | --- | --- |
| `checkpoints/gdds_gauss_500k.ckpt` | GDDS | 500k-step checkpoint with the Gaussian SIK forward process |
| `checkpoints/gdds_uniform_500k.ckpt` | GDDS | 500k-step checkpoint with the uniform forward process |
| `checkpoints/gdds_absorb_500k.ckpt` | GDDS | 500k-step checkpoint with the absorbing forward process |
| `checkpoints/mdlm_500k.ckpt` | MDLM | 500k-step baseline checkpoint |
| `checkpoints/udlm_500k.ckpt` | UDLM | 500k-step baseline checkpoint |
| `checkpoints/ar_500k.ckpt` | AR | 500k-step autoregressive baseline checkpoint |
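
Before running an evaluation, it can help to confirm which of the files above are present in a local copy of the bundle. The following is a minimal sketch, assuming the files were downloaded into a directory that keeps the `checkpoints/` layout from the table. The helper name `find_checkpoints`, the short keys, and the `bundle_root` argument are illustrative and not part of the `gdds` codebase.

```python
from pathlib import Path

# Relative paths exactly as listed in the table above.
# The short keys on the left are illustrative labels, not official names.
CHECKPOINTS = {
    "gdds_gauss": "checkpoints/gdds_gauss_500k.ckpt",
    "gdds_uniform": "checkpoints/gdds_uniform_500k.ckpt",
    "gdds_absorb": "checkpoints/gdds_absorb_500k.ckpt",
    "mdlm": "checkpoints/mdlm_500k.ckpt",
    "udlm": "checkpoints/udlm_500k.ckpt",
    "ar": "checkpoints/ar_500k.ckpt",
}


def find_checkpoints(bundle_root):
    """Map each short key to its local Path, or to None if the file is missing."""
    root = Path(bundle_root)
    found = {}
    for name, relative_path in CHECKPOINTS.items():
        path = root / relative_path
        found[name] = path if path.is_file() else None
    return found


if __name__ == "__main__":
    # Assumes the current directory is the root of the downloaded bundle.
    for name, path in find_checkpoints(".").items():
        print(f"{name:14s} {path if path else 'missing'}")
```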

## Usage

These files are PyTorch Lightning checkpoints intended to be used with the [`gdds`](https://github.com/ozekri/gdds) codebase.

```bash
git clone https://github.com/ozekri/gdds.git
cd gdds
pip install -r requirements.txt
pip install -e .

# Example evaluation using a checkpoint
PYTHONPATH=src python -m discrete_diffusion.evaluations.ppl_eval \
  data=openwebtext \
  model=small \
  algo=mdlm \
  eval.checkpoint_path=/path/to/checkpoints/mdlm_500k.ckpt
```

For sampling and other evaluations, use the same repository and pass the relevant checkpoint path through the Hydra evaluation config.
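
When evaluating several checkpoints, the Hydra overrides from the example command can be assembled programmatically. The sketch below only rebuilds the perplexity command shown above; `build_ppl_eval_command` is a hypothetical helper, and `algo=mdlm` is the only `algo` value confirmed here, so the values for the other checkpoints should be taken from the repository's configs.

```python
import shlex


def build_ppl_eval_command(checkpoint_path, algo, data="openwebtext", model="small"):
    """Return the perplexity-evaluation command as an argument list.

    The module path and override keys mirror the example command above.
    The helper itself is illustrative and not part of the gdds codebase.
    """
    return [
        "python",
        "-m",
        "discrete_diffusion.evaluations.ppl_eval",
        f"data={data}",
        f"model={model}",
        f"algo={algo}",
        f"eval.checkpoint_path={checkpoint_path}",
    ]


if __name__ == "__main__":
    cmd = build_ppl_eval_command("/path/to/checkpoints/mdlm_500k.ckpt", algo="mdlm")
    # Print a shell-ready line; run it from the root of the cloned repository.
    print("PYTHONPATH=src " + shlex.join(cmd))
```

Returning a list rather than a single string keeps paths with spaces intact if the command is later passed to `subprocess.run`.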

## Citation

```bibtex
@misc{zekri2026generalizeddiscretediffusionsnapshots,
  title={Generalized Discrete Diffusion from Snapshots},
  author={Oussama Zekri and Th{\'e}o Uscidda and Nicolas Boull{\'e} and Anna Korba},
  year={2026},
  eprint={2603.21342},
  archivePrefix={arXiv},
  primaryClass={stat.ML},
  url={https://arxiv.org/abs/2603.21342},
}
```