Update model card with configuration details
README.md
CHANGED
|
@@ -1,15 +1,15 @@
|
|
| 1 |
---
|
| 2 |
-
license:
|
| 3 |
library_name: pytorch
|
| 4 |
pipeline_tag: text-generation
|
| 5 |
tags:
|
| 6 |
-
- discrete-diffusion
|
| 7 |
-
- diffusion-language-model
|
| 8 |
-
- self-correction
|
| 9 |
-
- scdd
|
| 10 |
-
- icml-2026
|
| 11 |
datasets:
|
| 12 |
-
-
|
| 13 |
---
|
| 14 |
|
| 15 |
# SCDD
|
|
@@ -20,13 +20,39 @@ SCDD is a self-correcting discrete diffusion language model. It learns to revise
|
|
| 20 |
|
| 21 |
## Checkpoints
|
| 22 |
|
| 23 |
-
| File | Model | Uniform noise ratio |
|
| 24 |
-
| --- | --- | --- |
|
| 25 |
-
| `checkpoints/scdd_pu_0.1.ckpt` | SCDD (0.1) | `p_u = 0.1` |
|
| 26 |
-
| `checkpoints/scdd_pu_0.2.ckpt` | SCDD (0.2) | `p_u = 0.2` |
|
| 27 |
|
| 28 |
The checkpoint filenames intentionally use `scdd` naming for the public release.
|
| 29 |
|
| 30 |
## Code
|
| 31 |
|
| 32 |
Code and evaluation scripts are available at:
|
|
@@ -42,4 +68,4 @@ Code and evaluation scripts are available at:
|
|
| 42 |
journal={arXiv preprint arXiv:2603.02230},
|
| 43 |
year={2026}
|
| 44 |
}
|
| 45 |
-
```
|
|
|
|
| 1 |
---
|
| 2 |
+
license: other
|
| 3 |
library_name: pytorch
|
| 4 |
pipeline_tag: text-generation
|
| 5 |
tags:
|
| 6 |
+
- discrete-diffusion
|
| 7 |
+
- diffusion-language-model
|
| 8 |
+
- self-correction
|
| 9 |
+
- scdd
|
| 10 |
+
- icml-2026
|
| 11 |
datasets:
|
| 12 |
+
- openwebtext
|
| 13 |
---
|
| 14 |
|
| 15 |
# SCDD
|
|
|
|
| 20 |
|
| 21 |
## Checkpoints
|
| 22 |
|
| 23 |
+
| File | Config | Model | Uniform noise ratio |
|
| 24 |
+
| --- | --- | --- | --- |
|
| 25 |
+
| `checkpoints/scdd_pu_0.1.ckpt` | `configs/scdd_pu_0.1.yaml` | SCDD (0.1) | `p_u = 0.1` |
|
| 26 |
+
| `checkpoints/scdd_pu_0.2.ckpt` | `configs/scdd_pu_0.2.yaml` | SCDD (0.2) | `p_u = 0.2` |
|
| 27 |
|
| 28 |
The checkpoint filenames intentionally use `scdd` naming for the public release.
|
| 29 |
|
| 30 |
+
## Model configuration
|
| 31 |
+
|
| 32 |
+
Both checkpoints use the same GPT-2 scale DiT backbone and differ only in the SCDD uniform-noise ratio.
|
| 33 |
+
|
| 34 |
+
| Setting | Value |
|
| 35 |
+
| --- | --- |
|
| 36 |
+
| Backbone | DiT / `ddit` |
|
| 37 |
+
| Parameterization | `scdd` |
|
| 38 |
+
| Dataset | OpenWebText |
|
| 39 |
+
| Tokenizer | GPT-2 |
|
| 40 |
+
| Context length | 512 |
|
| 41 |
+
| Hidden size | 768 |
|
| 42 |
+
| Number of blocks | 12 |
|
| 43 |
+
| Number of attention heads | 12 |
|
| 44 |
+
| Conditional dimension | 128 |
|
| 45 |
+
| Dropout | 0.0 |
|
| 46 |
+
| Diffusion steps used in training grid | 1000 |
|
| 47 |
+
| Forward process | `mix` |
|
| 48 |
+
| `gamma` schedule-shape parameter | 1 |
|
| 49 |
+
| Uniform-noise peak time | `t_peak = 0.5` |
|
| 50 |
+
| EMA | 0.9999 |
|
| 51 |
+
| Optimizer | Adam-style optimizer, lr `5e-4`, weight decay `0.02` |
|
| 52 |
+
| Precision | bfloat16 |
|
| 53 |
+
|
| 54 |
+
See `configs/scdd_pu_0.1.yaml` and `configs/scdd_pu_0.2.yaml` for sanitized public configuration files.
|
| 55 |
+
|
| 56 |
## Code
|
| 57 |
|
| 58 |
Code and evaluation scripts are available at:
|
|
|
|
| 68 |
journal={arXiv preprint arXiv:2603.02230},
|
| 69 |
year={2026}
|
| 70 |
}
|
| 71 |
+
```
|
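The "Model configuration" table added in this change can be mirrored as a small Python structure, which makes the claim that the two checkpoints differ only in the uniform-noise ratio easy to check. This is a minimal sketch: the values are copied from the table, but the class name `SCDDConfig`, its field names, the `RELEASES` mapping, and the `differing_fields` helper are hypothetical and are not taken from the released code or from the YAML files, whose actual keys may differ.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SCDDConfig:
    """Settings copied from the model card table; field names are illustrative."""

    backbone: str = "ddit"
    parameterization: str = "scdd"
    dataset: str = "openwebtext"
    tokenizer: str = "gpt2"
    context_length: int = 512
    hidden_size: int = 768
    n_blocks: int = 12
    n_heads: int = 12
    cond_dim: int = 128
    dropout: float = 0.0
    diffusion_steps: int = 1000  # size of the training time grid
    forward_process: str = "mix"
    gamma: float = 1.0  # schedule-shape parameter
    t_peak: float = 0.5  # uniform-noise peak time
    ema: float = 0.9999
    lr: float = 5e-4
    weight_decay: float = 0.02
    precision: str = "bfloat16"
    p_u: float = 0.1  # uniform noise ratio, the one setting that varies

    @property
    def head_dim(self) -> int:
        # Per-head width implied by hidden size and head count.
        return self.hidden_size // self.n_heads


# Checkpoint path -> (config path, settings), following the Checkpoints table.
RELEASES = {
    "checkpoints/scdd_pu_0.1.ckpt": ("configs/scdd_pu_0.1.yaml", SCDDConfig(p_u=0.1)),
    "checkpoints/scdd_pu_0.2.ckpt": ("configs/scdd_pu_0.2.yaml", SCDDConfig(p_u=0.2)),
}


def differing_fields(a: SCDDConfig, b: SCDDConfig) -> list[str]:
    """Return the sorted names of fields whose values differ between a and b."""
    da, db = asdict(a), asdict(b)
    return sorted(key for key in da if da[key] != db[key])


if __name__ == "__main__":
    cfg_01 = RELEASES["checkpoints/scdd_pu_0.1.ckpt"][1]
    cfg_02 = RELEASES["checkpoints/scdd_pu_0.2.ckpt"][1]
    print(differing_fields(cfg_01, cfg_02))
    print(cfg_01.head_dim)
```

With the values from the table, the only differing field is `p_u`, and the implied per-head width is 768 / 12 = 64. The sanitized YAML files remain the authoritative source for the released settings.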