laaaarrywang/SCDD

@@ -1,15 +1,15 @@
 ---
-license: mit
 library_name: pytorch
 pipeline_tag: text-generation
 tags:
-- discrete-diffusion
-- diffusion-language-model
-- self-correction
-- scdd
-- icml-2026
 datasets:
-- Skylion007/openwebtext
 ---
 # SCDD
@@ -20,13 +20,39 @@ SCDD is a self-correcting discrete diffusion language model. It learns to revise
 ## Checkpoints
-| File | Model | Uniform noise ratio |
-| --- | --- | --- |
-| `checkpoints/scdd_pu_0.1.ckpt` | SCDD (0.1) | `p_u = 0.1` |
-| `checkpoints/scdd_pu_0.2.ckpt` | SCDD (0.2) | `p_u = 0.2` |
 The checkpoint filenames intentionally use `scdd` naming for the public release.
 ## Code
 Code and evaluation scripts are available at:
@@ -42,4 +68,4 @@ Code and evaluation scripts are available at:
   journal={arXiv preprint arXiv:2603.02230},
   year={2026}
 }
-```

 ---
+license: other
 library_name: pytorch
 pipeline_tag: text-generation
 tags:
+  - discrete-diffusion
+  - diffusion-language-model
+  - self-correction
+  - scdd
+  - icml-2026
 datasets:
+  - openwebtext
 ---
 # SCDD
 ## Checkpoints
+| File | Config | Model | Uniform noise ratio |
+| --- | --- | --- | --- |
+| `checkpoints/scdd_pu_0.1.ckpt` | `configs/scdd_pu_0.1.yaml` | SCDD (0.1) | `p_u = 0.1` |
+| `checkpoints/scdd_pu_0.2.ckpt` | `configs/scdd_pu_0.2.yaml` | SCDD (0.2) | `p_u = 0.2` |
 The checkpoint filenames intentionally use `scdd` naming for the public release.
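The pairing of checkpoint and config files in the table above follows one naming pattern, which can be sketched as a small lookup. This is a minimal illustration, not code from the repository: the helper name `release_files` and the `RELEASED_P_U` tuple are hypothetical, and only the two paths per row come from the table.

```python
from pathlib import PurePosixPath

# Uniform-noise ratios listed in the checkpoint table (kept as strings so the
# file names are reproduced exactly).
RELEASED_P_U = ("0.1", "0.2")


def release_files(p_u: str) -> dict:
    """Return the checkpoint and config paths listed for a given p_u.

    Hypothetical helper: it only rebuilds the paths shown in the table and
    does not touch the file system or download anything.
    """
    if p_u not in RELEASED_P_U:
        raise ValueError(f"no released checkpoint for p_u = {p_u}")
    stem = f"scdd_pu_{p_u}"
    return {
        "checkpoint": PurePosixPath("checkpoints") / f"{stem}.ckpt",
        "config": PurePosixPath("configs") / f"{stem}.yaml",
    }


if __name__ == "__main__":
    for p_u in RELEASED_P_U:
        files = release_files(p_u)
        print(p_u, files["checkpoint"], files["config"])
```

Keeping the ratio as a string avoids float formatting surprises when the value is interpolated into a file name.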
+## Model configuration
+Both checkpoints use the same GPT-2-scale DiT backbone and differ only in the SCDD uniform-noise ratio `p_u`.
+| Setting | Value |
+| --- | --- |
+| Backbone | DiT / `ddit` |
+| Parameterization | `scdd` |
+| Dataset | OpenWebText |
+| Tokenizer | GPT-2 |
+| Context length | 512 |
+| Hidden size | 768 |
+| Number of blocks | 12 |
+| Number of attention heads | 12 |
+| Conditional dimension | 128 |
+| Dropout | 0.0 |
+| Diffusion steps used in training grid | 1000 |
+| Forward process | `mix` |
+| `gamma` schedule-shape parameter | 1 |
+| Uniform-noise peak time | `t_peak = 0.5` |
+| EMA | 0.9999 |
+| Optimizer | Adam-style, learning rate `5e-4`, weight decay `0.02` |
+| Precision | bfloat16 |
+See `configs/scdd_pu_0.1.yaml` and `configs/scdd_pu_0.2.yaml` for sanitized public configuration files.
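As a rough illustration of the statement that the two checkpoints share one backbone and differ only in `p_u`, the table can be mirrored in a small dataclass. This is a sketch under stated assumptions: the class name `SCDDBackboneConfig` and its field names are hypothetical and need not match the keys in the released YAML files; only the values are taken from the table.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SCDDBackboneConfig:
    """Hypothetical container mirroring the model configuration table."""

    context_length: int = 512
    hidden_size: int = 768
    n_blocks: int = 12
    n_heads: int = 12
    cond_dim: int = 128
    dropout: float = 0.0
    p_u: float = 0.1  # uniform-noise ratio, the only per-checkpoint setting

    def __post_init__(self):
        # Multi-head attention needs the hidden size to split evenly.
        if self.hidden_size % self.n_heads != 0:
            raise ValueError("hidden_size must be divisible by n_heads")

    @property
    def head_dim(self) -> int:
        """Per-head width implied by the table (768 / 12)."""
        return self.hidden_size // self.n_heads


# One config per released checkpoint.
configs = {p_u: SCDDBackboneConfig(p_u=p_u) for p_u in (0.1, 0.2)}

# Fields whose values differ between the two configs.
a, b = asdict(configs[0.1]), asdict(configs[0.2])
differing = [key for key in a if a[key] != b[key]]

if __name__ == "__main__":
    print("head_dim:", configs[0.1].head_dim)
    print("differing fields:", differing)
```

With the table's values, each attention head is 64 wide, and `p_u` is the only field that changes between the two configurations.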
 ## Code
 Code and evaluation scripts are available at:
   journal={arXiv preprint arXiv:2603.02230},
   year={2026}
 }
+```