davanstrien (HF Staff) committed
Commit 297f63d · verified · 1 Parent(s): f7bc385

v2 results: add parameter-matched Gemma-4-26B MoE arm to ledger; update standfirst claim accordingly

index.html CHANGED
@@ -205,8 +205,9 @@
  </p>
  <p class="standfirst" style="margin-top:.5rem">
  <b>How to use it:</b> pick a passage below (or paste your own), press <b>Correct this text</b>, and watch the
- correction emerge step by step. On a 75‑passage benchmark the diffusion model corrected <em>more accurately</em>
- than the autoregressive baseline, and roughly 8× faster.
+ correction emerge step by step. On a 75‑passage benchmark the most accurate corrector was the
+ parameter‑matched autoregressive model (Gemma‑4‑26B MoE, see the results ledger), but the diffusion
+ model came close, roughly <em>10× faster</em>, and beat the smaller AR baseline on both quality and speed.
  </p>
  <p class="standfirst" style="margin-top:.5rem; font-size:.88rem; font-style:italic; color:var(--ink-soft)">
  All experiments ran on <a href="https://huggingface.co/docs/hub/jobs">Hugging Face Jobs</a>
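The "roughly 10× faster" wording in the updated standfirst can be sanity-checked against the median s/passage and tok/s columns that this commit writes to results/summary.md. The sketch below only redoes that arithmetic; the figures are copied from the updated table, while the dictionary and function names are illustrative and not part of the repository.

```python
# Figures copied from the updated results/summary.md table in this commit.
# The variable and function names here are illustrative assumptions.
median_s_per_passage = {
    "DiffusionGemma 26B-A4B-it": 1.69,
    "Gemma-4-E4B-it": 15.33,
    "Gemma-4-26B-A4B-it (MoE)": 16.31,
}
tokens_per_s = {
    "DiffusionGemma 26B-A4B-it": 119.9,
    "Gemma-4-E4B-it": 12.9,
    "Gemma-4-26B-A4B-it (MoE)": 12.0,
}

DIFFUSION = "DiffusionGemma 26B-A4B-it"


def speedups(baseline: str) -> tuple[float, float]:
    """Return (latency speedup, throughput gain) of the diffusion model over a baseline."""
    latency = median_s_per_passage[baseline] / median_s_per_passage[DIFFUSION]
    throughput = tokens_per_s[DIFFUSION] / tokens_per_s[baseline]
    return latency, throughput


for baseline in ("Gemma-4-E4B-it", "Gemma-4-26B-A4B-it (MoE)"):
    latency, throughput = speedups(baseline)
    print(f"{baseline}: {latency:.1f}x by median latency, {throughput:.1f}x by tok/s")
# Gemma-4-E4B-it: 9.1x by median latency, 9.3x by tok/s
# Gemma-4-26B-A4B-it (MoE): 9.7x by median latency, 10.0x by tok/s
```

Both ratios land between about 9× and 10×, which is consistent with the hedged "roughly 10×" phrasing.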
results/per_passage_metrics.jsonl CHANGED
The diff for this file is too large to render. See raw diff
 
results/summary.md CHANGED
@@ -5,19 +5,18 @@ Passages: 75 · macro means over passages (micro CER in footnote)
  | Model | CER ↓ | WER ↓ | Rel. CER reduction ↑ | Over-correction ↓ | Fix rate ↑ | Median s/passage | tok/s |
  |---|---|---|---|---|---|---|---|
  | OCR input (uncorrected) | 0.066 | 0.215 | — | — | — | — | — |
- | DiffusionGemma 26B-A4B-it | 0.036 | 0.076 | 49.4% | 1.4% | 85.2% | 1.74 | 118.7 |
- | DiffusionGemma (OCR-seeded canvas) | 0.081 | 0.226 | -17.2% | 0.0% | 0.6% | 0.70 | 323.2 |
- | Gemma-4-E4B-it | 0.042 | 0.107 | 45.9% | 0.4% | 61.5% | 14.68 | 13.7 |
+ | DiffusionGemma 26B-A4B-it | 0.035 | 0.073 | 49.5% | 1.5% | 86.0% | 1.69 | 119.9 |
+ | Gemma-4-E4B-it | 0.042 | 0.107 | 45.9% | 0.4% | 61.5% | 15.33 | 12.9 |
+ | Gemma-4-26B-A4B-it (MoE) | 0.027 | 0.061 | 62.4% | 0.9% | 87.5% | 16.31 | 12.0 |
 
- Micro (corpus-level) CER — input: 0.062, DiffusionGemma 26B-A4B-it: 0.033, DiffusionGemma (OCR-seeded canvas): 0.075, Gemma-4-E4B-it: 0.038.
- Mean denoising steps, DiffusionGemma 26B-A4B-it: 10.1 (max 48).
- Mean denoising steps, DiffusionGemma (OCR-seeded canvas): 3.3 (max 48).
+ Micro (corpus-level) CER — input: 0.062, DiffusionGemma 26B-A4B-it: 0.032, Gemma-4-E4B-it: 0.038, Gemma-4-26B-A4B-it (MoE): 0.025.
+ Mean denoising steps, DiffusionGemma 26B-A4B-it: 9.5 (max 48).
 
  ## Config
 
  ```json
  {
- "date": "2026-06-10",
+ "date": "2026-06-11",
  "dataset": "bln600",
  "n": 75,
  "seed": 42,
@@ -29,7 +28,8 @@ Mean denoising steps, DiffusionGemma (OCR-seeded canvas): 3.3 (max 48).
  "generation": {
  "diffusiongemma": "generation_config defaults (entropy sampler), max_new_tokens=256",
  "diffusiongemma_canvas": "as diffusiongemma, but first canvas seeded with the OCR text via decoder_input_ids (random tail padding, seed 0)",
- "gemma4": "do_sample=False (greedy), max_new_tokens=256"
+ "gemma4": "do_sample=False (greedy), max_new_tokens=256",
+ "gemma4_moe": "do_sample=False (greedy), max_new_tokens=256"
  }
  }
  ```
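
The summary table reports macro means over passages, with micro (corpus-level) CER given in the footnote. As a minimal sketch of how those two aggregations differ, assuming CER is character-level Levenshtein distance divided by reference length (the repository's actual metric code is not shown in this diff, so the function names and toy strings below are hypothetical):

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def macro_cer(pairs: list[tuple[str, str]]) -> float:
    """Mean of per-passage CER; every passage counts equally."""
    return sum(levenshtein(hyp, ref) / len(ref) for hyp, ref in pairs) / len(pairs)


def micro_cer(pairs: list[tuple[str, str]]) -> float:
    """Total edits over total reference characters; long passages weigh more."""
    total_edits = sum(levenshtein(hyp, ref) for hyp, ref in pairs)
    total_chars = sum(len(ref) for _, ref in pairs)
    return total_edits / total_chars


# Toy (hypothesis, reference) pairs, invented for illustration only.
# References are assumed to be non-empty.
pairs = [("kitten", "sitting"), ("abc", "abc")]
print(f"macro CER: {macro_cer(pairs):.3f}")  # macro CER: 0.214
print(f"micro CER: {micro_cer(pairs):.3f}")  # micro CER: 0.300
```

Because the two averages weight passages differently, the macro and micro figures for the same model need not agree, which is why the ledger lists both.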