Spaces: davanstrien/diffusiongemma-ocr-correction

davanstrien (HF Staff) committed on Jun 11

Commit 297f63d · verified · 1 Parent(s): f7bc385

v2 results: add parameter-matched Gemma-4-26B MoE arm to ledger; update standfirst claim accordingly

Files changed (3)

index.html CHANGED

@@ -205,8 +205,9 @@
   </p>
   <p class="standfirst" style="margin-top:.5rem">
     <b>How to use it:</b> pick a passage below (or paste your own), press <b>Correct this text</b>, and watch the
-    correction emerge step by step. On a 75‑passage benchmark the diffusion model corrected <em>more accurately</em>
-    than the autoregressive baseline — and roughly 8× faster.
+    correction emerge step by step. On a 75‑passage benchmark the most accurate corrector was the
+    parameter‑matched autoregressive model (Gemma‑4‑26B MoE, see the results ledger) — but the diffusion
+    model came close, roughly <em>10× faster</em>, and beat the smaller AR baseline on both quality and speed.
   </p>
   <p class="standfirst" style="margin-top:.5rem; font-size:.88rem; font-style:italic; color:var(--ink-soft)">
     All experiments ran on <a href="https://huggingface.co/docs/hub/jobs">Hugging Face Jobs</a>
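
The "roughly 10× faster" figure in the new standfirst can be cross-checked against the updated ledger in results/summary.md. A minimal sketch, assuming the claim compares the rows of that table: the numbers are copied from its new rows, while the `LEDGER` layout and the two helper functions are illustrative and not part of the repository.

```python
# Values copied from the updated rows of results/summary.md.
# The tuple layout (median seconds per passage, tokens per second) is an
# illustrative choice, not the repository's data format.
LEDGER = {
    "DiffusionGemma 26B-A4B-it": (1.69, 119.9),
    "Gemma-4-E4B-it": (15.33, 12.9),
    "Gemma-4-26B-A4B-it (MoE)": (16.31, 12.0),
}


def latency_speedup(baseline: str, candidate: str) -> float:
    """Ratio of median seconds per passage: baseline / candidate."""
    return LEDGER[baseline][0] / LEDGER[candidate][0]


def throughput_speedup(baseline: str, candidate: str) -> float:
    """Ratio of tokens per second: candidate / baseline."""
    return LEDGER[candidate][1] / LEDGER[baseline][1]


if __name__ == "__main__":
    diffusion = "DiffusionGemma 26B-A4B-it"
    for ar_model in ("Gemma-4-E4B-it", "Gemma-4-26B-A4B-it (MoE)"):
        print(
            f"{ar_model}: "
            f"{latency_speedup(ar_model, diffusion):.1f}x by latency, "
            f"{throughput_speedup(ar_model, diffusion):.1f}x by throughput"
        )
```

Against either AR model, both ratios land between 9× and 10×, which is consistent with the rounded figure in the standfirst.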

results/per_passage_metrics.jsonl CHANGED

The diff for this file is too large to render.

results/summary.md CHANGED

@@ -5,19 +5,18 @@ Passages: 75 · macro means over passages (micro CER in footnote)
 | Model | CER ↓ | WER ↓ | Rel. CER reduction ↑ | Over-correction ↓ | Fix rate ↑ | Median s/passage | tok/s |
 |---|---|---|---|---|---|---|---|
 | OCR input (uncorrected) | 0.066 | 0.215 | — | — | — | — | — |
-| DiffusionGemma 26B-A4B-it | 0.036 | 0.076 | 49.4% | 1.4% | 85.2% | 1.74 | 118.7 |
-| DiffusionGemma (OCR-seeded canvas) | 0.081 | 0.226 | -17.2% | 0.0% | 0.6% | 0.70 | 323.2 |
-| Gemma-4-E4B-it | 0.042 | 0.107 | 45.9% | 0.4% | 61.5% | 14.68 | 13.7 |
-Micro (corpus-level) CER — input: 0.062, DiffusionGemma 26B-A4B-it: 0.033, DiffusionGemma (OCR-seeded canvas): 0.075, Gemma-4-E4B-it: 0.038.
-Mean denoising steps, DiffusionGemma 26B-A4B-it: 10.1 (max 48).
-Mean denoising steps, DiffusionGemma (OCR-seeded canvas): 3.3 (max 48).
+| DiffusionGemma 26B-A4B-it | 0.035 | 0.073 | 49.5% | 1.5% | 86.0% | 1.69 | 119.9 |
+| Gemma-4-E4B-it | 0.042 | 0.107 | 45.9% | 0.4% | 61.5% | 15.33 | 12.9 |
+| Gemma-4-26B-A4B-it (MoE) | 0.027 | 0.061 | 62.4% | 0.9% | 87.5% | 16.31 | 12.0 |
+Micro (corpus-level) CER — input: 0.062, DiffusionGemma 26B-A4B-it: 0.032, Gemma-4-E4B-it: 0.038, Gemma-4-26B-A4B-it (MoE): 0.025.
+Mean denoising steps, DiffusionGemma 26B-A4B-it: 9.5 (max 48).
 ## Config
 ```json
 {
-  "date": "2026-06-10",
+  "date": "2026-06-11",
   "dataset": "bln600",
   "n": 75,
   "seed": 42,
@@ -29,7 +28,8 @@ Mean denoising steps, DiffusionGemma (OCR-seeded canvas): 3.3 (max 48).
   "generation": {
     "diffusiongemma": "generation_config defaults (entropy sampler), max_new_tokens=256",
     "diffusiongemma_canvas": "as diffusiongemma, but first canvas seeded with the OCR text via decoder_input_ids (random tail padding, seed 0)",
-    "gemma4": "do_sample=False (greedy), max_new_tokens=256"
+    "gemma4": "do_sample=False (greedy), max_new_tokens=256",
+    "gemma4_moe": "do_sample=False (greedy), max_new_tokens=256"
   }
 }
 ```
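
The ledger reports macro means over passages, with micro (corpus-level) CER in the footnote. The sketch below shows one common way to define those quantities. The scoring script is not part of this diff, so the function names and the normalisation (character edit distance divided by reference length) are assumptions rather than the repository's implementation.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(hypothesis: str, reference: str) -> float:
    """Character error rate of one passage (assumed normalisation)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(hypothesis, reference) / len(reference)


def macro_cer(pairs: List[Tuple[str, str]]) -> float:
    """Mean of per-passage CERs: every passage counts equally."""
    return sum(cer(h, r) for h, r in pairs) / len(pairs)


def micro_cer(pairs: List[Tuple[str, str]]) -> float:
    """Total edits over total reference characters: long passages weigh more."""
    edits = sum(levenshtein(h, r) for h, r in pairs)
    chars = sum(len(r) for _, r in pairs)
    return edits / chars


def relative_cer_reduction(cer_input: float, cer_output: float) -> float:
    """Fraction of the input error removed; negative if the output is worse."""
    return (cer_input - cer_output) / cer_input
```

If the "Rel. CER reduction" column is likewise a mean of per-passage reductions (an assumption), it need not equal the reduction computed from the mean CERs. For the DiffusionGemma row, (0.066 − 0.035) / 0.066 ≈ 47%, against the 49.5% reported.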