abidlabs HF Staff committed on
Commit 3703809 · 1 Parent(s): f8a8825

Add diffusion language model API example

README.md CHANGED
@@ -20,6 +20,9 @@ results + artifacts (trackio dashboards, jobs, models) over resumable SSE.
  The Space root serves the interactive API documentation. Endpoints live
  under `/v1` - start at `POST /v1/responses`.

+ The docs include verified public-endpoint examples, starting with short
+ research/model-selection tasks and extending to longer artifact-producing runs.
+
  Source: https://github.com/huggingface/ml-intern (see `docs/API.md`).

  Optional Space secrets:
api-docs/example-results/diffusion-language-models.json ADDED
@@ -0,0 +1,79 @@
+ {
+   "request": {
+     "slug": "diffusion-language-models",
+     "title": "Research a cutting-edge concept",
+     "model": "moonshotai/Kimi-K2.6:novita",
+     "input": "Research diffusion language models for text generation. In 5 concise bullets, explain what changed recently, why it matters, and cite 2 relevant Hugging Face paper pages or model pages if available. Keep under 300 words.",
+     "background": true,
+     "max_cost_usd": 3.0,
+     "metadata": {
+       "example_slug": "diffusion-language-models"
+     }
+   },
+   "response": {
+     "id": "resp_bf64687c764f4d509a134188390a2236",
+     "object": "response",
+     "created_at": 1781313817,
+     "completed_at": 1781313849,
+     "status": "completed",
+     "model": "moonshotai/Kimi-K2.6:novita",
+     "background": true,
+     "previous_response_id": null,
+     "session_id": "7f8b617e-640d-4aa8-a8df-1402037b7c71",
+     "max_cost_usd": 3.0,
+     "instructions": null,
+     "output": [
+       {
+         "type": "custom_tool_call",
+         "id": "functions.hf_papers:0",
+         "name": "hf_papers",
+         "input": "{\"operation\": \"search\", \"query\": \"diffusion language models text generation\", \"date_from\": \"2024-01-01\", \"limit\": 15}",
+ "output": "# Papers matching 'diffusion language models text generation'\nShowing 15 paper(s)\n\n## 1. Energy-Based Diffusion Language Models for Text Generation\n**arxiv_id:** 2410.21357 | **upvotes:** 3\nhttps://huggingface.co/papers/2410.21357\n**Keywords:** discrete diffusion models, parallel generation, Energy-based Diffusion Language Model, energy-based model, residual form\n**GitHub:** https://github.com/MinkaiXu/Energy-Diffusion-LLM (69 stars)\n**Summary:** Energy-based Diffusion Language Model improves diffusion-based generation by using an energy-based model with parameter optimization from autoregressive models and achieves faster sampling speeds with comparable performance to autoregressive models.\n\n## 2. Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models\n**arxiv_id:** 2604.02560 | **upvotes:** 0\nhttps://huggingface.co/papers/2604.02560\n**Keywords:** discrete diffusion language models, parallel decoding, distributional mismatch, joint conditional, per-token marginals\n**Summary:** Discrete diffusion language models accelerate text generation by parallel token unmasking, but this introduces distributional mismatch; a dependency predictor called DEMASK addresses this by identifying bounded cumulative dependency positions for simultaneous unmasking under sub-additivity assumptio...\n\n## 3. DPad: Efficient Diffusion Language Models with Suffix Dropout\n**arxiv_id:** 2508.14148 | **upvotes:** 0\nhttps://huggingface.co/papers/2508.14148\n**Keywords:** diffusion-based large language models, denoising process, attention, sliding window, distance-decay dropout\n**GitHub:** https://github.com/Crys-Chen/DPad (62 stars)\n**Summary:** Diffusion Scratchpad (DPad) enhances the efficiency of diffusion-based large language models by restricting attention to nearby tokens, achieving significant speedup without compromising accuracy.\n\n## 4. 
Unveiling the Potential of Diffusion Large Language Model in\n Controllable Generation\n**arxiv_id:** 2507.04504 | **upvotes:** 1\nhttps://huggingface.co/papers/2507.04504\n**Keywords:** diffusion models, autoregressive large language models, dLLMs, bidirectional attention mechanism, controllable generation\n**Summary:** A novel framework, S<sup>3</sup>, enhances diffusion language models for controllable text generation by improving context modeling, reducing hallucinations, and accelerating inference.\n\n## 5. FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion\n Language Models\n**arxiv_id:** 2509.20624 | **upvotes:** 1\nhttps://huggingface.co/papers/2509.20624\n**Summary:** Autoregressive language models (ARMs) deliver strong likelihoods, but are\ninherently serial: they generate one token per forward pass, which limits\nthroughput and inflates latency for long sequences. Diffusion Language Models\n(DLMs) parallelize across positions and thus appear promising for language...\n\n## 6. Text Generation with Diffusion Language Models: A Pre-training Approach\n with Continuous Paragraph Denoise\n**arxiv_id:** 2212.11685 | **upvotes:** 2\nhttps://huggingface.co/papers/2212.11685\n**Keywords:** diffusion language model, encoder, diffusion-based decoder, continuous paragraph denoise objective, semantic coherence\n**Summary:** GENIE, a diffusion-based language model with an encoder and diffusion decoder, achieves state-of-the-art performance in text generation by using a continuous paragraph denoise objective.\n\n## 7. 
A Survey on Parallel Text Generation: From Parallel Decoding to\n Diffusion Language Models\n**arxiv_id:** 2508.08712 | **upvotes:** 0\nhttps://huggingface.co/papers/2508.08712\n**Keywords:** Large Language Models, autoregressive, parallel text generation, token-by-token, inference efficiency\n**GitHub:** https://github.com/zhanglingzhe0820/Awesome-Parallel-Text-Generation (43 stars)\n**Summary:** A survey of parallel text generation methods categorizes and analyzes AR-based and Non-AR-based techniques, assessing their trade-offs and potential for improving inference efficiency in LLMs.\n\n## 8. CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block\n Prediction and Controllabl\u2026 [truncated 4109 chars]",
+         "status": "completed"
+       },
+       {
+         "type": "custom_tool_call",
+         "id": "functions.web_search:1",
+         "name": "web_search",
+         "input": "{\"query\": \"site:huggingface.co diffusion language model text generation\"}",
+ "output": "{\n \"query\": \"site:huggingface.co diffusion language model text generation\",\n \"results\": [\n \"Search results for 'site:huggingface.co diffusion language model text generation'. Include a Sources section in the final answer.\\n- [Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion ...](https://huggingface.co/blog/nvidia/nemotron-labs-diffusion)\\n- [Diffusion Language Models: The New Paradigm - Hugging Face](https://huggingface.co/blog/ProCreations/diffusion-language-model)\\n- [Text diffusion - a stereoplegic Collection - Hugging Face](https://huggingface.co/collections/stereoplegic/text-diffusion)\\n- [google/diffusiongemma-26B-A4B-it \\u00b7 Hugging Face](https://huggingface.co/google/diffusiongemma-26B-A4B-it)\\n- [Paper page - Diffusion-LM Improves Controllable Text Generation](https://huggingface.co/papers/2205.14217)\\n- [Paper page - Text Generation with Diffusion Language Models: A Pre ...](https://huggingface.co/papers/2212.11685)\\n- [Energy-Based Diffusion Language Models for Text Generation](https://huggingface.co/papers/2410.21357)\\n- [SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for ...](https://huggingface.co/papers/2210.17432)\",\n {\n \"tool_use_id\": \"functions.web_search:1\",\n \"content\": [\n {\n \"title\": \"Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion ...\",\n \"url\": \"https://huggingface.co/blog/nvidia/nemotron-labs-diffusion\"\n },\n {\n \"title\": \"Diffusion Language Models: The New Paradigm - Hugging Face\",\n \"url\": \"https://huggingface.co/blog/ProCreations/diffusion-language-model\"\n },\n {\n \"title\": \"Text diffusion - a stereoplegic Collection - Hugging Face\",\n \"url\": \"https://huggingface.co/collections/stereoplegic/text-diffusion\"\n },\n {\n \"title\": \"google/diffusiongemma-26B-A4B-it \\u00b7 Hugging Face\",\n \"url\": \"https://huggingface.co/google/diffusiongemma-26B-A4B-it\"\n },\n {\n \"title\": \"Paper page - Diffusion-LM 
Improves Controllable Text Generation\",\n \"url\": \"https://huggingface.co/papers/2205.14217\"\n },\n {\n \"title\": \"Paper page - Text Generation with Diffusion Language Models: A Pre ...\",\n \"url\": \"https://huggingface.co/papers/2212.11685\"\n },\n {\n \"title\": \"Energy-Based Diffusion Language Models for Text Generation\",\n \"url\": \"https://huggingface.co/papers/2410.21357\"\n },\n {\n \"title\": \"SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for ...\",\n \"url\": \"https://huggingface.co/papers/2210.17432\"\n }\n ]\n }\n ],\n \"durationSeconds\": 0.7180289310053922\n}",
+         "status": "completed"
+       },
+       {
+         "type": "message",
+         "id": "msg_2",
+         "role": "assistant",
+         "status": "completed",
+         "content": [
+           {
+             "type": "output_text",
+ "text": "**Diffusion language models (DLMs) have shifted from research curiosity to viable AR alternatives.** Google's **DiffusionGemma-26B** now matches autoregressive quality at scale, demonstrating that discrete diffusion can compete on standard benchmarks.\n\n**Few-step decoding has become practical.** Methods like **T3D** (trajectory self-distillation) and **IDLM** (inverse distillation) collapsed sampling from hundreds of steps to 4\u20138 while preserving quality. This removes the main adoption barrier\u2014DLMs are no longer prohibitively slow at inference.\n\n**Hybrid \"plan-and-fill\" architectures are now dominant.** Models like **CtrlDiff** and **FS-DFM** use autoregressive skeleton planning followed by parallel diffusion-based token filling. This semi-autoregressive approach captures AR coherence benefits with DLM parallelization speedups.\n\n**Controllability is emerging as the killer feature.** Bidirectional attention in DLMs enables fine-grained control over length, style, and structure mid-generation\u2014something causal AR models cannot easily do. Recent frameworks like **S\u00b3** exploit this for structured outputs and prompt adherence.\n\n**Reasoning capabilities are being unlocked.** Papers such as **d\u00b2** and the energy-based diffusion LLM show that with policy-gradient training and energy-based formulations, DLMs can match or exceed AR models on math and logical reasoning tasks, challenging the assumption that autoregression is necessary for complex reasoning.\n\n**Relevant Hugging Face pages:**\n- Model: [google/diffusiongemma-26B-A4B-it](https://huggingface.co/google/diffusiongemma-26B-A4B-it)\n- Paper: [T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation](https://huggingface.co/papers/2602.12262)"
+           }
+         ]
+       }
+     ],
+     "error": null,
+     "incomplete_details": null,
+     "usage": {
+       "session_id": "7f8b617e-640d-4aa8-a8df-1402037b7c71",
+       "total_usd": 0.0,
+       "inference_usd": 0.0,
+       "hf_jobs_estimated_usd": 0.0,
+       "sandbox_estimated_usd": 0.0,
+       "llm_calls": 3,
+       "hf_jobs_count": 0,
+       "sandbox_count": 0,
+       "prompt_tokens": 48470,
+       "completion_tokens": 1131,
+       "cache_read_tokens": 0,
+       "cache_creation_tokens": 0,
+       "total_tokens": 49601,
+       "hf_jobs_billable_seconds_estimate": 0,
+       "sandbox_billable_seconds_estimate": 0
+     },
+     "artifacts": [],
+     "metadata": {
+       "example_slug": "diffusion-language-models"
+     }
+   }
+ }
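
A saved example file like the one above can be checked with a few lines of Python. The following is a minimal sketch, not part of the commit: `SAMPLE` is a trimmed stand-in for the real file (long tool outputs omitted), and `summarize` is a hypothetical helper name. It lists the tools the agent called, extracts the final text, and derives the elapsed time from `created_at` and `completed_at`.

```python
import json

# Trimmed stand-in for the saved example file. Field names mirror the JSON
# in the diff; long tool outputs and the final answer text are omitted.
SAMPLE = """
{
  "response": {
    "id": "resp_bf64687c764f4d509a134188390a2236",
    "created_at": 1781313817,
    "completed_at": 1781313849,
    "status": "completed",
    "output": [
      {"type": "custom_tool_call", "name": "hf_papers", "status": "completed"},
      {"type": "custom_tool_call", "name": "web_search", "status": "completed"},
      {"type": "message", "role": "assistant",
       "content": [{"type": "output_text", "text": "final answer"}]}
    ],
    "usage": {"llm_calls": 3, "prompt_tokens": 48470,
              "completion_tokens": 1131, "total_tokens": 49601}
  }
}
"""


def summarize(doc: dict) -> dict:
    """Collect tool names, final text, and timing from one saved response."""
    resp = doc["response"]
    # Tool calls appear as output items of type "custom_tool_call".
    tools = [item["name"] for item in resp["output"]
             if item["type"] == "custom_tool_call"]
    # The answer lives in "output_text" parts of "message" items.
    texts = [part["text"]
             for item in resp["output"] if item["type"] == "message"
             for part in item["content"] if part["type"] == "output_text"]
    return {
        "id": resp["id"],
        "status": resp["status"],
        "tools": tools,
        "text": "\n".join(texts),
        "elapsed_s": resp["completed_at"] - resp["created_at"],
        "total_tokens": resp["usage"]["total_tokens"],
    }


summary = summarize(json.loads(SAMPLE))
print(summary["tools"], summary["elapsed_s"])
```

With the timestamps recorded in the example, `elapsed_s` is the difference between the two epoch values; it measures only the span between those two fields and says nothing about queueing or polling time on the client side.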
api-docs/index.html CHANGED
@@ -183,6 +183,23 @@ td:first-child code { white-space: nowrap; }
  }
  .callout b { color: var(--ink); }

+ /* examples */
+ .examples { display: grid; gap: 14px; margin-top: 14px; }
+ .example {
+   background: #fff; border: 1px solid var(--line); border-radius: 8px;
+   padding: 14px 16px;
+ }
+ .example h3 { margin: 0 0 6px; }
+ .example .meta {
+   display: flex; flex-wrap: wrap; gap: 6px; margin: 8px 0 10px;
+ }
+ .example .meta code {
+   border: 1px solid var(--line-soft); background: var(--bg-deep);
+   font-size: 11.5px; padding: 2px 7px; border-radius: 5px;
+ }
+ .example p { font-size: 13.5px; }
+ .example pre { white-space: pre-wrap; word-break: break-word; }
+
  /* ── terminal (animated example) ───────────────────── */
  .term { background: var(--terminal); border: 1px solid var(--terminal-edge); border-radius: 10px; overflow: hidden; margin: 18px 0 6px; }
  .term-bar { display: flex; align-items: center; gap: 6px; padding: 10px 14px; border-bottom: 1px solid var(--terminal-edge); }
@@ -218,6 +235,7 @@ td:first-child code { white-space: nowrap; }
  <h5>Getting started</h5>
  <a href="#overview">Overview</a>
  <a href="#authentication">Authentication</a>
+ <a href="#examples">Examples</a>
  <a href="#conventions">Conventions</a>
  <h5>Endpoints</h5>
  <a href="#create"><span class="m post">POST</span>/responses</a>
@@ -300,6 +318,33 @@ td:first-child code { white-space: nowrap; }
  lifetime only.</p>
  </section>

+ <!-- ════════ EXAMPLES ════════ -->
+ <section id="examples">
+ <h2>Examples <a class="anchor" href="#examples">#</a></h2>
+ <p>Verified against the public Space endpoint. Each example uses
+ <code>background: true</code>, then polls <code>GET /v1/responses/{id}</code> until terminal status.</p>
+
+ <div class="examples">
+ <article class="example">
+ <h3>Research a cutting-edge concept</h3>
+ <div class="meta">
+ <code>completed</code><code>43 s</code><code>hf_papers</code><code>resp_bf64687c764f4d509a134188390a2236</code>
+ </div>
+ <p><b>Task:</b> Research diffusion language models for text generation; explain recent changes and cite Hugging Face paper/model pages.</p>
+ <div class="codebox"><span class="lang">request</span><button class="copy" type="button">copy</button>
+ <pre>{
+ "model": "moonshotai/Kimi-K2.6:novita",
+ "input": "Research diffusion language models for text generation. In 5 concise bullets, explain what changed recently, why it matters, and cite 2 relevant Hugging Face paper pages or model pages if available. Keep under 300 words.",
+ "background": true,
+ "max_cost_usd": 3.0
+ }</pre></div>
+ <p><b>Result:</b> The agent found recent DLM work, identified few-step decoding and hybrid plan-and-fill architectures as key shifts, and cited
+ <a href="https://huggingface.co/google/diffusiongemma-26B-A4B-it" target="_blank" rel="noopener">google/diffusiongemma-26B-A4B-it</a> plus
+ <a href="https://huggingface.co/papers/2602.12262" target="_blank" rel="noopener">T3D on HF Papers</a>.</p>
+ </article>
+ </div>
+ </section>
+
  <!-- ════════ CONVENTIONS ════════ -->
  <section id="conventions">
  <h2>Conventions <a class="anchor" href="#conventions">#</a></h2>
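
The Examples section describes a submit-then-poll flow: send the request with `background: true`, then call `GET /v1/responses/{id}` until the status is terminal. The loop can be sketched in Python as below. This is an illustration under assumptions, not the project's client: `poll_until_terminal` and `fake_fetch` are hypothetical names, the set of terminal statuses beyond `completed` is a guess, and the intermediate status names `queued` and `in_progress` are invented for the offline demonstration. The HTTP call is injected as a `fetch` callable so the loop runs without network access.

```python
import time
from typing import Callable

# Assumed terminal statuses; the example in this commit only shows "completed".
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}


def poll_until_terminal(
    fetch: Callable[[str], dict],
    response_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(response_id) repeatedly until the status is terminal.

    `fetch` stands in for an HTTP GET on /v1/responses/{id} that returns
    the decoded JSON body. Raises TimeoutError once timeout_s has elapsed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch(response_id)
        if body.get("status") in TERMINAL_STATUSES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{response_id} still {body.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Offline demonstration: a fake fetch that walks through three statuses.
# "queued" and "in_progress" are placeholder names, not confirmed by the docs.
_fake_statuses = iter(["queued", "in_progress", "completed"])


def fake_fetch(response_id: str) -> dict:
    return {"id": response_id, "status": next(_fake_statuses)}


final = poll_until_terminal(fake_fetch, "resp_example", sleep=lambda s: None)
print(final["status"])
```

In real use, `fetch` would wrap an authenticated GET against the Space's `/v1/responses/{id}` route; passing it in keeps the retry logic separate from the HTTP library and makes the timeout path easy to test.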