Add diffusion language model API example
- README.md +3 -0
- api-docs/example-results/diffusion-language-models.json +79 -0
- api-docs/index.html +45 -0
README.md
CHANGED
|
@@ -20,6 +20,9 @@ results + artifacts (trackio dashboards, jobs, models) over resumable SSE.
|
|
| 20 |
The Space root serves the interactive API documentation. Endpoints live
|
| 21 |
under `/v1`: start at `POST /v1/responses`.
|
| 22 |
|
| 23 |
Source: https://github.com/huggingface/ml-intern (see `docs/API.md`).
|
| 24 |
|
| 25 |
Optional Space secrets:
|
|
|
|
| 20 |
The Space root serves the interactive API documentation. Endpoints live
|
| 21 |
under `/v1`: start at `POST /v1/responses`.
|
| 22 |
|
| 23 |
+
The docs include verified public-endpoint examples, starting with short
|
| 24 |
+
research/model-selection tasks and extending to longer artifact-producing runs.
|
| 25 |
+
|
| 26 |
Source: https://github.com/huggingface/ml-intern (see `docs/API.md`).
|
| 27 |
|
| 28 |
Optional Space secrets:
|
api-docs/example-results/diffusion-language-models.json
ADDED
|
@@ -0,0 +1,79 @@
|
| 1 |
+
{
|
| 2 |
+
"request": {
|
| 3 |
+
"slug": "diffusion-language-models",
|
| 4 |
+
"title": "Research a cutting-edge concept",
|
| 5 |
+
"model": "moonshotai/Kimi-K2.6:novita",
|
| 6 |
+
"input": "Research diffusion language models for text generation. In 5 concise bullets, explain what changed recently, why it matters, and cite 2 relevant Hugging Face paper pages or model pages if available. Keep under 300 words.",
|
| 7 |
+
"background": true,
|
| 8 |
+
"max_cost_usd": 3.0,
|
| 9 |
+
"metadata": {
|
| 10 |
+
"example_slug": "diffusion-language-models"
|
| 11 |
+
}
|
| 12 |
+
},
|
| 13 |
+
"response": {
|
| 14 |
+
"id": "resp_bf64687c764f4d509a134188390a2236",
|
| 15 |
+
"object": "response",
|
| 16 |
+
"created_at": 1781313817,
|
| 17 |
+
"completed_at": 1781313849,
|
| 18 |
+
"status": "completed",
|
| 19 |
+
"model": "moonshotai/Kimi-K2.6:novita",
|
| 20 |
+
"background": true,
|
| 21 |
+
"previous_response_id": null,
|
| 22 |
+
"session_id": "7f8b617e-640d-4aa8-a8df-1402037b7c71",
|
| 23 |
+
"max_cost_usd": 3.0,
|
| 24 |
+
"instructions": null,
|
| 25 |
+
"output": [
|
| 26 |
+
{
|
| 27 |
+
"type": "custom_tool_call",
|
| 28 |
+
"id": "functions.hf_papers:0",
|
| 29 |
+
"name": "hf_papers",
|
| 30 |
+
"input": "{\"operation\": \"search\", \"query\": \"diffusion language models text generation\", \"date_from\": \"2024-01-01\", \"limit\": 15}",
|
| 31 |
+
"output": "# Papers matching 'diffusion language models text generation'\nShowing 15 paper(s)\n\n## 1. Energy-Based Diffusion Language Models for Text Generation\n**arxiv_id:** 2410.21357 | **upvotes:** 3\nhttps://huggingface.co/papers/2410.21357\n**Keywords:** discrete diffusion models, parallel generation, Energy-based Diffusion Language Model, energy-based model, residual form\n**GitHub:** https://github.com/MinkaiXu/Energy-Diffusion-LLM (69 stars)\n**Summary:** Energy-based Diffusion Language Model improves diffusion-based generation by using an energy-based model with parameter optimization from autoregressive models and achieves faster sampling speeds with comparable performance to autoregressive models.\n\n## 2. Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models\n**arxiv_id:** 2604.02560 | **upvotes:** 0\nhttps://huggingface.co/papers/2604.02560\n**Keywords:** discrete diffusion language models, parallel decoding, distributional mismatch, joint conditional, per-token marginals\n**Summary:** Discrete diffusion language models accelerate text generation by parallel token unmasking, but this introduces distributional mismatch; a dependency predictor called DEMASK addresses this by identifying bounded cumulative dependency positions for simultaneous unmasking under sub-additivity assumptio...\n\n## 3. DPad: Efficient Diffusion Language Models with Suffix Dropout\n**arxiv_id:** 2508.14148 | **upvotes:** 0\nhttps://huggingface.co/papers/2508.14148\n**Keywords:** diffusion-based large language models, denoising process, attention, sliding window, distance-decay dropout\n**GitHub:** https://github.com/Crys-Chen/DPad (62 stars)\n**Summary:** Diffusion Scratchpad (DPad) enhances the efficiency of diffusion-based large language models by restricting attention to nearby tokens, achieving significant speedup without compromising accuracy.\n\n## 4. 
Unveiling the Potential of Diffusion Large Language Model in\n Controllable Generation\n**arxiv_id:** 2507.04504 | **upvotes:** 1\nhttps://huggingface.co/papers/2507.04504\n**Keywords:** diffusion models, autoregressive large language models, dLLMs, bidirectional attention mechanism, controllable generation\n**Summary:** A novel framework, S<sup>3</sup>, enhances diffusion language models for controllable text generation by improving context modeling, reducing hallucinations, and accelerating inference.\n\n## 5. FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion\n Language Models\n**arxiv_id:** 2509.20624 | **upvotes:** 1\nhttps://huggingface.co/papers/2509.20624\n**Summary:** Autoregressive language models (ARMs) deliver strong likelihoods, but are\ninherently serial: they generate one token per forward pass, which limits\nthroughput and inflates latency for long sequences. Diffusion Language Models\n(DLMs) parallelize across positions and thus appear promising for language...\n\n## 6. Text Generation with Diffusion Language Models: A Pre-training Approach\n with Continuous Paragraph Denoise\n**arxiv_id:** 2212.11685 | **upvotes:** 2\nhttps://huggingface.co/papers/2212.11685\n**Keywords:** diffusion language model, encoder, diffusion-based decoder, continuous paragraph denoise objective, semantic coherence\n**Summary:** GENIE, a diffusion-based language model with an encoder and diffusion decoder, achieves state-of-the-art performance in text generation by using a continuous paragraph denoise objective.\n\n## 7. 
A Survey on Parallel Text Generation: From Parallel Decoding to\n Diffusion Language Models\n**arxiv_id:** 2508.08712 | **upvotes:** 0\nhttps://huggingface.co/papers/2508.08712\n**Keywords:** Large Language Models, autoregressive, parallel text generation, token-by-token, inference efficiency\n**GitHub:** https://github.com/zhanglingzhe0820/Awesome-Parallel-Text-Generation (43 stars)\n**Summary:** A survey of parallel text generation methods categorizes and analyzes AR-based and Non-AR-based techniques, assessing their trade-offs and potential for improving inference efficiency in LLMs.\n\n## 8. CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block\n Prediction and Controllabl\u2026 [truncated 4109 chars]",
|
| 32 |
+
"status": "completed"
|
| 33 |
+
},
|
| 34 |
+
{
|
| 35 |
+
"type": "custom_tool_call",
|
| 36 |
+
"id": "functions.web_search:1",
|
| 37 |
+
"name": "web_search",
|
| 38 |
+
"input": "{\"query\": \"site:huggingface.co diffusion language model text generation\"}",
|
| 39 |
+
"output": "{\n \"query\": \"site:huggingface.co diffusion language model text generation\",\n \"results\": [\n \"Search results for 'site:huggingface.co diffusion language model text generation'. Include a Sources section in the final answer.\\n- [Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion ...](https://huggingface.co/blog/nvidia/nemotron-labs-diffusion)\\n- [Diffusion Language Models: The New Paradigm - Hugging Face](https://huggingface.co/blog/ProCreations/diffusion-language-model)\\n- [Text diffusion - a stereoplegic Collection - Hugging Face](https://huggingface.co/collections/stereoplegic/text-diffusion)\\n- [google/diffusiongemma-26B-A4B-it \\u00b7 Hugging Face](https://huggingface.co/google/diffusiongemma-26B-A4B-it)\\n- [Paper page - Diffusion-LM Improves Controllable Text Generation](https://huggingface.co/papers/2205.14217)\\n- [Paper page - Text Generation with Diffusion Language Models: A Pre ...](https://huggingface.co/papers/2212.11685)\\n- [Energy-Based Diffusion Language Models for Text Generation](https://huggingface.co/papers/2410.21357)\\n- [SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for ...](https://huggingface.co/papers/2210.17432)\",\n {\n \"tool_use_id\": \"functions.web_search:1\",\n \"content\": [\n {\n \"title\": \"Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion ...\",\n \"url\": \"https://huggingface.co/blog/nvidia/nemotron-labs-diffusion\"\n },\n {\n \"title\": \"Diffusion Language Models: The New Paradigm - Hugging Face\",\n \"url\": \"https://huggingface.co/blog/ProCreations/diffusion-language-model\"\n },\n {\n \"title\": \"Text diffusion - a stereoplegic Collection - Hugging Face\",\n \"url\": \"https://huggingface.co/collections/stereoplegic/text-diffusion\"\n },\n {\n \"title\": \"google/diffusiongemma-26B-A4B-it \\u00b7 Hugging Face\",\n \"url\": \"https://huggingface.co/google/diffusiongemma-26B-A4B-it\"\n },\n {\n \"title\": \"Paper page - Diffusion-LM Improves 
Controllable Text Generation\",\n \"url\": \"https://huggingface.co/papers/2205.14217\"\n },\n {\n \"title\": \"Paper page - Text Generation with Diffusion Language Models: A Pre ...\",\n \"url\": \"https://huggingface.co/papers/2212.11685\"\n },\n {\n \"title\": \"Energy-Based Diffusion Language Models for Text Generation\",\n \"url\": \"https://huggingface.co/papers/2410.21357\"\n },\n {\n \"title\": \"SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for ...\",\n \"url\": \"https://huggingface.co/papers/2210.17432\"\n }\n ]\n }\n ],\n \"durationSeconds\": 0.7180289310053922\n}",
|
| 40 |
+
"status": "completed"
|
| 41 |
+
},
|
| 42 |
+
{
|
| 43 |
+
"type": "message",
|
| 44 |
+
"id": "msg_2",
|
| 45 |
+
"role": "assistant",
|
| 46 |
+
"status": "completed",
|
| 47 |
+
"content": [
|
| 48 |
+
{
|
| 49 |
+
"type": "output_text",
|
| 50 |
+
"text": "**Diffusion language models (DLMs) have shifted from research curiosity to viable AR alternatives.** Google's **DiffusionGemma-26B** now matches autoregressive quality at scale, demonstrating that discrete diffusion can compete on standard benchmarks.\n\n**Few-step decoding has become practical.** Methods like **T3D** (trajectory self-distillation) and **IDLM** (inverse distillation) collapsed sampling from hundreds of steps to 4\u20138 while preserving quality. This removes the main adoption barrier\u2014DLMs are no longer prohibitively slow at inference.\n\n**Hybrid \"plan-and-fill\" architectures are now dominant.** Models like **CtrlDiff** and **FS-DFM** use autoregressive skeleton planning followed by parallel diffusion-based token filling. This semi-autoregressive approach captures AR coherence benefits with DLM parallelization speedups.\n\n**Controllability is emerging as the killer feature.** Bidirectional attention in DLMs enables fine-grained control over length, style, and structure mid-generation\u2014something causal AR models cannot easily do. Recent frameworks like **S\u00b3** exploit this for structured outputs and prompt adherence.\n\n**Reasoning capabilities are being unlocked.** Papers such as **d\u00b2** and the energy-based diffusion LLM show that with policy-gradient training and energy-based formulations, DLMs can match or exceed AR models on math and logical reasoning tasks, challenging the assumption that autoregression is necessary for complex reasoning.\n\n**Relevant Hugging Face pages:**\n- Model: [google/diffusiongemma-26B-A4B-it](https://huggingface.co/google/diffusiongemma-26B-A4B-it)\n- Paper: [T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation](https://huggingface.co/papers/2602.12262)"
|
| 51 |
+
}
|
| 52 |
+
]
|
| 53 |
+
}
|
| 54 |
+
],
|
| 55 |
+
"error": null,
|
| 56 |
+
"incomplete_details": null,
|
| 57 |
+
"usage": {
|
| 58 |
+
"session_id": "7f8b617e-640d-4aa8-a8df-1402037b7c71",
|
| 59 |
+
"total_usd": 0.0,
|
| 60 |
+
"inference_usd": 0.0,
|
| 61 |
+
"hf_jobs_estimated_usd": 0.0,
|
| 62 |
+
"sandbox_estimated_usd": 0.0,
|
| 63 |
+
"llm_calls": 3,
|
| 64 |
+
"hf_jobs_count": 0,
|
| 65 |
+
"sandbox_count": 0,
|
| 66 |
+
"prompt_tokens": 48470,
|
| 67 |
+
"completion_tokens": 1131,
|
| 68 |
+
"cache_read_tokens": 0,
|
| 69 |
+
"cache_creation_tokens": 0,
|
| 70 |
+
"total_tokens": 49601,
|
| 71 |
+
"hf_jobs_billable_seconds_estimate": 0,
|
| 72 |
+
"sandbox_billable_seconds_estimate": 0
|
| 73 |
+
},
|
| 74 |
+
"artifacts": [],
|
| 75 |
+
"metadata": {
|
| 76 |
+
"example_slug": "diffusion-language-models"
|
| 77 |
+
}
|
| 78 |
+
}
|
| 79 |
+
}
|
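The stored example result above pairs the original request with the final response object. As a rough illustration of how such a file could be read, the sketch below pulls out the tool names, the final answer text, the elapsed time, and a budget check. The function name `summarize_example` and the returned keys are hypothetical; only the JSON field names (`output`, `custom_tool_call`, `output_text`, `usage`, `max_cost_usd`, `created_at`, `completed_at`) come from the file shown in this diff.

```python
from typing import Any, Dict, List


def summarize_example(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Condense one example-results document into a few headline fields.

    Hypothetical helper: the field names mirror the sample JSON, but the
    summary shape is an illustration, not part of the API.
    """
    response = doc["response"]
    output: List[Dict[str, Any]] = response.get("output", [])

    # Tool calls appear as items of type "custom_tool_call" in the sample.
    tools = [item["name"] for item in output if item.get("type") == "custom_tool_call"]

    # The final answer is carried by "output_text" parts inside message items.
    text_parts = [
        part["text"]
        for item in output
        if item.get("type") == "message"
        for part in item.get("content", [])
        if part.get("type") == "output_text"
    ]

    # Both timestamps are epoch seconds in the sample; either may be absent.
    created = response.get("created_at")
    completed = response.get("completed_at")
    duration = completed - created if created is not None and completed is not None else None

    usage = response.get("usage", {})
    budget = response.get("max_cost_usd")
    return {
        "id": response["id"],
        "status": response["status"],
        "duration_s": duration,
        "tools": tools,
        "text": "\n".join(text_parts),
        "total_tokens": usage.get("total_tokens"),
        "within_budget": budget is None or usage.get("total_usd", 0.0) <= budget,
    }
```

To try it on the file added in this commit, load `api-docs/example-results/diffusion-language-models.json` with `json.load` and pass the resulting dict to `summarize_example`.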
api-docs/index.html
CHANGED
|
@@ -183,6 +183,23 @@ td:first-child code { white-space: nowrap; }
|
|
| 183 |
}
|
| 184 |
.callout b { color: var(--ink); }
|
| 185 |
|
| 186 |
/* ── terminal (animated example) ───────────────────── */
|
| 187 |
.term { background: var(--terminal); border: 1px solid var(--terminal-edge); border-radius: 10px; overflow: hidden; margin: 18px 0 6px; }
|
| 188 |
.term-bar { display: flex; align-items: center; gap: 6px; padding: 10px 14px; border-bottom: 1px solid var(--terminal-edge); }
|
|
@@ -218,6 +235,7 @@ td:first-child code { white-space: nowrap; }
|
|
| 218 |
<h5>Getting started</h5>
|
| 219 |
<a href="#overview">Overview</a>
|
| 220 |
<a href="#authentication">Authentication</a>
|
|
|
|
| 221 |
<a href="#conventions">Conventions</a>
|
| 222 |
<h5>Endpoints</h5>
|
| 223 |
<a href="#create"><span class="m post">POST</span>/responses</a>
|
|
@@ -300,6 +318,33 @@ td:first-child code { white-space: nowrap; }
|
|
| 300 |
lifetime only.</p>
|
| 301 |
</section>
|
| 302 |
|
| 303 |
<!-- ──────── CONVENTIONS ──────── -->
|
| 304 |
<section id="conventions">
|
| 305 |
<h2>Conventions <a class="anchor" href="#conventions">#</a></h2>
|
|
|
|
| 183 |
}
|
| 184 |
.callout b { color: var(--ink); }
|
| 185 |
|
| 186 |
+
/* examples */
|
| 187 |
+
.examples { display: grid; gap: 14px; margin-top: 14px; }
|
| 188 |
+
.example {
|
| 189 |
+
background: #fff; border: 1px solid var(--line); border-radius: 8px;
|
| 190 |
+
padding: 14px 16px;
|
| 191 |
+
}
|
| 192 |
+
.example h3 { margin: 0 0 6px; }
|
| 193 |
+
.example .meta {
|
| 194 |
+
display: flex; flex-wrap: wrap; gap: 6px; margin: 8px 0 10px;
|
| 195 |
+
}
|
| 196 |
+
.example .meta code {
|
| 197 |
+
border: 1px solid var(--line-soft); background: var(--bg-deep);
|
| 198 |
+
font-size: 11.5px; padding: 2px 7px; border-radius: 5px;
|
| 199 |
+
}
|
| 200 |
+
.example p { font-size: 13.5px; }
|
| 201 |
+
.example pre { white-space: pre-wrap; word-break: break-word; }
|
| 202 |
+
|
| 203 |
/* ── terminal (animated example) ───────────────────── */
|
| 204 |
.term { background: var(--terminal); border: 1px solid var(--terminal-edge); border-radius: 10px; overflow: hidden; margin: 18px 0 6px; }
|
| 205 |
.term-bar { display: flex; align-items: center; gap: 6px; padding: 10px 14px; border-bottom: 1px solid var(--terminal-edge); }
|
|
|
|
| 235 |
<h5>Getting started</h5>
|
| 236 |
<a href="#overview">Overview</a>
|
| 237 |
<a href="#authentication">Authentication</a>
|
| 238 |
+
<a href="#examples">Examples</a>
|
| 239 |
<a href="#conventions">Conventions</a>
|
| 240 |
<h5>Endpoints</h5>
|
| 241 |
<a href="#create"><span class="m post">POST</span>/responses</a>
|
|
|
|
| 318 |
lifetime only.</p>
|
| 319 |
</section>
|
| 320 |
|
| 321 |
+
<!-- ──────── EXAMPLES ──────── -->
|
| 322 |
+
<section id="examples">
|
| 323 |
+
<h2>Examples <a class="anchor" href="#examples">#</a></h2>
|
| 324 |
+
<p>Verified against the public Space endpoint. Each example uses
|
| 325 |
+
<code>background: true</code>, then polls <code>GET /v1/responses/{id}</code> until the response reaches a terminal status.</p>
|
| 326 |
+
|
| 327 |
+
<div class="examples">
|
| 328 |
+
<article class="example">
|
| 329 |
+
<h3>Research a cutting-edge concept</h3>
|
| 330 |
+
<div class="meta">
|
| 331 |
+
<code>completed</code><code>43 s</code><code>hf_papers</code><code>resp_bf64687c764f4d509a134188390a2236</code>
|
| 332 |
+
</div>
|
| 333 |
+
<p><b>Task:</b> Research diffusion language models for text generation; explain recent changes and cite Hugging Face paper/model pages.</p>
|
| 334 |
+
<div class="codebox"><span class="lang">request</span><button class="copy" type="button">copy</button>
|
| 335 |
+
<pre>{
|
| 336 |
+
"model": "moonshotai/Kimi-K2.6:novita",
|
| 337 |
+
"input": "Research diffusion language models for text generation. In 5 concise bullets, explain what changed recently, why it matters, and cite 2 relevant Hugging Face paper pages or model pages if available. Keep under 300 words.",
|
| 338 |
+
"background": true,
|
| 339 |
+
"max_cost_usd": 3.0
|
| 340 |
+
}</pre></div>
|
| 341 |
+
<p><b>Result:</b> The agent found recent DLM work, identified few-step decoding and hybrid plan-and-fill architectures as key shifts, and cited
|
| 342 |
+
<a href="https://huggingface.co/google/diffusiongemma-26B-A4B-it" target="_blank" rel="noopener">google/diffusiongemma-26B-A4B-it</a> plus
|
| 343 |
+
<a href="https://huggingface.co/papers/2602.12262" target="_blank" rel="noopener">T3D on HF Papers</a>.</p>
|
| 344 |
+
</article>
|
| 345 |
+
</div>
|
| 346 |
+
</section>
|
| 347 |
+
|
| 348 |
<!-- ──────── CONVENTIONS ──────── -->
|
| 349 |
<section id="conventions">
|
| 350 |
<h2>Conventions <a class="anchor" href="#conventions">#</a></h2>
|
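The Examples section added in `api-docs/index.html` describes a two-step flow: create a response with `background: true`, then poll `GET /v1/responses/{id}` until it reaches a terminal status. The sketch below shows one way a client might do that with the Python standard library. Several parts are assumptions rather than documented behavior: the helper names `http_json` and `run_in_background`, the set of terminal statuses (only `completed` appears in the sample result), the polling interval and timeout, and the absence of any authentication header.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

Json = Dict[str, Any]

# Assumption: only "completed" is confirmed by the sample; the rest are guesses.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "incomplete"})


def http_json(method: str, url: str, body: Optional[Json] = None) -> Json:
    """Send a JSON request with the standard library and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_in_background(
    base_url: str,
    payload: Json,
    request: Callable[[str, str, Optional[Json]], Json] = http_json,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Json:
    """Create a background response, then poll it until a terminal status."""
    # Step 1: POST /v1/responses with background forced on.
    response = request("POST", f"{base_url}/v1/responses", {**payload, "background": True})

    # Step 2: GET /v1/responses/{id} until the status is terminal or time runs out.
    deadline = time.monotonic() + timeout_s
    while response.get("status") not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"response {response.get('id')} still {response.get('status')}"
            )
        sleep(interval_s)
        response = request("GET", f"{base_url}/v1/responses/{response['id']}", None)
    return response
```

The `request` and `sleep` parameters are injectable so the loop can be exercised without network access, and so a caller who needs a token can supply a transport that adds the appropriate header. The request body from the example (`model`, `input`, `max_cost_usd`) would be passed as `payload`.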