Update README.md
README.md
CHANGED
@@ -1,3 +1,31 @@
----
-license: mit
----
+---
+license: mit
+---
+
+# Scaffold-and-Fill Diffusion (SF-Diff): A Hybrid Architecture for Accelerated Language Model Inference
+
+**Author:** Hilal Limo (Self-Taught Independent Researcher, Age 15)
+
+**[➡️ Click here to read the full paper: SF-Diff_Paper.pdf](SF-Diff_Paper.pdf)**
+
+---
+
+## Abstract
+
+Autoregressive transformer models, the dominant architecture for modern Large Language Models (LLMs), are fundamentally constrained by high inference latency due to their sequential generation process. In this paper, I propose Scaffold-and-Fill Diffusion (SF-Diff), a novel hybrid architecture designed to significantly accelerate text generation by deconstructing the task into two parallelizable stages. The core hypothesis is that natural language can be separated into a semantic "scaffolding" of keywords and a grammatical "filler" of structural words. SF-Diff first utilizes a non-autoregressive diffusion model to generate the complete semantic scaffold—a sequence of keyword vector embeddings—in a fixed number of highly parallelizable steps. Subsequently, a lightweight autoregressive transformer decoder performs a "grammatical infilling" task, weaving the structural words around the pre-generated semantic core. This approach aims to combine the holistic, parallel generation strengths of diffusion models with the grammatical precision of transformers, offering a substantial reduction in inference latency while maintaining high-quality, coherent output.
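Note on the abstract above: the scaffold/filler split it describes can be illustrated with a toy sketch. This is only an illustration of the data decomposition, not the method from the paper. The stop-word list `FILLER_WORDS` and the helpers `split_scaffold` and `infill` are hypothetical names introduced here. In SF-Diff as described, the scaffold would be produced by a diffusion model as keyword embeddings and the filler by an autoregressive decoder; here both are simply read off an existing sentence.

```python
# Toy illustration only: the stop-word list and helper names are assumptions,
# not taken from the SF-Diff paper or its code.
FILLER_WORDS = {"the", "a", "an", "of", "on", "in", "is", "and", "to"}


def split_scaffold(sentence):
    """Split a whitespace-tokenised sentence into keywords and structural words.

    Each token is kept with its position so the sentence can be rebuilt later.
    """
    tokens = sentence.lower().split()
    scaffold = [(i, t) for i, t in enumerate(tokens) if t not in FILLER_WORDS]
    filler = [(i, t) for i, t in enumerate(tokens) if t in FILLER_WORDS]
    return scaffold, filler


def infill(scaffold, filler):
    """Merge keywords and structural words back into their original order."""
    merged = sorted(scaffold + filler)
    return " ".join(tok for _, tok in merged)


scaffold, filler = split_scaffold("The cat sat on the mat")
keywords = [tok for _, tok in scaffold]
restored = infill(scaffold, filler)
print(keywords)  # ['cat', 'sat', 'mat']
print(restored)  # the cat sat on the mat
```

In this sketch the keyword list plays the role of the semantic scaffold, and `infill` stands in for the grammatical infilling stage. A real second stage would have to predict the structural words and their positions rather than look them up.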
+
+---
+
+## Citation
+
+If you find this work interesting, please consider citing the paper:
+
+```bibtex
+@misc{limo2025sfdiff,
+author = {Hilal Limo},
+title = {Scaffold-and-Fill Diffusion (SF-Diff): A Hybrid Architecture for Accelerated Language Model Inference},
+year = {2025},
+publisher = {Hugging Face},
+journal = {Hugging Face Hub},
+howpublished = {\url{https://huggingface.co/TimesLast/SF-Diff}}
+}