Update README.md
README.md
CHANGED
@@ -16,8 +16,6 @@ license: other
 
 # FS-DFM (Few-Step Discrete Flow-Matching)
 
-This repository provides **FS-DFM checkpoints** from the paper:
-
 **FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Model**
 Amin Karimi Monsefi, Nikhil Bhendawade, Manuel R. Ciosici, Dominic Culver, Yizhe Zhang, Irina Belousova (Jan 9, 2026)
 ArXiv: 2509.20624
@@ -31,7 +29,7 @@ FS-DFM is a **token-space diffusion / flow-matching language model** designed fo
 
 ### Checkpoint files
 - [`FS_DFM_checkpoint.pth`](FS_DFM_checkpoint.pth) — **FS-DFM 1.3B**, uniform source, **RK4 teacher distilled**
-- `DFM_checkpoint.pth` — **DFM 1.3B**, uniform source, DFM pretrained initialization
+- [`DFM_checkpoint.pth`](DFM_checkpoint.pth) — **DFM 1.3B**, uniform source, DFM pretrained initialization
 
 
 ---
@@ -78,18 +76,12 @@ Training/eval packing: documents packed into **1024-token** blocks (EOS appended
 
 ---
 
-##
-
-FS-DFM targets **long-horizon language modeling**. In the paper, **8-step sampling** is reported to reach **perplexity parity** with a **1024-step** discrete-flow baseline for **1024-token generation**, yielding up to **128× fewer model evaluations**.
-
-> For exact numbers/plots and ablations, refer to the paper.
-
----
-
-## How to use (recommended)
+## How to use
 
 FS-DFM uses custom discrete solvers and is not a drop-in `transformers` model. The intended usage is via the official training/evaluation scripts.
 
+> PLEASE SEE [OUR OFFICIAL GITHUB](https://github.com/apple/ml-fs-dfm/tree/main)
+
 ### 1) Install the official code
 ```bash
 git clone https://github.com/apple/ml-fs-dfm