Update README.md
README.md CHANGED

@@ -7,8 +7,8 @@ tags:
 - diffusion
 - efficiency
 - flash-decoding
-- qwen
 - diffusion-language-model
+- gpt-oss
 ---
 
 # gpt-oss-20b-DFlash
@@ -106,7 +106,7 @@ We use a **block size of 8 (7 draft tokens)** during speculation. DFlash consist
 
 The numbers reported are end-to-end speedup (including prefill time). You can specify a different block size during inference by passing the `--speculative-num-draft-tokens` argument when launching the server.
 
-The reasoning effort is set to medium for all tasks. Low reasoning effort will give even higher acceptance length.
+The reasoning effort is set to **medium** for all tasks. Low reasoning effort will give an even higher acceptance length.
 
 | | Math500 | GSM8K | HumanEval | MT-Bench |
 |----------------|----------|--------|------------|-----------|
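The `--speculative-num-draft-tokens` flag mentioned in the diff can be sketched as a launch command. This is a hedged sketch, not a verified command from the model card: the model path, port, and launcher module are illustrative assumptions patterned on a typical SGLang-style server launch; only `--speculative-num-draft-tokens` itself is named in the text, and its value follows from the stated block size of 8 (i.e., 7 draft tokens).

```shell
# Hypothetical launch: only --speculative-num-draft-tokens comes from the
# README text; the model path and other arguments are placeholders.
python -m sglang.launch_server \
  --model-path openai/gpt-oss-20b \
  --speculative-num-draft-tokens 7 \
  --port 30000
```

Passing a different value here changes the speculation block size and, per the text above, affects the reported end-to-end speedup.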