d3LLM-model committed on
Commit 5beaf72 · verified · 1 Parent(s): 098382b

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ pipeline_tag: text-generation

  ## Key Features

- - 🚀 High throughput: **5.0× faster** than autoregressive models (Qwen-2.5-7B-it) on H100 GPU, **3.5× faster** on A100 GPU. Achieves **288.73 tokens/s** on H100 (vs 57.32 for AR baseline).
+ - 🚀 High throughput: **5.0× faster** than autoregressive models (Qwen-2.5-7B-it) on H100 GPU, **3.5× faster** on A100 GPU. Achieves **288.73 tokens/s** on H100 (vs 57.32 for AR baseline) on GSM8K-CoT Dataset.
  - 📊 High AUP (Accuracy Under Parallelism) scores across benchmarks
  - 🔧 Optimized for coding and math reasoning tasks