d3LLM-model committed on
Commit 5beaf72 · verified · 1 Parent(s): 098382b

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ pipeline_tag: text-generation
 
  ## Key Features
 
- - 🚀 High throughput: **5.0× faster** than autoregressive models (Qwen-2.5-7B-it) on H100 GPU, **3.5× faster** on A100 GPU. Achieves **288.73 tokens/s** on H100 (vs 57.32 for AR baseline).
+ - 🚀 High throughput: **5.0× faster** than autoregressive models (Qwen-2.5-7B-it) on H100 GPU, **3.5× faster** on A100 GPU. Achieves **288.73 tokens/s** on H100 (vs 57.32 for AR baseline) on GSM8K-CoT Dataset.
  - 📊 High AUP (Accuracy Under Parallelism) scores across benchmarks
  - 🔧 Optimized for coding and math reasoning tasks
 
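
The H100 speedup quoted in the changed line can be checked against the two throughput figures it gives. This is a minimal sketch of that arithmetic only; the variable names are illustrative and do not come from the repository.

```python
# Throughput figures quoted in the README line (tokens per second on H100).
model_tps = 288.73        # throughput reported for the model
ar_baseline_tps = 57.32   # throughput reported for the AR baseline (Qwen-2.5-7B-it)

# Speedup is the ratio of the two throughputs.
speedup = model_tps / ar_baseline_tps
print(f"{speedup:.1f}x")  # prints "5.0x"
```

The ratio is about 5.04, which is consistent with the rounded **5.0×** figure in the feature list.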