Commit ad0382d (verified), committed by WilhelmT
1 Parent(s): 9f6e041

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -48,21 +48,21 @@ FlashHead matches the baseline **Llama-3.2-3B** within rounding on standard eval
 
  | **Precision** | **Tokens/sec** | **Speedup vs BF16** |
  |----------------|----------------|----------------------|
- | BF16 baseline | 130 | 1.0× |
- | **FlashHead (Embedl)** | **163** | **1.25×** |
- | W4A16 baseline | 278 | 2.14× |
- | **FlashHead W4A16 (Embedl)** | **485** | **3.73×** |
 
- FlashHead improves end-to-end speed by **1.75×** over state-of-the-art, while maintaining full accuracy parity.
 
  ---
 
  ## Accuracy (Parity with Baseline)
 
- | **Method** | **MMLU-Pro** | **HellaSwag** | **IFEval** | **BoolQ** | **BBH** | **TruthfulQA** | **GSM8K** |
  |-------------|---------------|----------------|--------------|-------------|-------------|----------------|--------------|
- | **Baseline** | 0.18 | 0.59 | 0.45 | 0.69 | 0.38 | 0.36 | 0.46 |
- | **FlashHead** | 0.18 | 0.59 | 0.45 | 0.69 | 0.38 | 0.36 | 0.46 |
 
  FlashHead matches baseline performance exactly across all evaluation benchmarks.
 
 
  | **Precision** | **Tokens/sec** | **Speedup vs BF16** |
  |----------------|----------------|----------------------|
+ | BF16 baseline | 54 | 1.0× |
+ | **FlashHead (Embedl)** | **58** | **1.07×** |
+ | W4A16 baseline | 141 | 2.61× |
+ | **FlashHead W4A16 (Embedl)** | **177** | **3.28×** |
 
+ FlashHead improves end-to-end speed by **1.26×** over state-of-the-art, while maintaining full accuracy parity.
 
  ---
 
  ## Accuracy (Parity with Baseline)
 
+ | **Method** | **MMLU-Pro** | **IFEval** | **BBH** | **TruthfulQA** | **GSM8K** |
  |-------------|---------------|----------------|--------------|-------------|-------------|
+ | **Baseline** | 0.31 | 0.57 | 0.57 | 0.57 | 0.77 |
+ | **FlashHead** | 0.31 | 0.56 | 0.57 | 0.58 | 0.77 |
 
  FlashHead matches baseline performance within rounding (at most 0.01 difference) across all evaluation benchmarks.
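The speedup figures in the updated table can be re-derived from the tokens/sec column. The sketch below is only a sanity check on that arithmetic: the `throughput` dictionary and `speedup` helper are illustrative names, not part of the README, and it assumes the headline **1.26×** compares FlashHead W4A16 against the W4A16 baseline, which the diff does not state explicitly.

```python
# Tokens/sec values copied from the "+" rows of the updated table.
throughput = {
    "BF16 baseline": 54,
    "FlashHead (Embedl)": 58,
    "W4A16 baseline": 141,
    "FlashHead W4A16 (Embedl)": 177,
}


def speedup(name, reference="BF16 baseline"):
    """Ratio of tokens/sec relative to a reference row, rounded to 2 decimals."""
    return round(throughput[name] / throughput[reference], 2)


# Speedup vs BF16 column.
for name in throughput:
    print(f"{name}: {speedup(name):.2f}x")

# Assumed reading of the headline claim: FlashHead W4A16 vs the W4A16 baseline.
headline = speedup("FlashHead W4A16 (Embedl)", reference="W4A16 baseline")
print(f"Headline speedup: {headline:.2f}x")
```

Under that assumption, the recomputed ratios agree with the values in the new table rows and with the revised headline sentence.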