tompoper committed on
Commit 93410e0 · verified · 1 Parent(s): 3f24d2b

Fix critical-path bandwidth figure: naive is 9.00 MB/token (not 16.50); 9.00/4.50=2.00x

  1. README.md +1 -1
README.md CHANGED
@@ -57,7 +57,7 @@ Rust.
  |---|---|
  | CPU decode throughput (~31B / 16-layer, Q4, 32 threads) | **5.94 tok/s** |
  | Effective memory bandwidth | 61 GB/s (30% of 204.8 GB/s peak) |
- | Bandwidth reduction from pipelining | **2.00x** (16.50 → 4.50 MB/token) |
+ | Bandwidth reduction from pipelining | **2.00x** (9.00 → 4.50 MB/token) |
  | Test perplexity (114M, TinyStories, 10K steps) | 6.50 |
  | Val perplexity (8.34B / 4-layer, TinyStories, 10K steps) | 4.52 |
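
The arithmetic behind this change can be checked with a short sketch. The function names `reduction_factor` and `utilization` are hypothetical and are not taken from the repository; the numbers are the ones quoted in the README table and the commit message.

```rust
/// Ratio of naive to pipelined MB moved per token (hypothetical helper).
fn reduction_factor(naive_mb: f64, pipelined_mb: f64) -> f64 {
    naive_mb / pipelined_mb
}

/// Fraction of peak memory bandwidth actually achieved (hypothetical helper).
fn utilization(effective_gbs: f64, peak_gbs: f64) -> f64 {
    effective_gbs / peak_gbs
}

fn main() {
    // Corrected figure from this commit: 9.00 -> 4.50 MB/token.
    println!("corrected: {:.2}x", reduction_factor(9.00, 4.50));
    // The previous 16.50 MB/token figure would not match the stated 2.00x.
    println!("previous figure implies: {:.2}x", reduction_factor(16.50, 4.50));
    // 61 GB/s effective against 204.8 GB/s peak.
    println!("utilization: {:.1}%", utilization(61.0, 204.8) * 100.0);
}
```

With the corrected 9.00 MB/token baseline the ratio is exactly 2.00x, which matches the bolded figure; the old 16.50 baseline would have implied roughly 3.67x.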