Commit b531a96
Parent(s): 526247c

remove visualization from toc
app/src/content/chapters/5-infrastructure.mdx
CHANGED
@@ -444,6 +444,9 @@ python examples/inference/benchmark/generate_data.py \
 
 With a trillion-parameter model you won't be generating billions of tokens per hour, but you don't need to. A few thousand high-quality reasoning traces from a frontier model can be worth more than millions of tokens from a smaller one.
 
+
+#### Visualizing Throughput
+
 To get an intuition for what these throughput numbers feel like, <FigRef target="inference-throughput" /> lets you pick a model and scale up the number of GPUs. Each page represents roughly 500 tokens of generated text. At high enough throughput, pages roll up into books (200 pages each).
 
 <HtmlEmbed
app/src/content/embeds/inference-throughput.html
CHANGED
@@ -6,7 +6,6 @@
 <style>
 * { margin: 0; padding: 0; box-sizing: border-box; }
 body { background: #fff; font-family: system-ui, sans-serif; display: flex; flex-direction: column; align-items: center; padding: 30px; }
-h2 { margin-bottom: 4px; }
 .subtitle { color: #666; margin-bottom: 20px; font-size: 14px; }
 .controls { display: flex; gap: 24px; align-items: flex-end; margin-bottom: 16px; flex-wrap: wrap; justify-content: center; }
 .control-group { display: flex; flex-direction: column; gap: 4px; }
@@ -23,7 +22,6 @@
 </style>
 </head>
 <body>
-<h2>GPU Page Generator</h2>
 <div class="subtitle">1 page ≈ 500 tokens ≈ 1,800 characters</div>
 
 <div class="controls">
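
Note on the page/book units used in the changed files: the conversion the embed relies on (1 page ≈ 500 tokens, 1 book = 200 pages) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the embed's actual code; the function name `pages_and_books` and the 10,000 tokens/second throughput are hypothetical and chosen for the example.

```python
TOKENS_PER_PAGE = 500  # from the chapter text: 1 page is roughly 500 tokens
PAGES_PER_BOOK = 200   # from the chapter text: 1 book is 200 pages


def pages_and_books(tokens_per_second: float, seconds: float = 3600.0) -> tuple[float, float]:
    """Convert a sustained throughput into pages and books over a time window."""
    tokens = tokens_per_second * seconds
    pages = tokens / TOKENS_PER_PAGE
    books = pages / PAGES_PER_BOOK
    return pages, books


# Hypothetical throughput of 10,000 tokens/second, measured over one hour.
pages, books = pages_and_books(10_000)
print(f"{pages:,.0f} pages/hour, {books:,.0f} books/hour")
# prints "72,000 pages/hour, 360 books/hour"
```

Keeping the two constants at module level makes it easy to check them against the subtitle text in `inference-throughput.html` if either value changes.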