finephrase

Running on CPU Upgrade

joelniklaus HF Staff commited on Feb 20

Commit

b531a96

1 Parent(s): 526247c

remove visualization from toc

Files changed (2) hide show

app/src/content/chapters/5-infrastructure.mdx CHANGED Viewed

@@ -444,6 +444,9 @@ python examples/inference/benchmark/generate_data.py \
 With a trillion-parameter model you won't be generating billions of tokens per hour, but you don't need to. A few thousand high-quality reasoning traces from a frontier model can be worth more than millions of tokens from a smaller one.
 To get an intuition for what these throughput numbers feel like, <FigRef target="inference-throughput" /> lets you pick a model and scale up the number of GPUs. Each page represents roughly 500 tokens of generated text. At high enough throughput, pages roll up into books (200 pages each).
 <HtmlEmbed

 With a trillion-parameter model you won't be generating billions of tokens per hour, but you don't need to. A few thousand high-quality reasoning traces from a frontier model can be worth more than millions of tokens from a smaller one.
+#### Visualizing Throughput
 To get an intuition for what these throughput numbers feel like, <FigRef target="inference-throughput" /> lets you pick a model and scale up the number of GPUs. Each page represents roughly 500 tokens of generated text. At high enough throughput, pages roll up into books (200 pages each).
 <HtmlEmbed

app/src/content/embeds/inference-throughput.html CHANGED Viewed

@@ -6,7 +6,6 @@
 <style>
   * { margin: 0; padding: 0; box-sizing: border-box; }
   body { background: #fff; font-family: system-ui, sans-serif; display: flex; flex-direction: column; align-items: center; padding: 30px; }
-  h2 { margin-bottom: 4px; }
   .subtitle { color: #666; margin-bottom: 20px; font-size: 14px; }
   .controls { display: flex; gap: 24px; align-items: flex-end; margin-bottom: 16px; flex-wrap: wrap; justify-content: center; }
   .control-group { display: flex; flex-direction: column; gap: 4px; }
@@ -23,7 +22,6 @@
 </style>
 </head>
 <body>
-<h2>GPU Page Generator</h2>
 <div class="subtitle">1 page ≈ 500 tokens ≈ 1,800 characters</div>
 <div class="controls">

 <style>
   * { margin: 0; padding: 0; box-sizing: border-box; }
   body { background: #fff; font-family: system-ui, sans-serif; display: flex; flex-direction: column; align-items: center; padding: 30px; }
   .subtitle { color: #666; margin-bottom: 20px; font-size: 14px; }
   .controls { display: flex; gap: 24px; align-items: flex-end; margin-bottom: 16px; flex-wrap: wrap; justify-content: center; }
   .control-group { display: flex; flex-direction: column; gap: 4px; }
 </style>
 </head>
 <body>
 <div class="subtitle">1 page ≈ 500 tokens ≈ 1,800 characters</div>
 <div class="controls">