joelniklaus (HF Staff) committed
Commit 1c77cab · 1 Parent(s): 62145f9

add baselines mixed with fw-edu-hq

app/src/content/assets/data/benchmark-results.csv CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:27dd686263a9217a306811036fd361d7616dc6231393f311387d1b5dd065f595
-size 1334642
+oid sha256:0359f44cbbe97ee8f7ea598152a5053a322a81af818de890606e0daa6c15fd3a
+size 1378100
app/src/content/chapters/3-experiments.mdx CHANGED
@@ -10,10 +10,17 @@ import FigRef from "../../components/FigRef.astro";
 {/* TODO: Integrate decay experiment as another analysis for proxy */}
 {/* TODO: share on a bunch of discords/slacks/hackernews/locallama */}
 {/* TODO: brainstorm better banner, be artsy */}
+{/* TODO: banner idea: 1T tokens = 8M books
+   5 cm per book = 400 km
+
+   Then you could stack the books on top of each other and show the distance on a map, for example. Or compare it with something.
+   Or make a dot for each book.
+*/}
 {/* TODO: improve the diagram for the infrastructure at the start of the section */}
 {/* TODO: final configuration for finephrase at the end of infra section: visualization of how many pages (500 tokens) (use page emojis flying from left to right) we can generate (real time), user can configure with a slider the number of GPUs */}
 {/* TODO: only explain datatrove additions when we need them (for generating the final finephrase) */}
 {/* TODO: move infrastructure section after analyses as precursor and explanation for finephrase */}
+{/* TODO: baselines mixed with fw-edu-hq usually improve upon just baselines, but not sure if/how to present this */}

 {/*
 Notes:
app/src/content/chapters/5-infrastructure.mdx CHANGED
@@ -451,6 +451,7 @@ With a trillion-parameter model you won't be generating billions of tokens per h
 Further improvement ideas:
 - add a second model below so we can compare. Suggest something cool for the numbers below.
 - Also add some animations (page turning, flapping books, bookshelves, books coming in and out)
+- Clean it up a bit to make it less cluttered
 */}

 To get an intuition for what these throughput numbers feel like, <FigRef target="inference-throughput" /> lets you pick a model and scale up the number of GPUs. Each page represents roughly 500 tokens of generated text. At high enough throughput, pages roll up into books (250 pages each), and books into bookshelves (250 books each).
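The page/book/bookshelf conversions described above (and the "1T tokens = 8M books = 400 km" banner idea in the TODO) follow from simple arithmetic. A minimal sketch, assuming the constants stated in the text (500 tokens per page, 250 pages per book, 250 books per shelf, 5 cm of spine per book); `token_breakdown` is a hypothetical helper, not part of the repo:

```python
# Constants taken from the diff's text; METERS_PER_BOOK (5 cm) is
# from the banner-idea TODO.
TOKENS_PER_PAGE = 500
PAGES_PER_BOOK = 250
BOOKS_PER_SHELF = 250
METERS_PER_BOOK = 0.05

def token_breakdown(tokens: int) -> dict:
    """Convert a token count into pages, books, shelves, and stack height."""
    pages = tokens // TOKENS_PER_PAGE
    books = pages // PAGES_PER_BOOK
    shelves = books // BOOKS_PER_SHELF
    return {
        "pages": pages,
        "books": books,
        "shelves": shelves,
        # Height of all books stacked spine-on-spine, in kilometers.
        "stack_km": books * METERS_PER_BOOK / 1000,
    }

result = token_breakdown(10**12)
# 1T tokens -> 2e9 pages, 8e6 books, 32,000 shelves, a 400 km stack,
# which matches the "1T tokens = 8M books" / "400 km" figures in the TODO.
print(result)
```

This confirms the banner math is internally consistent with the 500-tokens-per-page assumption used by the throughput figure.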