Pareto chart: rebuild as inline SVG, add Dataset link to banner
Replaces /img/pareto.png with a self-contained SVG built from
pareto/pareto_data.csv. Geometry mirrors the matplotlib reference
(log-scale throughput, linear win-rate %, family badges, 275×
speedup arrow, "better/faster" indicator) but the chrome is pulled
back to fit the editorial blog tone: hairline frame + tick lines,
JetBrains Mono tabular tick labels, mono-uppercase indicator
eyebrows, and plain text data labels with a paint-order halo
instead of pill boxes. Carbon points scale up + use a heavier
label per the source script's HIGHLIGHT_LOGO_SCALE.
Also adds a "Dataset" link (HuggingFaceBio/carbon-pretraining-corpus)
to the banner resources row, between Models and Tech report so the
two HF-hub resources sit together.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
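
The baked-in geometry can be checked outside the browser. Below is a minimal sketch, assuming the constants quoted in the data-point comment in demo.html (left edge at log10 = 2.0499, x range 3.4452 decades wide, win-rate range of -12..108) and the plot box of the `pareto-bg` rect. The helper names `to_svg` and `logo_origin` are hypothetical and do not come from the source script.

```python
import math
from typing import Tuple

# Plot box taken from <rect class="pareto-bg" x="100" y="30" width="870" height="470"/>.
PLOT_LEFT, PLOT_TOP, PLOT_W, PLOT_H = 100.0, 30.0, 870.0, 470.0

# Axis ranges quoted in the data-point comment in demo.html.
LOG_X_MIN = 2.0499           # log10 of the left edge, in bp/s
LOG_X_SPAN = 3.4452          # width of the x range, in decades
Y_MIN, Y_MAX = -12.0, 108.0  # padded win-rate range, in percent


def to_svg(throughput_bps: float, win_rate_pct: float) -> Tuple[float, float]:
    """Map (throughput, win rate) to SVG user-space coordinates (hypothetical helper)."""
    x = PLOT_LEFT + (math.log10(throughput_bps) - LOG_X_MIN) / LOG_X_SPAN * PLOT_W
    # SVG y grows downward, so measure up from the bottom edge of the plot box.
    y = (PLOT_TOP + PLOT_H) - (win_rate_pct - Y_MIN) * PLOT_H / (Y_MAX - Y_MIN)
    return x, y


def logo_origin(cx: float, cy: float, size: float = 32.0) -> Tuple[float, float]:
    """Top-left corner of an <image> of the given size centered on (cx, cy)."""
    return cx - size / 2, cy - size / 2


if __name__ == "__main__":
    # Evo2 7B, using the values from its comment in the diff: 453.8 bp/s, 64.29%.
    cx, cy = to_svg(453.8, 64.29)
    print(round(cx, 1), round(cy, 1), logo_origin(cx, cy))
```

For Evo2 7B this lands within a rounding step of the `x="237.3" y="185.2"` image origin and the `x="253.3"` label position in the diff. The 470 / 120 pixel-per-percent ratio is the 3.9167 factor from the comment.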
- assets/styles/section-intro.css +145 -0
- demo.html +161 -3
- img/arc.webp +0 -0
- img/generator.webp +0 -0
|
@@ -444,3 +444,148 @@
|
|
| 444 |
@media (max-width: 720px) {
|
| 445 |
.cd-mols { grid-template-columns: repeat(2, 1fr); }
|
| 446 |
}
|
| 444 |
@media (max-width: 720px) {
|
| 445 |
.cd-mols { grid-template-columns: repeat(2, 1fr); }
|
| 446 |
}
|
| 447 |
+
|
| 448 |
+
/* ------------------------------------------------------------------ */
|
| 449 |
+
/* §0 release lede · native Pareto chart. */
|
| 450 |
+
/* Replaces /img/pareto.png with an inline SVG built from */
|
| 451 |
+
/* pareto/pareto_data.csv. Geometry mirrors the matplotlib reference, */
|
| 452 |
+
/* but the chrome is pulled back to fit the editorial blog tone: */
|
| 453 |
+
/* hairline frame + tick lines instead of the 3px black box, mono */
|
| 454 |
+
/* tabular tick labels, plain text data labels with a paper-coloured */
|
| 455 |
+
/* paint-order halo (no pill box around each marker), and the */
|
| 456 |
+
/* "better/faster" indicator styled as a small mono uppercase eyebrow */
|
| 457 |
+
/* the same way as the section labels elsewhere on the page. Carbon */
|
| 458 |
+
/* points still scale up + use a bolder label so the eye lands on */
|
| 459 |
+
/* them first. */
|
| 460 |
+
/* ------------------------------------------------------------------ */
|
| 461 |
+
.tab-lede__figure--pareto {
|
| 462 |
+
/* Wider than the default tab-lede__figure so the long x-axis
|
| 463 |
+
(decade ticks 200 → 200k) doesn't squash the right-edge labels. */
|
| 464 |
+
max-width: 760px;
|
| 465 |
+
}
|
| 466 |
+
.pareto-chart {
|
| 467 |
+
display: block;
|
| 468 |
+
width: 100%;
|
| 469 |
+
height: auto;
|
| 470 |
+
background: #ffffff;
|
| 471 |
+
border: 1px solid #cfcdbf;
|
| 472 |
+
}
|
| 473 |
+
.pareto-bg {
|
| 474 |
+
fill: #ffffff;
|
| 475 |
+
}
|
| 476 |
+
|
| 477 |
+
/* Hairline frame at the same weight as the rest of the demo's
|
| 478 |
+
section borders — the chart reads as another paper card rather
|
| 479 |
+
than a heavy matplotlib export. */
|
| 480 |
+
.pareto-frame {
|
| 481 |
+
fill: none;
|
| 482 |
+
stroke: #cfcdbf;
|
| 483 |
+
stroke-width: 1;
|
| 484 |
+
}
|
| 485 |
+
|
| 486 |
+
/* Tick marks at the same hairline weight. Tick labels in JetBrains
|
| 487 |
+
Mono with tabular nums so the decade ticks line up tabularly and
|
| 488 |
+
the chart picks up the page's technical-mono register. Dimmed so
|
| 489 |
+
they read as scale references, not primary content. */
|
| 490 |
+
.pareto-axis line {
|
| 491 |
+
stroke: #cfcdbf;
|
| 492 |
+
stroke-width: 1;
|
| 493 |
+
}
|
| 494 |
+
.pareto-axis text {
|
| 495 |
+
font-family: "JetBrains Mono", ui-monospace, monospace;
|
| 496 |
+
font-size: 13px;
|
| 497 |
+
fill: var(--ink-soft);
|
| 498 |
+
font-feature-settings: "tnum";
|
| 499 |
+
}
|
| 500 |
+
.pareto-axis--y text {
|
| 501 |
+
text-anchor: end;
|
| 502 |
+
dominant-baseline: middle;
|
| 503 |
+
}
|
| 504 |
+
.pareto-axis--x text {
|
| 505 |
+
text-anchor: middle;
|
| 506 |
+
dominant-baseline: hanging;
|
| 507 |
+
}
|
| 508 |
+
|
| 509 |
+
/* Axis titles in Inter to match the page body; italic subtitle under
|
| 510 |
+
"Throughput" carries the units in the muted ink-soft tone. */
|
| 511 |
+
.pareto-axis-title {
|
| 512 |
+
font-family: "Inter", "Helvetica Neue", sans-serif;
|
| 513 |
+
font-size: 18px;
|
| 514 |
+
font-weight: 600;
|
| 515 |
+
fill: var(--ink);
|
| 516 |
+
text-anchor: middle;
|
| 517 |
+
}
|
| 518 |
+
.pareto-axis-subtitle {
|
| 519 |
+
font-family: "Inter", "Helvetica Neue", sans-serif;
|
| 520 |
+
font-size: 13px;
|
| 521 |
+
font-style: italic;
|
| 522 |
+
fill: var(--ink-soft);
|
| 523 |
+
text-anchor: middle;
|
| 524 |
+
}
|
| 525 |
+
|
| 526 |
+
/* "Better/faster" axes-of-improvement indicator in the lower-left.
|
| 527 |
+
Arrows in muted ink, labels in the same mono-uppercase eyebrow
|
| 528 |
+
style as the section labels (banner-links, section-num, etc.)
|
| 529 |
+
so the chart's chrome doesn't read as a foreign matplotlib glyph. */
|
| 530 |
+
.pareto-indicator line {
|
| 531 |
+
stroke: var(--ink-faint);
|
| 532 |
+
stroke-width: 1.5;
|
| 533 |
+
stroke-linecap: round;
|
| 534 |
+
}
|
| 535 |
+
.pareto-indicator polygon {
|
| 536 |
+
fill: var(--ink-faint);
|
| 537 |
+
}
|
| 538 |
+
.pareto-indicator-text {
|
| 539 |
+
font-family: "JetBrains Mono", ui-monospace, monospace;
|
| 540 |
+
font-size: 10px;
|
| 541 |
+
font-weight: 500;
|
| 542 |
+
letter-spacing: 0.14em;
|
| 543 |
+
text-transform: uppercase;
|
| 544 |
+
fill: var(--ink-faint);
|
| 545 |
+
text-anchor: middle;
|
| 546 |
+
dominant-baseline: middle;
|
| 547 |
+
}
|
| 548 |
+
|
| 549 |
+
/* 275× speedup arrow — the editorial headline. Solid ink, slightly
|
| 550 |
+
thinner than before so it doesn't overpower the chart. The label
|
| 551 |
+
gets a paper-coloured paint-order halo so it reads cleanly where
|
| 552 |
+
it crosses the arrow line behind it. */
|
| 553 |
+
.pareto-speedup line {
|
| 554 |
+
stroke: var(--ink);
|
| 555 |
+
stroke-width: 2.5;
|
| 556 |
+
stroke-linecap: round;
|
| 557 |
+
}
|
| 558 |
+
.pareto-speedup polygon {
|
| 559 |
+
fill: var(--ink);
|
| 560 |
+
}
|
| 561 |
+
.pareto-speedup-label {
|
| 562 |
+
font-family: "Inter", "Helvetica Neue", sans-serif;
|
| 563 |
+
font-size: 26px;
|
| 564 |
+
font-weight: 700;
|
| 565 |
+
fill: var(--ink);
|
| 566 |
+
text-anchor: middle;
|
| 567 |
+
paint-order: stroke;
|
| 568 |
+
stroke: #ffffff;
|
| 569 |
+
stroke-width: 6px;
|
| 570 |
+
stroke-linejoin: round;
|
| 571 |
+
}
|
| 572 |
+
|
| 573 |
+
/* Data labels: plain text, no pill box. The paint-order stroke acts
|
| 574 |
+
as a paper-coloured halo so the text always reads cleanly — even
|
| 575 |
+
when it sits next to a logo or crosses a tick line. Carbon labels
|
| 576 |
+
step up in size + weight so the highlighted models still pop. */
|
| 577 |
+
.pareto-label {
|
| 578 |
+
font-family: "Inter", "Helvetica Neue", sans-serif;
|
| 579 |
+
font-size: 13px;
|
| 580 |
+
fill: var(--ink);
|
| 581 |
+
text-anchor: middle;
|
| 582 |
+
dominant-baseline: middle;
|
| 583 |
+
paint-order: stroke;
|
| 584 |
+
stroke: #ffffff;
|
| 585 |
+
stroke-width: 4px;
|
| 586 |
+
stroke-linejoin: round;
|
| 587 |
+
}
|
| 588 |
+
.pareto-point--highlight .pareto-label {
|
| 589 |
+
font-size: 15px;
|
| 590 |
+
font-weight: 600;
|
| 591 |
+
}
|
|
@@ -199,6 +199,11 @@
|
|
| 199 |
Models<span class="arrow" aria-hidden="true">↗</span>
|
| 200 |
</a>
|
| 201 |
</li>
|
| 202 |
<li>
|
| 203 |
<a href="#" target="_blank" rel="noopener">
|
| 204 |
Tech report<span class="arrow" aria-hidden="true">↗</span>
|
|
@@ -275,8 +280,161 @@
|
|
| 275 |
shipping with the full training code, the data pipeline, and the model weights.
|
| 276 |
Everything is open source on the Hugging Face Hub.
|
| 277 |
</p>
|
| 278 |
-
<figure
|
| 279 |
-
|
| 280 |
<figcaption>Throughput (base pairs per second, log scale) vs win rate across open DNA foundation models. Carbon 3B matches Evo2 7B's win rate at roughly 275× the throughput.</figcaption>
|
| 281 |
</figure>
|
| 282 |
</div>
|
|
@@ -486,7 +644,7 @@
|
|
| 486 |
<div class="section--two-col intro-subsection">
|
| 487 |
<div class="section-narrative">
|
| 488 |
<div class="section-num">§6 · Applications</div>
|
| 489 |
-
<div class="section-title">
|
| 490 |
<p class="lede">
|
| 491 |
A model that understands and writes DNA is useful wherever DNA is the
|
| 492 |
input or the output. There are three interesting use-cases for such
|
|
|
|
| 199 |
Models<span class="arrow" aria-hidden="true">↗</span>
|
| 200 |
</a>
|
| 201 |
</li>
|
| 202 |
+
<li>
|
| 203 |
+
<a href="https://huggingface.co/datasets/HuggingFaceBio/carbon-pretraining-corpus" target="_blank" rel="noopener">
|
| 204 |
+
Dataset<span class="arrow" aria-hidden="true">↗</span>
|
| 205 |
+
</a>
|
| 206 |
+
</li>
|
| 207 |
<li>
|
| 208 |
<a href="#" target="_blank" rel="noopener">
|
| 209 |
Tech report<span class="arrow" aria-hidden="true">↗</span>
|
|
|
|
| 280 |
shipping with the full training code, the data pipeline, and the model weights.
|
| 281 |
Everything is open source on the Hugging Face Hub.
|
| 282 |
</p>
|
| 283 |
+
<!-- Pareto chart, drawn natively as inline SVG so the figure scales
|
| 284 |
+
sharply, picks up the page's typography, and can be tuned in
|
| 285 |
+
CSS without a matplotlib re-export. Source data lives in
|
| 286 |
+
pareto/pareto_data.csv; geometry mirrors the matplotlib
|
| 287 |
+
reference (scratch/plot_pareto_winrate_throughput_8b_32k_hf.py):
|
| 288 |
+
log-scale throughput on x, linear win-rate % on y, family
|
| 289 |
+
badges sitting on each data point with a plain text label
|
| 290 |
+
below. Chrome is pulled back to match the editorial blog
|
| 291 |
+
tone — hairline frame + tick lines, mono tabular tick
|
| 292 |
+
labels, mono-uppercase "better/faster" eyebrow indicator —
|
| 293 |
+
and the data labels use a paint-order halo (see
|
| 294 |
+
.pareto-label in section-intro.css) instead of pill boxes.
|
| 295 |
+
Carbon points scale up + use a heavier label per the source
|
| 296 |
+
script's HIGHLIGHT_LOGO_SCALE so the eye lands on them. -->
|
| 297 |
+
<figure class="tab-lede__figure tab-lede__figure--pareto">
|
| 298 |
+
<svg
|
| 299 |
+
class="pareto-chart"
|
| 300 |
+
viewBox="0 0 1000 600"
|
| 301 |
+
xmlns="http://www.w3.org/2000/svg"
|
| 302 |
+
role="img"
|
| 303 |
+
aria-labelledby="pareto-title pareto-desc"
|
| 304 |
+
>
|
| 305 |
+
<title id="pareto-title">Throughput vs win rate across open DNA foundation models</title>
|
| 306 |
+
<desc id="pareto-desc">Log-scale throughput in base pairs per second on the x-axis and win-rate percentage on the y-axis. Carbon 3B reaches roughly 275 times the throughput of Arc Evo2 7B at a comparable win rate (59.5% vs 64.3%), and Carbon 8B roughly 170 times the throughput at a higher win rate (78.6%).</desc>
|
| 307 |
+
|
| 308 |
+
<!-- Plot interior. -->
|
| 309 |
+
<rect class="pareto-bg" x="100" y="30" width="870" height="470"/>
|
| 310 |
+
|
| 311 |
+
<!-- Y axis: linear win-rate %, ticks at 0/20/40/60/80/100. The
|
| 312 |
+
plot range runs −12..108 (matches matplotlib padding) so
|
| 313 |
+
the data points have headroom above 100 and below 0 for
|
| 314 |
+
labels; only the canonical 0..100 ticks are drawn. -->
|
| 315 |
+
<g class="pareto-axis pareto-axis--y">
|
| 316 |
+
<line x1="94" y1="61.3" x2="100" y2="61.3"/>
|
| 317 |
+
<line x1="94" y1="139.7" x2="100" y2="139.7"/>
|
| 318 |
+
<line x1="94" y1="218.0" x2="100" y2="218.0"/>
|
| 319 |
+
<line x1="94" y1="296.3" x2="100" y2="296.3"/>
|
| 320 |
+
<line x1="94" y1="374.7" x2="100" y2="374.7"/>
|
| 321 |
+
<line x1="94" y1="453.0" x2="100" y2="453.0"/>
|
| 322 |
+
<text x="86" y="61.3">100</text>
|
| 323 |
+
<text x="86" y="139.7">80</text>
|
| 324 |
+
<text x="86" y="218.0">60</text>
|
| 325 |
+
<text x="86" y="296.3">40</text>
|
| 326 |
+
<text x="86" y="374.7">20</text>
|
| 327 |
+
<text x="86" y="453.0">0</text>
|
| 328 |
+
</g>
|
| 329 |
+
|
| 330 |
+
<!-- X axis: log10 base pairs/s. x-range chosen to mirror the
|
| 331 |
+
matplotlib auto-padding (left_pad/right_pad in the source);
|
| 332 |
+
ticks drop at decade + half-decade boundaries that fall
|
| 333 |
+
inside the range. -->
|
| 334 |
+
<g class="pareto-axis pareto-axis--x">
|
| 335 |
+
<line x1="163.4" y1="500" x2="163.4" y2="506"/>
|
| 336 |
+
<line x1="263.9" y1="500" x2="263.9" y2="506"/>
|
| 337 |
+
<line x1="339.9" y1="500" x2="339.9" y2="506"/>
|
| 338 |
+
<line x1="415.9" y1="500" x2="415.9" y2="506"/>
|
| 339 |
+
<line x1="516.4" y1="500" x2="516.4" y2="506"/>
|
| 340 |
+
<line x1="592.4" y1="500" x2="592.4" y2="506"/>
|
| 341 |
+
<line x1="668.5" y1="500" x2="668.5" y2="506"/>
|
| 342 |
+
<line x1="768.9" y1="500" x2="768.9" y2="506"/>
|
| 343 |
+
<line x1="844.9" y1="500" x2="844.9" y2="506"/>
|
| 344 |
+
<line x1="920.9" y1="500" x2="920.9" y2="506"/>
|
| 345 |
+
<text x="163.4" y="520">200</text>
|
| 346 |
+
<text x="263.9" y="520">500</text>
|
| 347 |
+
<text x="339.9" y="520">1k</text>
|
| 348 |
+
<text x="415.9" y="520">2k</text>
|
| 349 |
+
<text x="516.4" y="520">5k</text>
|
| 350 |
+
<text x="592.4" y="520">10k</text>
|
| 351 |
+
<text x="668.5" y="520">20k</text>
|
| 352 |
+
<text x="768.9" y="520">50k</text>
|
| 353 |
+
<text x="844.9" y="520">100k</text>
|
| 354 |
+
<text x="920.9" y="520">200k</text>
|
| 355 |
+
</g>
|
| 356 |
+
|
| 357 |
+
<!-- Plot frame drawn after the axis grid so the hairline
|
| 358 |
+
border sits cleanly on top of the tick lines. -->
|
| 359 |
+
<rect class="pareto-frame" x="100" y="30" width="870" height="470"/>
|
| 360 |
+
|
| 361 |
+
<!-- Axes-of-improvement indicator: a small ⌐ of grey arrows in
|
| 362 |
+
the lower-left labelled "better"/"faster", same as the
|
| 363 |
+
matplotlib reference. Placed at the 0-winrate gridline,
|
| 364 |
+
just inside the y-axis. -->
|
| 365 |
+
<g class="pareto-indicator" transform="translate(170 450)">
|
| 366 |
+
<line x1="0" y1="0" x2="0" y2="-70"/>
|
| 367 |
+
<polygon points="0,-78 -7,-66 7,-66"/>
|
| 368 |
+
<text class="pareto-indicator-text" transform="translate(-14 -35) rotate(-90)">better</text>
|
| 369 |
+
<line x1="0" y1="0" x2="70" y2="0"/>
|
| 370 |
+
<polygon points="78,0 66,-7 66,7"/>
|
| 371 |
+
<text class="pareto-indicator-text" x="35" y="20">faster</text>
|
| 372 |
+
</g>
|
| 373 |
+
|
| 374 |
+
<!-- 275× speedup arrow: starts just right of the Evo2 7B label
|
| 375 |
+
and lands just left of the Carbon 3B logo. y placed
|
| 376 |
+
between the two points (Evo2 7B at 64.3%, Carbon 3B at
|
| 377 |
+
59.5%) so it reads as level with both. -->
|
| 378 |
+
<g class="pareto-speedup">
|
| 379 |
+
<line x1="290" y1="215" x2="822" y2="215"/>
|
| 380 |
+
<polygon points="836,215 820,206 820,224"/>
|
| 381 |
+
<text class="pareto-speedup-label" x="556" y="200">275×</text>
|
| 382 |
+
</g>
|
| 383 |
+
|
| 384 |
+
<!-- Data points. Coordinates baked in from pareto_data.csv:
|
| 385 |
+
x = 100 + (log10(T) − 2.0499) / 3.4452 × 870
|
| 386 |
+
y = 500 − (win_rate + 12) × 3.9167
|
| 387 |
+
Logos sit centered on each point (32×32 for non-highlight,
|
| 388 |
+
43×43 for Carbon). Labels are pinned below the logo. -->
|
| 389 |
+
|
| 390 |
+
<!-- Evo2 20B · 177.5 bp/s, 95.24% -->
|
| 391 |
+
<g class="pareto-point">
|
| 392 |
+
<image href="/img/arc.webp" x="134.3" y="64.0" width="32" height="32"/>
|
| 393 |
+
<text class="pareto-label" x="150.3" y="110">Evo2 20B</text>
|
| 394 |
+
</g>
|
| 395 |
+
|
| 396 |
+
<!-- Evo2 7B · 453.8 bp/s, 64.29% -->
|
| 397 |
+
<g class="pareto-point">
|
| 398 |
+
<image href="/img/arc.webp" x="237.3" y="185.2" width="32" height="32"/>
|
| 399 |
+
<text class="pareto-label" x="253.3" y="231">Evo2 7B</text>
|
| 400 |
+
</g>
|
| 401 |
+
|
| 402 |
+
<!-- Evo2 1B · 1342.5 bp/s, 2.38% -->
|
| 403 |
+
<g class="pareto-point">
|
| 404 |
+
<image href="/img/arc.webp" x="356.2" y="427.7" width="32" height="32"/>
|
| 405 |
+
<text class="pareto-label" x="372.2" y="473">Evo2 1B</text>
|
| 406 |
+
</g>
|
| 407 |
+
|
| 408 |
+
<!-- GENERator-v2 3B · 98494.4 bp/s, 35.71% -->
|
| 409 |
+
<g class="pareto-point">
|
| 410 |
+
<image href="/img/generator.webp" x="828.7" y="297.1" width="32" height="32"/>
|
| 411 |
+
<text class="pareto-label" x="844.7" y="343">GENERator-v2 3B</text>
|
| 412 |
+
</g>
|
| 413 |
+
|
| 414 |
+
<!-- GENERator-v2 1.2B · 123219.2 bp/s, 14.29% -->
|
| 415 |
+
<g class="pareto-point">
|
| 416 |
+
<image href="/img/generator.webp" x="853.3" y="381.0" width="32" height="32"/>
|
| 417 |
+
<text class="pareto-label" x="869.3" y="427">GENERator-v2 1.2B</text>
|
| 418 |
+
</g>
|
| 419 |
+
|
| 420 |
+
<!-- Carbon 8B · 76582.7 bp/s, 78.57% (highlighted) -->
|
| 421 |
+
<g class="pareto-point pareto-point--highlight">
|
| 422 |
+
<image href="/img/logo.svg" x="795.6" y="123.7" width="43" height="43"/>
|
| 423 |
+
<text class="pareto-label" x="817.1" y="180">Carbon 8B</text>
|
| 424 |
+
</g>
|
| 425 |
+
|
| 426 |
+
<!-- Carbon 3B · 125130.8 bp/s, 59.52% (highlighted) -->
|
| 427 |
+
<g class="pareto-point pareto-point--highlight">
|
| 428 |
+
<image href="/img/logo.svg" x="849.5" y="198.3" width="43" height="43"/>
|
| 429 |
+
<text class="pareto-label" x="871.0" y="255">Carbon 3B</text>
|
| 430 |
+
</g>
|
| 431 |
+
|
| 432 |
+
<!-- Axis titles. Y title rotated -90 along the left margin, X
|
| 433 |
+
title + italic "Base pairs per second" subtitle below. -->
|
| 434 |
+
<text class="pareto-axis-title" transform="translate(34 265) rotate(-90)">Win rate (%)</text>
|
| 435 |
+
<text class="pareto-axis-title" x="535" y="558">Throughput</text>
|
| 436 |
+
<text class="pareto-axis-subtitle" x="535" y="582">Base pairs per second</text>
|
| 437 |
+
</svg>
|
| 438 |
<figcaption>Throughput (base pairs per second, log scale) vs win rate across open DNA foundation models. Carbon 3B matches Evo2 7B's win rate at roughly 275× the throughput.</figcaption>
|
| 439 |
</figure>
|
| 440 |
</div>
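
The 275× headline can be re-derived from the throughput values quoted in the data-point comments above. This is a small sketch of that arithmetic only; the dictionary and the `speedup` helper are illustrative names, and the numbers are copied from the comments rather than read from pareto/pareto_data.csv.

```python
# Throughput values (bp/s) copied from the data-point comments in demo.html.
THROUGHPUT_BPS = {
    "Evo2 7B": 453.8,
    "Carbon 8B": 76582.7,
    "Carbon 3B": 125130.8,
}


def speedup(model: str, baseline: str = "Evo2 7B") -> float:
    """Throughput ratio of `model` over `baseline` (hypothetical helper)."""
    return THROUGHPUT_BPS[model] / THROUGHPUT_BPS[baseline]


if __name__ == "__main__":
    for name in ("Carbon 3B", "Carbon 8B"):
        print(f"{name}: {speedup(name):.1f}x vs Evo2 7B")
```

On these numbers the 275× figure corresponds to Carbon 3B against Evo2 7B; Carbon 8B comes out lower, at a bit under 170×.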
|
|
|
|
| 644 |
<div class="section--two-col intro-subsection">
|
| 645 |
<div class="section-narrative">
|
| 646 |
<div class="section-num">§6 · Applications</div>
|
| 647 |
+
<div class="section-title">What can the model do in the real world?</div>
|
| 648 |
<p class="lede">
|
| 649 |
A model that understands and writes DNA is useful wherever DNA is the
|
| 650 |
input or the output. There are three interesting use-cases for such
|
|
|
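
The x-axis tick positions in the diff are hard-coded. A sketch of how they could be regenerated, assuming a 1-2-5 tick series clipped to the x range from the data-point comment (log10 from 2.0499 to 2.0499 + 3.4452); the function names are hypothetical and the source script may pick its ticks differently.

```python
import math
from typing import List

PLOT_LEFT, PLOT_W = 100.0, 870.0          # plot box from the pareto-bg rect
LOG_X_MIN, LOG_X_SPAN = 2.0499, 3.4452    # x range quoted in the diff comment


def tick_values(log_min: float = LOG_X_MIN, log_span: float = LOG_X_SPAN) -> List[int]:
    """Return the 1-2-5 series values that fall inside the axis range."""
    lo, hi = 10 ** log_min, 10 ** (log_min + log_span)
    ticks = []
    for exp in range(math.floor(log_min), math.ceil(log_min + log_span) + 1):
        for mantissa in (1, 2, 5):
            value = mantissa * 10 ** exp
            if lo <= value <= hi:
                ticks.append(value)
    return ticks


def tick_label(value: int) -> str:
    """Format 1000 and above with a 'k' suffix, matching the labels in the SVG."""
    return f"{value // 1000}k" if value >= 1000 else str(value)


def tick_x(value: float) -> float:
    """SVG x coordinate of a tick on the log-scale axis."""
    return PLOT_LEFT + (math.log10(value) - LOG_X_MIN) / LOG_X_SPAN * PLOT_W


if __name__ == "__main__":
    for v in tick_values():
        print(tick_label(v), round(tick_x(v), 1))
```

With these constants the series runs from 200 to 200k in ten steps, the same labels as the `pareto-axis--x` group.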