lewtun HF Staff committed on
Commit 2d8bb5d · verified · 1 Parent(s): 30c96f0

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -28,6 +28,10 @@ A generative DNA foundation model from the **Carbon** family.
 
  ## Model Summary
 
+ <p align="center">
+ <img src="figures/pareto.png" alt="Pareto plot" width="800">
+ </p>
+
  **Carbon-3B** is a 3B-parameter decoder-only autoregressive genomic foundation model trained on DNA and RNA sequences, with a primary focus on eukaryotes. It has a native context length of **32,768 6-mer tokens (≈ 197k DNA base pairs)** and extends to **65,536 tokens (≈ 393 kbp)** at inference time via YaRN. Carbon-3B is designed to be both strong and efficient: on generative tasks (sequence recovery), variant-effect prediction, and motif-perturbation discrimination, it matches the capability of substantially larger single-nucleotide baselines such as Evo2-7B while running several times faster.
 
  Carbon-3B is the **flagship** model of the Carbon family. We also release [**Carbon-8B**](https://huggingface.co/HuggingFaceBio/) for users who need additional capability at higher inference cost, and [**Carbon-500M**](https://huggingface.co/HuggingFaceBio/) — a small generative model intended for speculative decoding alongside Carbon-3B (or Carbon-8B).
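
Reviewer note: the context-length figures in the README text can be sanity-checked with simple arithmetic. The sketch below is not part of the commit; it assumes each token covers six non-overlapping bases (the README says "6-mer tokens" but does not spell out the tokenization stride), and the helper name `tokens_to_base_pairs` is hypothetical.

```python
# Assumption: tokens are non-overlapping 6-mers, so one token spans 6 bases.
BASES_PER_TOKEN = 6


def tokens_to_base_pairs(num_tokens: int, k: int = BASES_PER_TOKEN) -> int:
    """Return the number of base pairs covered by num_tokens non-overlapping k-mers."""
    return num_tokens * k


native_bp = tokens_to_base_pairs(32_768)    # native context length
extended_bp = tokens_to_base_pairs(65_536)  # context length with YaRN at inference

print(native_bp, extended_bp)
```

Under that assumption the products are 196,608 and 393,216 base pairs, consistent with the rounded "≈ 197k" and "≈ 393 kbp" figures quoted in the diff.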