Commit c7d398c · feat: improve main page
Parent(s): 1fb2a3c

index.html · CHANGED · +166 -29

@@ -54,9 +54,49 @@
   border-radius: var(--radius);
   box-shadow: 0 6px 18px rgba(0,0,0,0.22);
 }
+.card-stack {
+  grid-column: span 6;
+  display: flex;
+  flex-direction: column;
+  gap: 14px;
+}
+.card-stack .card {
+  grid-column: span 1;
+  margin: 0;
+}
 .card h2 { margin: 0 0 10px 0; font-size: 16px; letter-spacing: 0.01em; }
 .card ul { margin: 0; padding-left: 18px; color: var(--muted); }
 .card li { margin: 8px 0; }
+.card table {
+  width: 100%;
+  margin-top: 12px;
+  border-collapse: collapse;
+  font-size: 13px;
+}
+.card table th {
+  text-align: left;
+  padding: 10px 12px;
+  border-bottom: 2px solid var(--border);
+  color: var(--text);
+  font-weight: 600;
+  font-size: 12px;
+  text-transform: uppercase;
+  letter-spacing: 0.05em;
+}
+.card table td {
+  padding: 10px 12px;
+  border-bottom: 1px solid var(--border);
+  color: var(--muted);
+}
+.card table tr:last-child td {
+  border-bottom: none;
+}
+.card table tr:hover {
+  background: rgba(255, 255, 255, 0.02);
+}
+.card table td .checkmark {
+  color: #4ade80 !important;
+}
 a { color: var(--link); text-decoration: none; }
 a:hover { text-decoration: underline; }
 .pillrow { display: flex; gap: 8px; flex-wrap: wrap; margin-top: 8px; }
@@ -116,6 +156,30 @@
 .summary p:last-child {
   margin-bottom: 0;
 }
+.why-ntv3 {
+  margin-top: 18px;
+  padding: 24px;
+  border: 1px solid var(--border);
+  background: var(--card);
+  border-radius: var(--radius);
+  box-shadow: var(--shadow);
+}
+.why-ntv3 h2 {
+  margin: 0 0 16px 0;
+  font-size: 18px;
+  letter-spacing: 0.01em;
+}
+.why-ntv3 ul {
+  margin: 0;
+  padding-left: 0;
+  list-style: none;
+  color: var(--muted);
+}
+.why-ntv3 li {
+  margin: 12px 0;
+  padding-left: 0;
+  line-height: 1.7;
+}
 .paper-summary {
   margin-top: 12px;
   padding: 24px;
@@ -169,6 +233,17 @@
   </p>
 </div>
 
+<div class="why-ntv3">
+  <h2>✨ Why NTv3?</h2>
+  <ul>
+    <li>📏 <strong>1 Mb long context at nucleotide resolution</strong> — ~100× longer than typical genomics models.</li>
+    <li>🔗 <strong>Unified architecture</strong> for: masked language modeling, functional-track prediction, genome annotation, and sequence generation.</li>
+    <li>🌍 <strong>Cross-species generalization</strong> across 24 animals + plants with a shared conditioned representation space.</li>
+    <li>⚡ <strong>U-Net–style architecture</strong> improves stability and GPU efficiency on very long sequences.</li>
+    <li>🎯 <strong>Controllable generative modeling</strong>, enabling targeted enhancer/promoter engineering validated by experimental assays.</li>
+  </ul>
+</div>
+
 <div class="grid">
   <div class="card">
     <h2>🤖 Models (see <a href="https://huggingface.co/collections/InstaDeepAI/nucleotide-transformer-v3" target="_blank" rel="noopener">collection</a>)</h2>
@@ -187,22 +262,92 @@
       </div>
     </li>
   </ul>
+  <table>
+    <thead>
+      <tr>
+        <th>Model</th>
+        <th>Size</th>
+        <th>Pre-training</th>
+        <th>Post-training</th>
+        <th>Tasks</th>
+      </tr>
+    </thead>
+    <tbody>
+      <tr>
+        <td><strong>NTv3-8M</strong></td>
+        <td>8M params</td>
+        <td>MLM</td>
+        <td>❌</td>
+        <td>Embeddings, light inference</td>
+      </tr>
+      <tr>
+        <td><strong>NTv3-100M</strong></td>
+        <td>100M params</td>
+        <td>MLM</td>
+        <td><span class="checkmark">✅</span></td>
+        <td>Tracks, annotation</td>
+      </tr>
+      <tr>
+        <td><strong>NTv3-650M</strong></td>
+        <td>650M params</td>
+        <td>MLM</td>
+        <td><span class="checkmark">✅</span></td>
+        <td>Tracks, annotation, best accuracy</td>
+      </tr>
+    </tbody>
+  </table>
 </div>
 
-
-<
-
-<
-
-
-
-
-
-
+<div class="card-stack">
+  <div class="card">
+    <h2>📓 Notebooks (browse <a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/tree/main/notebooks" target="_blank" rel="noopener">folder</a>)</h2>
+    <ul>
+      <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/00_quickstart_inference.ipynb" target="_blank" rel="noopener">🚀 00 — Quickstart inference</a></li>
+      <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/01_tracks_prediction.ipynb" target="_blank" rel="noopener">📊 01 — Tracks prediction</a></li>
+      <li>🏷️ 02 — Genome annotation / segmentation</li>
+      <li>🎯 03 — Fine-tune on bigwig tracks</li>
+      <li>🔍 04 — Model interpretation</li>
+      <li>🧪 05 — Sequence generation</li>
+    </ul>
+  </div>
+  <div class="card">
+    <h2>🔗 Links</h2>
+    <ul>
+      <li>📄 Paper: (add link)</li>
+      <li><a href="https://github.com/instadeepai/nucleotide-transformer">💻 JAX model code (GitHub)</a></li>
+      <li><a href="https://huggingface.co/collections/InstaDeepAI/nucleotide-transformer-v3" target="_blank" rel="noopener">🎯 HF Model Collection (all NTv3 models)</a></li>
+      <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/tree/main/notebooks" target="_blank" rel="noopener">📓 All notebooks</a></li>
+      <li>🏆 NTv3 benchmark leaderboard: (add link)</li>
+    </ul>
+  </div>
 </div>
 
 <div class="card">
-<h2
+  <h2>🤖 Load a pre-trained model</h2>
+  <p>Here is an example of how to load and use a pre-trained NTv3 model.</p>
+  <div class="code"><pre><code class="language-python">from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+model_name = "InstaDeepAI/NTv3_650M_pre"
+
+# Load model and tokenizer
+model = AutoModelForMaskedLM.from_pretrained(model_name, trust_remote_code=True)
+tok = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+
+# Tokenize input sequences
+batch = tok(["ATCGNATCG", "ACGT"], add_special_tokens=False, padding=True, pad_to_multiple_of=128, return_tensors="pt")
+
+# Run model
+out = model(**batch, output_hidden_states=True, output_attentions=True)
+
+# Print output shapes
+print(out.logits.shape) # (B, L, V = 11)
+print(len(out.hidden_states)) # convs + transformers + deconvs
+print(len(out.attentions)) # equals transformer layers = 12
+</code></pre></div>
+</div>
+
+<div class="card">
+  <h2>💻 Use a post-trained model</h2>
   <p>Here is a quick example of how to use the post-trained NTv3 650M model on a human genomic window.</p>
   <div class="code"><pre><code class="language-python">from transformers import AutoConfig
 
@@ -214,29 +359,21 @@ pipe = cfg.load_tracks_pipeline(model_name, device="auto") # or "cpu"/"cuda"/"m
 
 # Run track prediction
 out = pipe(
-
-
-
-
-
-
+    {
+        "chrom": "chr19",
+        "start": 6_700_000,
+        "end": 6_831_072,
+        "species": "human"
+    }
 )
 
+# Print output shapes
 print(out.bigwig_tracks_logits.shape) # functional track predictions
 print(out.bed_tracks_logits.shape) # genome annotation predictions
 print(out.mlm_logits.shape) # MLM logits: (B, L, V = 11)</code></pre></div>
-
-
-
-<h2>🔗 Links</h2>
-<ul>
-<li>📄 Paper: (add link)</li>
-<li><a href="https://github.com/instadeepai/nucleotide-transformer">💻 JAX model code (GitHub)</a></li>
-<li>🏆 NTv3 benchmark leaderboard: (add link)</li>
-</ul>
-</div>
-</div>
-
+  </div>
+</div>
+
 <div class="paper-summary">
   <h2>📄 A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction</h2>
   <img src="assets/paper_summary.png" alt="NTv3 Paper Summary" />
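
A minimal end-to-end sketch of the post-trained example above, stitched together from the visible fragments. The configuration-loading lines sit in unchanged context that this diff does not show, so the model id "InstaDeepAI/NTv3_650M_post" and the AutoConfig.from_pretrained(...) call are assumptions; the import, the load_tracks_pipeline() call, the pipe() input, and the printed outputs are taken from the hunks.

# Sketch assembled from the snippet fragments in this commit.
# ASSUMPTIONS: the model id and the AutoConfig.from_pretrained(...) line are not
# visible in the diff; everything else mirrors the hunks above.
from transformers import AutoConfig

model_name = "InstaDeepAI/NTv3_650M_post"  # hypothetical id, not shown in the diff
cfg = AutoConfig.from_pretrained(model_name, trust_remote_code=True)  # assumed loading step
pipe = cfg.load_tracks_pipeline(model_name, device="auto")  # "cpu"/"cuda" also accepted per the hunk header

# Run track prediction on a 131,072 bp human window (coordinates from the diff)
out = pipe(
    {
        "chrom": "chr19",
        "start": 6_700_000,
        "end": 6_831_072,
        "species": "human",
    }
)

# Output shapes, as printed in the page's example
print(out.bigwig_tracks_logits.shape)  # functional track predictions
print(out.bed_tracks_logits.shape)     # genome annotation predictions
print(out.mlm_logits.shape)            # MLM logits: (B, L, V = 11)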