bernardo-de-almeida committed
Commit 82d9d50 · Parent: 8e984cf

feat: improve main page
Files changed (3):
  1. .gitattributes (+1 -0)
  2. assets/output_tracks.png (+3 -0)
  3. index.html (+31 -3)
.gitattributes CHANGED
@@ -35,3 +35,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 assets/paper_summary.jpg filter=lfs diff=lfs merge=lfs -text
 assets/paper_summary.png filter=lfs diff=lfs merge=lfs -text
+assets/output_tracks.png filter=lfs diff=lfs merge=lfs -text
assets/output_tracks.png ADDED

Git LFS Details

  • SHA256: 5b0ffbfc022b87213f48a9d8a67dcd542227f990f936438e9fc86d49c2fa767e
  • Pointer size: 131 Bytes
  • Size of remote file: 204 kB
index.html CHANGED
@@ -233,6 +233,11 @@
 </p>
 </div>

+<div class="paper-summary">
+<!-- <h2>📄 A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction</h2> -->
+<img src="assets/paper_summary.png" alt="NTv3 Paper Summary" />
+</div>
+
 <div class="why-ntv3">
 <h2>✨ Why NTv3?</h2>
 <ul>
@@ -348,11 +353,14 @@ print(out.logits.shape) # (B, L, V = 11)
 print(len(out.hidden_states)) # convs + transformers + deconvs
 print(len(out.attentions)) # equals transformer layers = 12
 </code></pre></div>
+<p>Model embeddings can be used for fine-tuning on downstream tasks.</p>
+
+<p style="margin-top: 40px;">TO DO: add pipeline for fine-tuning on functional tracks or genome annotation.</p>
 </div>

 <div class="card">
 <h2>💻 Use a post-trained model</h2>
-<p>Here is a quick example of how to use the post-trained NTv3 650M model on a human genomic window.</p>
+<p>Here is a quick example of how to use the post-trained NTv3 650M model to predict tracks for a human genomic window.</p>
 <div class="code"><pre><code class="language-python">from transformers import AutoConfig

 model_name = "InstaDeepAI/NTv3_650M"
@@ -375,13 +383,33 @@ out = pipe(
 print(out.bigwig_tracks_logits.shape) # functional track predictions
 print(out.bed_tracks_logits.shape) # genome annotation predictions
 print(out.mlm_logits.shape) # MLM logits: (B, L, V = 11)</code></pre></div>
+<p>Predictions can also be plotted for a subset of functional tracks and genomic elements:</p>
+<div class="code"><pre><code class="language-python">tracks_to_plot = {
+    "K562 RNA-seq": "ENCSR056HPM",
+    "K562 DNAse": "ENCSR921NMD",
+    "K562 H3k4me3": "ENCSR000DWD",
+    "K562 CTCF": "ENCSR000AKO",
+    "HepG2 RNA-seq": "ENCSR561FEE_P",
+    "HepG2 DNAse": "ENCSR000EJV",
+    "HepG2 H3k4me3": "ENCSR000AMP",
+    "HepG2 CTCF": "ENCSR000BIE",
+}
+elements_to_plot = ["protein_coding_gene", "exon", "intron", "splice_donor", "splice_acceptor"]
+
+out = pipe(
+    {"chrom": "chr19", "start": 6_700_000, "end": 6_831_072, "species": "human"},
+    plot=True,
+    tracks_to_plot=tracks_to_plot,
+    elements_to_plot=elements_to_plot,
+)</code></pre></div>
+<img src="assets/output_tracks.png" alt="Output tracks visualization" style="max-width: 100%; margin-top: 20px;" />
 </div>
 </div>

-<div class="paper-summary">
+<!-- <div class="paper-summary">
 <h2>📄 A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction</h2>
 <img src="assets/paper_summary.png" alt="NTv3 Paper Summary" />
-</div>
+</div> -->

 <p class="footer">
 © instadeep-ai — NTv3 companion Space.
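The plotting snippet added above hard-codes a tracks_to_plot mapping of "CellLine Assay" labels to ENCODE accessions, and a window spanning 6,831,072 − 6,700,000 = 131,072 bp (2**17). A minimal standalone sketch of preparing those inputs — `group_by_cell_line` and `check_window` are illustrative helpers of my own, not part of the NTv3 pipeline API, and the fixed 131,072 bp window length is an assumption taken from the example coordinates:

```python
from collections import defaultdict

WINDOW_BP = 131_072  # 2**17; span of the example window (assumed fixed)

tracks_to_plot = {
    "K562 RNA-seq": "ENCSR056HPM",
    "K562 DNAse": "ENCSR921NMD",
    "K562 H3k4me3": "ENCSR000DWD",
    "K562 CTCF": "ENCSR000AKO",
    "HepG2 RNA-seq": "ENCSR561FEE_P",
    "HepG2 DNAse": "ENCSR000EJV",
    "HepG2 H3k4me3": "ENCSR000AMP",
    "HepG2 CTCF": "ENCSR000BIE",
}

def group_by_cell_line(tracks):
    """Group "CellLine Assay" labels into {cell_line: {assay: accession}}."""
    grouped = defaultdict(dict)
    for label, accession in tracks.items():
        cell_line, assay = label.split(" ", 1)
        grouped[cell_line][assay] = accession
    return dict(grouped)

def check_window(window):
    """Validate a {"chrom", "start", "end", "species"} request dict."""
    missing = {"chrom", "start", "end", "species"} - window.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    length = window["end"] - window["start"]
    if length != WINDOW_BP:
        raise ValueError(f"window is {length} bp, expected {WINDOW_BP}")
    return window

window = check_window(
    {"chrom": "chr19", "start": 6_700_000, "end": 6_831_072, "species": "human"}
)
print(sorted(group_by_cell_line(tracks_to_plot)))  # ['HepG2', 'K562']
```

Grouping by cell line makes it easy to request matched assay panels across cell types, and the length check catches off-by-one coordinate errors before an expensive model call.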