bernardo-de-almeida committed on
Commit 10addeb · 1 Parent(s): 7d1c75c

Initial NTv3 Space structure and README

Files changed (6)
  1. README.md +76 -6
  2. assets/.gitkeep +0 -0
  3. assets/instadeep_logo.png +0 -0
  4. index.html +160 -17
  5. notebooks/.gitkeep +0 -0
  6. src/.gitkeep +0 -0
README.md CHANGED
@@ -1,11 +1,81 @@
  ---
- title: Ntv3
- emoji: 🔥
- colorFrom: blue
- colorTo: purple
  sdk: static
  pinned: false
- short_description: Nucleotide Transformer v3 models for genomics
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: NTv3 — PyTorch notebooks
+ emoji: 🧬
+ colorFrom: indigo
+ colorTo: blue
  sdk: static
  pinned: false
  ---

+ # NTv3 PyTorch notebooks
+
+ This Space is the companion hub for NTv3 checkpoints on the Hugging Face Hub. It provides PyTorch notebooks and minimal examples for inference, sequence-to-function prediction (functional tracks), genome annotation, fine-tuning, model interpretation and sequence generation.
+
+ ## Notebooks
+
+ Notebooks live in `./notebooks/`:
+
+ - `00_quickstart_inference.ipynb` — load a checkpoint + run inference
+ - `01_tracks_prediction.ipynb` — sequence → functional tracks (+ plotting)
+ - `02_genome_annotation_segmentation.ipynb` — sequence → annotation
+ - `03_finetune_head.ipynb` — fine-tune on a bigwig track
+ - `04_model_interpretation.ipynb` — interpretation of post-trained model
+ - `05_sequence_generation.ipynb` — fine-tune NTv3 to generate enhancer sequences
+
+ ## Install
+
+ ```bash
+ pip install torch transformers accelerate safetensors huggingface_hub numpy
+ ```
+
+ ## Load a model
+
+ ```python
+
+
+ ```
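The `## Load a model` block is still empty in this commit. A minimal sketch of what it might contain, mirroring the `AutoModel`/`AutoTokenizer` snippet that this same commit adds to `index.html`. The helper name `load_ntv3` and the `DEFAULT_REPO` constant are hypothetical, and it is an assumption that every listed checkpoint loads this way with `trust_remote_code=True`.

```python
# Hypothetical helper for illustration; it is not part of this commit.
DEFAULT_REPO = "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb"


def load_ntv3(repo_id: str = DEFAULT_REPO):
    """Return (model, tokenizer) for an NTv3 checkpoint on the Hub."""
    # Imported lazily so this module can be imported without transformers
    # installed; the import only runs when a checkpoint is requested.
    from transformers import AutoModel, AutoTokenizer

    # trust_remote_code=True lets Transformers execute the modeling code
    # shipped inside the model repository, so review that code first.
    model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    return model, tokenizer
```

Calling `load_ntv3()` downloads the weights on first use, so it needs network access and enough disk space for the chosen checkpoint.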
+
+ ## Pipelines (TODO)
+
+ ```python
+ from transformers import pipeline
+ import torch
+
+ pipe = pipeline(
+     task="ntv3-tracks",
+     model="InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
+     trust_remote_code=True,
+     device="cuda",
+     torch_dtype=torch.bfloat16,
+ )
+
+ out = pipe("ACGT...")
+ ```
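The pipeline call above takes a raw DNA string. Note that `ntv3-tracks` is not a built-in Transformers task, so the call only works if the model repository registers it as a custom pipeline. A rough sketch of cleaning a sequence before passing it in follows. Everything here is an assumption rather than documented NTv3 behaviour: the helper `prepare_sequence` is hypothetical, the padding multiple of 128 is only a guess from the `7downsample` part of the checkpoint names (2**7), and padding with `N` assumes the tokenizer accepts that symbol.

```python
import re

# Assumption: "7downsample" means 7 halving stages, so lengths are padded to a
# multiple of 2**7 = 128. Check the model card before relying on this.
DOWNSAMPLE_STAGES = 7
MULTIPLE = 2 ** DOWNSAMPLE_STAGES

# Assumption: only A, C, G, T and the ambiguity code N are accepted.
_VALID = re.compile(r"[ACGTN]*")


def prepare_sequence(seq: str, multiple: int = MULTIPLE, pad: str = "N") -> str:
    """Upper-case a DNA string, validate it, and right-pad it with `pad`."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    if not _VALID.fullmatch(seq):
        raise ValueError("sequence contains characters outside ACGTN")
    remainder = len(seq) % multiple
    if remainder:
        # Pad on the right up to the next multiple of `multiple`.
        seq += pad * (multiple - remainder)
    return seq
```

A call such as `pipe(prepare_sequence(raw_sequence))` would then replace the literal string in the snippet above.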
+
+ ## Checkpoints
+
+ **Pre-trained:** `InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb`, `InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb`, `InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb`
+
+ **Post-trained:** `InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb`, `InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb`
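The five checkpoint names above follow a visible pattern. A small sketch of splitting a repository ID into its parts, for example to pick a model by size or training stage. The `Checkpoint` type and `parse_checkpoint` function are hypothetical, and the meaning attached to each field (parameter count, number of downsampling stages, training stage) is inferred from the names, not stated in the README.

```python
import re
from typing import NamedTuple


class Checkpoint(NamedTuple):
    size: str        # e.g. "106M" (assumed to be the parameter count)
    downsample: int  # e.g. 7
    stage: str       # "pretrained" or "post_trained"
    suffix: str      # remaining tag, e.g. "le_1mb" or "1mb"


_PATTERN = re.compile(
    r"^InstaDeepAI/ntv3_(?P<size>\d+M)_(?P<down>\d+)downsample_"
    r"(?P<stage>pretrained|post_trained)_(?P<suffix>.+)$"
)

# Repository IDs exactly as listed in the README.
CHECKPOINTS = [
    "InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb",
    "InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb",
    "InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb",
    "InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb",
    "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
]


def parse_checkpoint(repo_id: str) -> Checkpoint:
    """Split an NTv3 repository ID into its name components."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognised NTv3 checkpoint name: {repo_id!r}")
    return Checkpoint(
        match["size"], int(match["down"]), match["stage"], match["suffix"]
    )


# Example: keep only the post-trained checkpoints.
post_trained = [c for c in CHECKPOINTS if parse_checkpoint(c).stage == "post_trained"]
```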
+
+ ## Links
+
+ - **Paper:** (add link)
+ - **JAX research code (GitHub):** [https://github.com/instadeepai/nucleotide-transformer](https://github.com/instadeepai/nucleotide-transformer)
+
+ ## Citation
+
+ ```bibtex
+ @article{ntv3,
+   title = {A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction},
+   author = {…},
+   journal = {…},
+   year = {…}
+ }
+ ```
+
+ ## License
+
+ **Code & notebooks in this Space:** (choose and add, e.g., Apache-2.0)
+
+ **Model weights:** see the license specified in each model repository
assets/.gitkeep ADDED
File without changes
assets/instadeep_logo.png ADDED
index.html CHANGED
@@ -1,19 +1,162 @@
  <!doctype html>
- <html>
-   <head>
-     <meta charset="utf-8" />
-     <meta name="viewport" content="width=device-width" />
-     <title>My static Space</title>
-     <link rel="stylesheet" href="style.css" />
-   </head>
-   <body>
-     <div class="card">
-       <h1>Welcome to your static Space!</h1>
-       <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
-       <p>
-         Also don't forget to check the
-         <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
-       </p>
-     </div>
-   </body>
  </html>
  <!doctype html>
+ <html lang="en">
+ <head>
+   <meta charset="utf-8" />
+   <meta name="viewport" content="width=device-width,initial-scale=1" />
+   <title>NTv3 PyTorch Notebooks</title>
+   <meta name="description" content="NTv3 companion hub: PyTorch notebooks for inference, fine-tuning, interpretation, and sequence generation on NTv3 models hosted on Hugging Face." />
+   <style>
+     :root {
+       --bg: #0b1020;
+       --card: rgba(255, 255, 255, 0.06);
+       --text: rgba(255, 255, 255, 0.92);
+       --muted: rgba(255, 255, 255, 0.65);
+       --link: #7dd3fc;
+       --border: rgba(255, 255, 255, 0.12);
+       --shadow: 0 10px 30px rgba(0,0,0,0.35);
+       --radius: 18px;
+       --mono: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
+       --sans: ui-sans-serif, system-ui, -apple-system, Segoe UI, Roboto, Helvetica, Arial, "Apple Color Emoji","Segoe UI Emoji";
+     }
+     body {
+       margin: 0;
+       font-family: var(--sans);
+       color: var(--text);
+       background:
+         radial-gradient(1200px 800px at 10% 10%, rgba(125, 211, 252, 0.12), transparent 60%),
+         radial-gradient(1200px 800px at 90% 30%, rgba(167, 139, 250, 0.12), transparent 55%),
+         var(--bg);
+       min-height: 100vh;
+     }
+     .wrap { max-width: 980px; margin: 0 auto; padding: 44px 18px 56px; }
+     .hero {
+       display: grid; gap: 14px;
+       padding: 26px 24px;
+       border: 1px solid var(--border);
+       background: var(--card);
+       box-shadow: var(--shadow);
+       border-radius: var(--radius);
+     }
+     h1 { font-size: 34px; margin: 0; letter-spacing: -0.02em; }
+     p { margin: 0; color: var(--muted); line-height: 1.5; }
+     .grid {
+       margin-top: 18px;
+       display: grid;
+       grid-template-columns: repeat(12, 1fr);
+       gap: 14px;
+     }
+     .card {
+       grid-column: span 6;
+       padding: 18px 18px;
+       border: 1px solid var(--border);
+       background: var(--card);
+       border-radius: var(--radius);
+       box-shadow: 0 6px 18px rgba(0,0,0,0.22);
+     }
+     .card h2 { margin: 0 0 10px 0; font-size: 16px; letter-spacing: 0.01em; }
+     .card ul { margin: 0; padding-left: 18px; color: var(--muted); }
+     .card li { margin: 8px 0; }
+     a { color: var(--link); text-decoration: none; }
+     a:hover { text-decoration: underline; }
+     .pillrow { display: flex; gap: 8px; flex-wrap: wrap; margin-top: 8px; }
+     .pill {
+       font-size: 12px;
+       padding: 6px 10px;
+       border-radius: 999px;
+       border: 1px solid var(--border);
+       background: rgba(255,255,255,0.04);
+       color: var(--muted);
+     }
+     .code {
+       margin-top: 12px;
+       padding: 16px 18px;
+       border-radius: 14px;
+       border: 1px solid var(--border);
+       background: rgba(0,0,0,0.3);
+       font-family: var(--mono);
+       font-size: 13px;
+       line-height: 1.6;
+       overflow-x: auto;
+       color: rgba(255,255,255,0.9);
+       white-space: pre;
+     }
+     .code code {
+       font-family: inherit;
+       font-size: inherit;
+       color: inherit;
+     }
+     .footer { margin-top: 22px; color: var(--muted); font-size: 13px; }
+     @media (max-width: 860px) {
+       .card { grid-column: span 12; }
+       h1 { font-size: 28px; }
+     }
+   </style>
+ </head>
+
+ <body>
+   <div class="wrap">
+     <div class="hero">
+       <h1>NTv3 — PyTorch notebooks on Hugging Face</h1>
+       <p>
+         This Space is the companion hub for <strong>NTv3</strong> models: runnable notebooks for inference, fine-tuning, interpretation, and sequence generation.
+       </p>
+
+       <div class="pillrow">
+         <span class="pill">Long-context genomics</span>
+         <span class="pill">Torch notebooks</span>
+         <span class="pill">Inference • Fine-tune • Interpret • Generate</span>
+       </div>
+     </div>
+
+     <div class="grid">
+       <div class="card">
+         <h2>Notebooks</h2>
+         <ul>
+           <li><a href="./notebooks/">Browse notebooks folder</a> (add .ipynb files here)</li>
+           <li>00 — Quickstart inference</li>
+           <li>01 — Tracks prediction</li>
+           <li>02 — Genome annotation / segmentation</li>
+           <li>03 — Fine-tune a head</li>
+           <li>04 — Model interpretation</li>
+           <li>05 — Sequence generation</li>
+         </ul>
+       </div>
+
+       <div class="card">
+         <h2>Models</h2>
+         <ul>
+           <li>Pretrained checkpoints: <a href="https://huggingface.co/InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb"><code>InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb</code></a>, <a href="https://huggingface.co/InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb"><code>InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb</code></a>, <a href="https://huggingface.co/InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb"><code>InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb</code></a></li>
+           <li>Post-trained checkpoints: <a href="https://huggingface.co/InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb"><code>InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb</code></a>, <a href="https://huggingface.co/InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb"><code>InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb</code></a></li>
+         </ul>
+       </div>
+
+       <div class="card">
+         <h2>Loading</h2>
+         <p>Depending on how the checkpoint is published, you can load via Transformers-native code or custom code.</p>
+         <div class="code"><code>from transformers import AutoModel, AutoTokenizer
+
+ model = AutoModel.from_pretrained(
+     "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
+     trust_remote_code=True
+ )
+ tokenizer = AutoTokenizer.from_pretrained(
+     "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
+     trust_remote_code=True
+ )</code></div>
+       </div>
+
+       <div class="card">
+         <h2>Links</h2>
+         <ul>
+           <li>Paper: (add link)</li>
+           <li><a href="https://github.com/instadeepai/nucleotide-transformer">JAX training code</a></li>
+         </ul>
+       </div>
+     </div>
+
+     <p class="footer">
+       © instadeep-ai — NTv3 companion Space.
+     </p>
+   </div>
+ </body>
  </html>
notebooks/.gitkeep ADDED
File without changes
src/.gitkeep ADDED
File without changes