bernardo-de-almeida committed on
Commit
3e64e9c
·
1 Parent(s): 79580a1

feat: improve notebook reade

notebooks/00_quickstart_inference.ipynb CHANGED
@@ -16,7 +16,9 @@
  "\n",
  "1. Load tokenizers + models\n",
  "2. Run a forward pass on a DNA sequence window\n",
- "3. Inspect key outputs"
+ "3. Inspect key outputs\n",
+ "\n",
+ "> **Note for Google Colab users:** This notebook is compatible with Colab! For faster inference, make sure to enable GPU: Runtime → Change runtime type → GPU (T4 or better recommended)."
  ]
  },
  {
@@ -24,9 +26,9 @@
  "id": "5d58bf1d",
  "metadata": {},
  "source": [
- "## 0) Install dependencies\n",
+ "## 0) Colab Setup (if running on Google Colab)\n",
  "\n",
- "Skip if already installed."
+ "This cell detects if you're running on Google Colab and sets up the environment accordingly."
  ]
  },
  {
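
Reviewer note: the new markdown cell says the next cell detects Google Colab, but the detection code itself is not part of this hunk. A minimal sketch of how such a check is commonly written follows. The helper names `running_on_colab` and `describe_device` are hypothetical and not taken from the notebook, and treating an importable `google.colab` package as the Colab marker is an assumption.

```python
import importlib.util
import sys


def running_on_colab() -> bool:
    """Return True when the google.colab package is importable.

    Assumption: an importable google.colab package is a sufficient marker
    for a Colab runtime. The notebook's real check may differ.
    """
    if "google.colab" in sys.modules:
        return True
    try:
        # find_spec imports the parent package ("google") if it exists and
        # raises ModuleNotFoundError (an ImportError) when it does not.
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        return False


def describe_device() -> str:
    """Report whether a CUDA GPU is visible, without requiring torch."""
    try:
        import torch
    except ImportError:
        return "cpu (torch not installed)"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Colab runtime:", running_on_colab())
    print("Device:", describe_device())
```

On a local machine without `google.colab` installed, `running_on_colab()` returns `False` and the setup steps can be skipped.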
notebooks/01_tracks_prediction.ipynb CHANGED
@@ -9,6 +9,8 @@
  "\n",
  "This notebook demonstrates how to use the **NTv3 post-trained model** to predict functional genomics tracks and genomic element annotations from DNA sequences.\n",
  "\n",
+ "> **Note for Google Colab users:** This notebook is compatible with Colab! For faster inference, make sure to enable GPU: Runtime → Change runtime type → GPU (T4 or better recommended).\n",
+ "\n",
  "## Overview\n",
  "\n",
  "Given a genomic window from the **human genome (hg38)**, the model performs inference and generates:\n",
@@ -36,12 +38,14 @@
  "id": "4997c547",
  "metadata": {},
  "source": [
- "## 0) Install dependencies"
+ "## 0) Colab Setup (if running on Google Colab)\n",
+ "\n",
+ "This cell detects if you're running on Google Colab and sets up the environment accordingly."
  ]
  },
  {
  "cell_type": "code",
- "execution_count": 23,
+ "execution_count": null,
  "id": "0ff509fd",
  "metadata": {},
  "outputs": [
@@ -55,7 +59,8 @@
  }
  ],
  "source": [
- "!pip -q install \"transformers>=4.55\" \"huggingface_hub>=0.23\" safetensors torch pyfaidx requests seaborn"
+ "# Install dependencies\n",
+ "!pip -q install \"transformers>=4.55\" \"huggingface_hub>=0.23\" safetensors torch pyfaidx requests seaborn matplotlib"
  ]
  },
  {
@@ -162,7 +167,7 @@
  },
  {
  "cell_type": "code",
- "execution_count": 4,
+ "execution_count": null,
  "id": "8c20066a",
  "metadata": {},
  "outputs": [
@@ -177,7 +182,7 @@
  }
  ],
  "source": [
- "def download_ucsc_chrom_fasta(chrom: str, assembly: str, out_dir: str = f\"./genomes/{assembly}\") -> str:\n",
+ "def download_ucsc_chrom_fasta(chrom: str, assembly: str, out_dir: str = f\"./{assembly}\") -> str:\n",
  " \"\"\"Download a single chromosome FASTA from UCSC and return local path.\"\"\"\n",
  " os.makedirs(out_dir, exist_ok=True)\n",
  " gz_path = os.path.join(out_dir, f\"{chrom}.fa.gz\")\n",