Space: InstaDeepAI/ntv3

bernardo-de-almeida committed on Dec 12, 2025

Commit 1094a5f (1 parent: 2101d19)

fix: notebooks with new post-trained model formats


Files changed (3)

index.html +9 -6
notebooks/00_quickstart_inference.ipynb +46 -33
notebooks/01_tracks_prediction.ipynb +193 -47

index.html

@@ -262,8 +262,8 @@
           </li>
           <li>🎯 Post-trained checkpoints:
             <div style="margin-top: 8px; margin-left: 0;">
-              <div><a href="https://huggingface.co/InstaDeepAI/NTv3_100M"><code>InstaDeepAI/NTv3_100M</code></a></div>
-              <div><a href="https://huggingface.co/InstaDeepAI/NTv3_650M"><code>InstaDeepAI/NTv3_650M</code></a></div>
             </div>
           </li>
         </ul>
@@ -309,7 +309,7 @@
           <ul>
             <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/00_quickstart_inference.ipynb" target="_blank" rel="noopener">🚀 00 — Quickstart inference</a></li>
             <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/01_tracks_prediction.ipynb" target="_blank" rel="noopener">📊 01 — Tracks prediction</a></li>
-            <li>🏷️ 02 — Genome annotation / segmentation</li>
             <li>🎯 03 — Fine-tune on bigwig tracks</li>
             <li>🔍 04 — Model interpretation</li>
             <li>🧪 05 — Sequence generation</li>
@@ -380,9 +380,12 @@ out = pipe(
 )
 # Print output shapes
-print(out.bigwig_tracks_logits.shape)   # functional track predictions
-print(out.bed_tracks_logits.shape)      # genome annotation predictions
-print(out.mlm_logits.shape)             # MLM logits: (B, L, V = 11)</code></pre></div>
       <p>Predictions can also be plotted for a subset of functional tracks and genomic elements:</p>
       <div class="code"><pre><code class="language-python">tracks_to_plot = {
     "K562 RNA-seq": "ENCSR056HPM",

           </li>
           <li>🎯 Post-trained checkpoints:
             <div style="margin-top: 8px; margin-left: 0;">
+              <div><a href="https://huggingface.co/InstaDeepAI/NTv3_100M_pos"><code>InstaDeepAI/NTv3_100M_pos</code></a></div>
+              <div><a href="https://huggingface.co/InstaDeepAI/NTv3_650M_pos"><code>InstaDeepAI/NTv3_650M_pos</code></a></div>
             </div>
           </li>
         </ul>
           <ul>
             <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/00_quickstart_inference.ipynb" target="_blank" rel="noopener">🚀 00 — Quickstart inference</a></li>
             <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/01_tracks_prediction.ipynb" target="_blank" rel="noopener">📊 01 — Tracks prediction</a></li>
+            <li><a href="https://huggingface.co/spaces/InstaDeepAI/ntv3/blob/main/notebooks/02_genome_annotation.ipynb" target="_blank" rel="noopener">🏷️ 02 — Genome annotation / segmentation</a></li>
             <li>🎯 03 — Fine-tune on bigwig tracks</li>
             <li>🔍 04 — Model interpretation</li>
             <li>🧪 05 — Sequence generation</li>
 )
 # Print output shapes
+# 7k human tracks over 37.5 % center region of the input sequence
+print("bigwig_tracks_logits:", tuple(out.bigwig_tracks_logits.shape))
+# Location of 21 genomic elements over 37.5 % center region of the input sequence
+print("bed_tracks_logits:", tuple(out.bed_tracks_logits.shape))
+# Language model logits for whole sequence over vocabulary
+print("language model logits:", tuple(out.mlm_logits.shape))</code></pre></div>
       <p>Predictions can also be plotted for a subset of functional tracks and genomic elements:</p>
       <div class="code"><pre><code class="language-python">tracks_to_plot = {
     "K562 RNA-seq": "ENCSR056HPM",

notebooks/00_quickstart_inference.ipynb

@@ -10,7 +10,7 @@
         "This notebook demonstrates how to run **quick inference** with both the pre- and post-trained NTv3 checkpoints:\n",
         "\n",
         "- **Pre-trained (MLM-focused):** `InstaDeepAI/NTv3_8M_pre`, `InstaDeepAI/NTv3_100M_pre`, `InstaDeepAI/NTv3_650M_pre`\n",
-        "- **Post-trained (task heads):** `InstaDeepAI/NTv3_100M`, `InstaDeepAI/NTv3_650M`\n",
         "\n",
         "We show how to:\n",
         "\n",
@@ -51,7 +51,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 7,
       "id": "d56c105b",
       "metadata": {},
       "outputs": [
@@ -287,18 +287,40 @@
       "source": [
         "## 3) 🧠 Post-trained checkpoint (task heads: BigWig + BED)\n",
         "\n",
-        "Post-trained checkpoints add task-specific heads.\n",
         "\n",
         "In particular:\n",
-        "- `condition_tokenizer` is used to tokenize a species condition like `\"human\"`\n",
-        "- `file_assembly_idx` selects the assembly (e.g., `hg38`)\n",
         "\n",
         "Expected outputs:\n",
-        "- `bigwig_tracks_logits`\n",
-        "- `bed_tracks_logits`\n",
-        "- `logits` (MLM)\n",
         "\n",
-        "> 💡 If your post-trained checkpoint supports multiple assemblies, the config typically exposes a mapping like `cfg.bigwigs_per_file_assembly`."
       ]
     },
     {
@@ -318,39 +340,30 @@
         }
       ],
       "source": [
-        "posttrained_model_name = \"InstaDeepAI/NTv3_100M\"\n",
-        "\n",
-        "# Load config/tokenizers/model\n",
-        "cfg_pos = AutoConfig.from_pretrained(posttrained_model_name, trust_remote_code=True)\n",
-        "tok_pos = AutoTokenizer.from_pretrained(posttrained_model_name, trust_remote_code=True)\n",
-        "model_pos = AutoModel.from_pretrained(posttrained_model_name, trust_remote_code=True)\n",
-        "condition_tokenizer = AutoTokenizer.from_pretrained(\n",
-        "    posttrained_model_name, subfolder=\"condition_tokenizer\", trust_remote_code=True\n",
-        ")\n",
         "\n",
-        "# Example: human sequence (sequence needs to be multiple of 128 due to 7 downsampling in model)\n",
-        "seq = \"ATCG\" * 512\n",
-        "batch = tok_pos([seq], add_special_tokens=False, return_tensors=\"pt\")\n",
-        "condition = condition_tokenizer([\"human\"], return_tensors=\"pt\")\n",
         "\n",
-        "# Get assembly index for human (hg38)\n",
-        "assemblies = list(cfg_pos.bigwigs_per_file_assembly.keys())\n",
-        "assembly_idx = torch.tensor([assemblies.index(\"hg38\")])\n",
         "\n",
-        "out = model_pos(\n",
         "    input_ids=batch[\"input_ids\"],\n",
-        "    condition_ids=[condition[\"input_ids\"][0]],\n",
-        "    file_assembly_idx=assembly_idx,\n",
-        "    output_hidden_states=True,\n",
-        "    output_attentions=True,\n",
         ")\n",
         "\n",
         "# 7k human tracks over 37.5 % center region of the input sequence\n",
-        "print(\"bigwig_tracks_logits:\", out[\"bigwig_tracks_logits\"].shape)\n",
         "# Location of 21 genomic elements over 37.5 % center region of the input sequence\n",
-        "print(\"bed_tracks_logits:\", out[\"bed_tracks_logits\"].shape)\n",
         "# Language model logits for whole sequence over vocabulary\n",
-        "print(\"language model logits:\", out[\"logits\"].shape)"
       ]
     }
   ],

         "This notebook demonstrates how to run **quick inference** with both the pre- and post-trained NTv3 checkpoints:\n",
         "\n",
         "- **Pre-trained (MLM-focused):** `InstaDeepAI/NTv3_8M_pre`, `InstaDeepAI/NTv3_100M_pre`, `InstaDeepAI/NTv3_650M_pre`\n",
+        "- **Post-trained (functional tracks and genome annotation):** `InstaDeepAI/NTv3_100M_pos`, `InstaDeepAI/NTv3_650M_pos`\n",
         "\n",
         "We show how to:\n",
         "\n",
     },
     {
       "cell_type": "code",
+      "execution_count": 3,
       "id": "d56c105b",
       "metadata": {},
       "outputs": [
       "source": [
         "## 3) 🧠 Post-trained checkpoint (task heads: BigWig + BED)\n",
         "\n",
+        "Post-trained checkpoints add task-specific heads for functional track prediction and genome annotation.\n",
         "\n",
         "In particular:\n",
+        "- `species_tokenizer` is used to tokenize a species condition like `\"human\"`\n",
+        "- `species_ids` passes the species tokens to the model\n",
         "\n",
         "Expected outputs:\n",
+        "- `bigwig_tracks_logits`: functional track predictions\n",
+        "- `bed_tracks_logits`: genome annotation predictions\n",
+        "- `logits`: masked language modeling logits"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 9,
+      "id": "bdb8c4d1",
+      "metadata": {},
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "Model supported species: TO BE DONE\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Inspect config and supported species\n",
+        "post_trained_model_name = \"InstaDeepAI/NTv3_100M_pos\"\n",
+        "\n",
+        "cfg_post = AutoConfig.from_pretrained(post_trained_model_name, trust_remote_code=True)\n",
         "\n",
+        "species = \"TO BE DONE\"\n",
+        "print(\"Model supported species:\", species)"
       ]
     },
     {
         }
       ],
       "source": [
+        "tok_post = AutoTokenizer.from_pretrained(post_trained_model_name, trust_remote_code=True)\n",
+        "cond_tok_post = AutoTokenizer.from_pretrained(post_trained_model_name, subfolder='species_tokenizer', trust_remote_code=True)\n",
+        "model_post = AutoModel.from_pretrained(post_trained_model_name, trust_remote_code=True)\n",
         "\n",
+        "# Prepare inputs\n",
+        "batch = tok_post([\"ATCGNATCG\", \"ACGT\"], add_special_tokens=False, padding=True, pad_to_multiple_of=128, return_tensors=\"pt\")\n",
         "\n",
+        "# Condition tokens (e.g., species)\n",
+        "species = 'human'\n",
+        "species_ids = cond_tok_post([species] * len(batch['input_ids']), add_special_tokens=False, return_tensors='pt')\n",
         "\n",
+        "# Forward pass\n",
+        "out = model_post(\n",
         "    input_ids=batch[\"input_ids\"],\n",
+        "    species_ids=species_ids['input_ids'],\n",
+        "    return_dict=True\n",
         ")\n",
         "\n",
         "# 7k human tracks over 37.5 % center region of the input sequence\n",
+        "print(\"bigwig_tracks_logits:\", tuple(out[\"bigwig_tracks_logits\"].shape))\n",
         "# Location of 21 genomic elements over 37.5 % center region of the input sequence\n",
+        "print(\"bed_tracks_logits:\", tuple(out[\"bed_tracks_logits\"].shape))\n",
         "# Language model logits for whole sequence over vocabulary\n",
+        "print(\"language model logits:\", tuple(out[\"logits\"].shape))\n"
       ]
     }
   ],
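
Note on the padding used above: the new quickstart cell pads the batch with `pad_to_multiple_of=128`, and the removed cell stated that the sequence length must be a multiple of 128 because of the 7 downsampling steps in the model (2^7 = 128). The sketch below only illustrates that arithmetic. It assumes one token per nucleotide, and `padded_length` is a hypothetical helper name, not part of the NTv3 code.

```python
# Hypothetical helper, not part of the NTv3 code: round a sequence length up
# to the next multiple of 2**num_downsamples (128 for 7 downsampling steps).
def padded_length(seq_len: int, num_downsamples: int = 7) -> int:
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    multiple = 2 ** num_downsamples
    return -(-seq_len // multiple) * multiple  # ceiling division


# The two toy sequences from the new cell and the 2048-base one from the old cell
for seq in ["ATCGNATCG", "ACGT", "ATCG" * 512]:
    print(len(seq), "->", padded_length(len(seq)))
```

Sequences that are already a multiple of 128 long are left unchanged, so the same check can be used to validate a genomic window before tokenizing it.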

notebooks/01_tracks_prediction.ipynb

@@ -19,6 +19,8 @@
         "- **Genomic element annotations** (`bed_tracks_logits`): Classification predictions for genomic elements such as genes, exons, introns, splice sites, promoters, enhancers, and more\n",
         "- **Masked Language Model logits** (`logits`): Standard transformer language model outputs\n",
         "\n",
         "## 📚 Notebook Structure\n",
         "\n",
         "1. **Setup**: Install dependencies and define the genomic window of interest\n",
@@ -33,16 +35,6 @@
         "- Supports the 24 species that NTv3 was post-trained on"
       ]
     },
-    {
-      "cell_type": "markdown",
-      "id": "4997c547",
-      "metadata": {},
-      "source": [
-        "## 0) Colab Setup (if running on Google Colab)\n",
-        "\n",
-        "This cell detects if you're running on Google Colab and sets up the environment accordingly."
-      ]
-    },
     {
       "cell_type": "code",
       "execution_count": null,
@@ -65,7 +57,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": null,
       "id": "608d67e1",
       "metadata": {},
       "outputs": [],
@@ -96,7 +88,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": null,
       "id": "795a576f",
       "metadata": {},
       "outputs": [
@@ -112,15 +104,16 @@
         "# -----------------------------\n",
         "# User inputs\n",
         "# -----------------------------\n",
-        "model_name = \"InstaDeepAI/NTv3_100M\" # options: \"InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb\" or \"InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb_v2\"\n",
         "\n",
         "# Example window from a given species (edit these) - needs to be multiple of 128 due to the model downsampling\n",
-        "assembly = \"hg38\"\n",
         "chrom = \"chr19\"\n",
         "start = 6_700_000\n",
         "end   = 6_831_072\n",
         "\n",
-        "# Optional: if the model is gated/private, set HF_TOKEN to a PERSONAL token (hf_...)\n",
         "HF_TOKEN = os.getenv(\"HF_TOKEN\", None)\n",
         "\n",
         "assert end > start, \"end must be > start\"\n",
@@ -138,7 +131,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 3,
       "id": "2354e2aa",
       "metadata": {},
       "outputs": [
@@ -175,7 +168,8 @@
           "name": "stdout",
           "output_type": "stream",
           "text": [
-            "Using downloaded chromosome FASTA: ./genomes/hg38/chr19.fa\n",
             "Sequence preview: GTCAACAATAACAAATGACATATTAGTAGTAAATTATAATTATACATTACAACAAAATTA...\n",
             "Valid DNA: True\n"
           ]
@@ -234,47 +228,198 @@
     },
     {
       "cell_type": "code",
-      "execution_count": 5,
       "id": "e09f0469",
       "metadata": {},
       "outputs": [
         {
-          "name": "stdout",
-          "output_type": "stream",
-          "text": [
-            "Model supported assemblies: ['AmpOce1', 'Bison_UMD1', 'ChiLan1', 'Felis_catus_9', 'GRCz11', 'Glycine_max_v2.1', 'Gossypium_hirsutum_v2.1', 'IRGSP-1.0', 'IWGSC', 'KH', 'Mnem_1', 'ROS_Cfam_1', 'SCA1', 'TAIR10', 'TETRAODON8', 'WBcel235', 'Zm-B73-REFERENCE-NAM-5.0', 'bGalGal1', 'dm6', 'fSalTru1', 'gorGor4', 'hg38', 'mRatBN7', 'mm10']\n"
-          ]
         }
       ],
       "source": [
         "# Load model\n",
-        "cfg = AutoConfig.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN)\n",
-        "model = AutoModel.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN).to(device)\n",
         "\n",
         "# Load tokenizer\n",
-        "tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN)\n",
         "\n",
         "# Load condition tokenizer\n",
-        "condition_tokenizer = AutoTokenizer.from_pretrained(\n",
-        "    model_name, subfolder=\"condition_tokenizer\", trust_remote_code=True, token=HF_TOKEN\n",
         ")\n",
         "\n",
         "# Set model to evaluation mode\n",
-        "model.eval()\n",
-        "\n",
-        "# Get assembly index\n",
-        "assemblies = list(cfg.bigwigs_per_file_assembly.keys())\n",
-        "print(\"Model supported assemblies:\", assemblies)\n",
-        "assembly_idx = torch.tensor([assemblies.index(assembly)])\n",
-        "\n",
-        "# Condition token (species)\n",
-        "condition = condition_tokenizer([\"human\"], return_tensors=\"pt\")\n",
-        "condition_ids = [condition[\"input_ids\"][0].to(device)]"
       ]
     },
     {
       "cell_type": "code",
-      "execution_count": 6,
       "id": "43154959",
       "metadata": {},
       "outputs": [
@@ -307,8 +452,7 @@
         "We pass:\n",
         "\n",
         "- `input_ids`: tokenized DNA window\n",
-        "- `condition_ids`: species tokens (`human`)\n",
-        "- `file_assembly_idx`: select the assembly (`hg38`)\n",
         "\n",
         "Outputs include:\n",
         "\n",
@@ -319,7 +463,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": null,
       "id": "6765a9b9",
       "metadata": {},
       "outputs": [
@@ -338,13 +482,15 @@
         "batch = tokenizer([seq], add_special_tokens=False, return_tensors=\"pt\")\n",
         "input_ids = batch[\"input_ids\"].to(device)\n",
         "\n",
         "# Run inference\n",
         "out = model(\n",
         "    input_ids=input_ids,\n",
-        "    condition_ids=condition_ids,\n",
-        "    file_assembly_idx=assembly_idx,\n",
-        "    output_hidden_states=False,\n",
-        "    output_attentions=False,\n",
         ")\n",
         "\n",
         "# 7k human tracks over 37.5 % center region of the input sequence\n",
@@ -391,7 +537,7 @@
     },
     {
       "cell_type": "code",
-      "execution_count": null,
       "id": "717539e2",
       "metadata": {},
       "outputs": [],

         "- **Genomic element annotations** (`bed_tracks_logits`): Classification predictions for genomic elements such as genes, exons, introns, splice sites, promoters, enhancers, and more\n",
         "- **Masked Language Model logits** (`logits`): Standard transformer language model outputs\n",
         "\n",
+        "> 💡 **Note:** Functional tracks and genomic element annotations are predicted only for the center 37.5% of the input sequence, where the model is more confident due to having full context on both sides.\n",
+        "\n",
         "## 📚 Notebook Structure\n",
         "\n",
         "1. **Setup**: Install dependencies and define the genomic window of interest\n",
         "- Supports the 24 species that NTv3 was post-trained on"
       ]
     },
     {
       "cell_type": "code",
       "execution_count": null,
     },
     {
       "cell_type": "code",
+      "execution_count": 7,
       "id": "608d67e1",
       "metadata": {},
       "outputs": [],
     },
     {
       "cell_type": "code",
+      "execution_count": 8,
       "id": "795a576f",
       "metadata": {},
       "outputs": [
         "# -----------------------------\n",
         "# User inputs\n",
         "# -----------------------------\n",
+        "model_name = \"InstaDeepAI/NTv3_100M_pos\" # options: \"InstaDeepAI/NTv3_100M_pos\" or \"InstaDeepAI/NTv3_650M_pos\"\n",
         "\n",
         "# Example window from a given species (edit these) - needs to be multiple of 128 due to the model downsampling\n",
+        "species = \"human\"  # used to condition the model on species\n",
+        "assembly = \"hg38\"  # used to fetch the chromosome sequence\n",
         "chrom = \"chr19\"\n",
         "start = 6_700_000\n",
         "end   = 6_831_072\n",
         "\n",
+        "# Optional\n",
         "HF_TOKEN = os.getenv(\"HF_TOKEN\", None)\n",
         "\n",
         "assert end > start, \"end must be > start\"\n",
     },
     {
       "cell_type": "code",
+      "execution_count": 4,
       "id": "2354e2aa",
       "metadata": {},
       "outputs": [
           "name": "stdout",
           "output_type": "stream",
           "text": [
+            "Downloading: https://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromosomes/chr19.fa.gz\n",
+            "Using downloaded chromosome FASTA: ./hg38/chr19.fa\n",
             "Sequence preview: GTCAACAATAACAAATGACATATTAGTAGTAAATTATAATTATACATTACAACAAAATTA...\n",
             "Valid DNA: True\n"
           ]
     },
     {
       "cell_type": "code",
+      "execution_count": 11,
       "id": "e09f0469",
       "metadata": {},
       "outputs": [
         {
+          "data": {
+            "text/plain": [
+              "NTv3Model(\n",
+              "  (core): Core(\n",
+              "    (embed_layer): Embedding(11, 16, padding_idx=1)\n",
+              "    (stem): Stem(\n",
+              "      (conv): Conv1d(16, 768, kernel_size=(15,), stride=(1,), padding=same)\n",
+              "    )\n",
+              "    (cond_tables): ModuleList(\n",
+              "      (0): Embedding(30, 16)\n",
+              "    )\n",
+              "    (conv_tower_blocks): ModuleList(\n",
+              "      (0-6): 7 x ConditionedConvTowerBlock(\n",
+              "        (conv): AdaptiveConvBlock(\n",
+              "          (conv): Conv1d(768, 768, kernel_size=(5,), stride=(1,), padding=same)\n",
+              "          (layer_norm): AdaptiveLayerNorm(\n",
+              "            (np.int64(768),), eps=1e-05, elementwise_affine=True\n",
+              "            (modulation_layers): ModuleList(\n",
+              "              (0): Linear(in_features=16, out_features=1536, bias=True)\n",
+              "            )\n",
+              "          )\n",
+              "        )\n",
+              "        (res_conv): AdaptiveResidualConvBlock(\n",
+              "          (conv_block): AdaptiveConvBlock(\n",
+              "            (conv): Conv1d(768, 768, kernel_size=(1,), stride=(1,), padding=same)\n",
+              "            (layer_norm): AdaptiveLayerNorm(\n",
+              "              (np.int64(768),), eps=1e-05, elementwise_affine=True\n",
+              "              (modulation_layers): ModuleList(\n",
+              "                (0): Linear(in_features=16, out_features=1536, bias=True)\n",
+              "              )\n",
+              "            )\n",
+              "          )\n",
+              "          (modulation_layers): ModuleList(\n",
+              "            (0): Linear(in_features=16, out_features=768, bias=True)\n",
+              "          )\n",
+              "        )\n",
+              "        (avg_pool): AvgPool1d(kernel_size=(2,), stride=(2,), padding=(0,))\n",
+              "      )\n",
+              "    )\n",
+              "    (transformer_blocks): ModuleList(\n",
+              "      (0-5): 6 x AdaptiveSelfAttentionBlock(\n",
+              "        (self_attention_layer_norm): AdaptiveLayerNorm(\n",
+              "          (768,), eps=1e-05, elementwise_affine=True\n",
+              "          (modulation_layers): ModuleList(\n",
+              "            (0): Linear(in_features=16, out_features=1536, bias=True)\n",
+              "          )\n",
+              "        )\n",
+              "        (final_layer_norm): AdaptiveLayerNorm(\n",
+              "          (768,), eps=1e-05, elementwise_affine=True\n",
+              "          (modulation_layers): ModuleList(\n",
+              "            (0): Linear(in_features=16, out_features=1536, bias=True)\n",
+              "          )\n",
+              "        )\n",
+              "        (sa_layer): MultiHeadAttention(\n",
+              "          (query_head): LinearProjectionHeInit(\n",
+              "            (linear): Linear(in_features=768, out_features=768, bias=True)\n",
+              "          )\n",
+              "          (key_head): LinearProjectionHeInit(\n",
+              "            (linear): Linear(in_features=768, out_features=768, bias=True)\n",
+              "          )\n",
+              "          (value_head): LinearProjectionHeInit(\n",
+              "            (linear): Linear(in_features=768, out_features=768, bias=True)\n",
+              "          )\n",
+              "          (mha_output): Linear(in_features=768, out_features=768, bias=True)\n",
+              "          (rotary_embedding): RotaryEmbedding()\n",
+              "        )\n",
+              "        (fc1): Linear(in_features=768, out_features=6144, bias=False)\n",
+              "        (fc2): Linear(in_features=3072, out_features=768, bias=False)\n",
+              "        (_ffn_activation_fn): SiLU()\n",
+              "        (modulation_layers): ModuleList(\n",
+              "          (0): Linear(in_features=16, out_features=768, bias=True)\n",
+              "        )\n",
+              "      )\n",
+              "    )\n",
+              "    (deconv_tower_blocks): ModuleList(\n",
+              "      (0-6): 7 x ConditionedDeConvTowerBlock(\n",
+              "        (conv): AdaptiveDeConvBlock(\n",
+              "          (conv): Conv1d(768, 768, kernel_size=(5,), stride=(1,), padding=same)\n",
+              "          (layer_norm): AdaptiveLayerNorm(\n",
+              "            (np.int64(768),), eps=1e-05, elementwise_affine=True\n",
+              "            (modulation_layers): ModuleList(\n",
+              "              (0): Linear(in_features=16, out_features=1536, bias=True)\n",
+              "            )\n",
+              "          )\n",
+              "        )\n",
+              "        (res_conv): AdaptiveResidualDeConvBlock(\n",
+              "          (conv_block): AdaptiveDeConvBlock(\n",
+              "            (conv): ConvTranspose1d(768, 768, kernel_size=(1,), stride=(1,))\n",
+              "            (layer_norm): AdaptiveLayerNorm(\n",
+              "              (np.int64(768),), eps=1e-05, elementwise_affine=True\n",
+              "              (modulation_layers): ModuleList(\n",
+              "                (0): Linear(in_features=16, out_features=1536, bias=True)\n",
+              "              )\n",
+              "            )\n",
+              "          )\n",
+              "          (modulation_layers): ModuleList(\n",
+              "            (0): Linear(in_features=16, out_features=768, bias=True)\n",
+              "          )\n",
+              "        )\n",
+              "      )\n",
+              "    )\n",
+              "    (bigwig_head): MultiSpeciesHead(\n",
+              "      (species_heads): ModuleList(\n",
+              "        (0-4): 5 x ZeroHead()\n",
+              "        (5): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=590, bias=True)\n",
+              "        )\n",
+              "        (6): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=319, bias=True)\n",
+              "        )\n",
+              "        (7): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=1392, bias=True)\n",
+              "        )\n",
+              "        (8): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=776, bias=True)\n",
+              "        )\n",
+              "        (9-12): 4 x ZeroHead()\n",
+              "        (13): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=1899, bias=True)\n",
+              "        )\n",
+              "        (14-15): 2 x ZeroHead()\n",
+              "        (16): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=921, bias=True)\n",
+              "        )\n",
+              "        (17): ZeroHead()\n",
+              "        (18): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=180, bias=True)\n",
+              "        )\n",
+              "        (19-20): 2 x ZeroHead()\n",
+              "        (21): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=7362, bias=True)\n",
+              "        )\n",
+              "        (22): ZeroHead()\n",
+              "        (23): LinearHead(\n",
+              "          (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "          (head): Linear(in_features=768, out_features=2450, bias=True)\n",
+              "        )\n",
+              "      )\n",
+              "    )\n",
+              "    (bed_head): ClassificationHead(\n",
+              "      (layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)\n",
+              "      (head): Linear(in_features=768, out_features=42, bias=True)\n",
+              "    )\n",
+              "    (conditions_heads): ModuleList(\n",
+              "      (0): Linear(in_features=768, out_features=30, bias=True)\n",
+              "    )\n",
+              "    (lm_head): ModuleDict(\n",
+              "      (hidden_layers): ModuleList()\n",
+              "      (head): Linear(in_features=768, out_features=11, bias=True)\n",
+              "    )\n",
+              "  )\n",
+              ")"
+            ]
+          },
+          "execution_count": 11,
+          "metadata": {},
+          "output_type": "execute_result"
         }
       ],
       "source": [
         "# Load model\n",
+        "cfg = AutoConfig.from_pretrained(model_name, trust_remote_code=True)\n",
+        "model = AutoModel.from_pretrained(model_name, trust_remote_code=True).to(device)\n",
         "\n",
         "# Load tokenizer\n",
+        "tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)\n",
         "\n",
         "# Load condition tokenizer\n",
+        "species_tokenizer = AutoTokenizer.from_pretrained(\n",
+        "    model_name, subfolder=\"species_tokenizer\", trust_remote_code=True,\n",
         ")\n",
         "\n",
         "# Set model to evaluation mode\n",
+        "model.eval()"
       ]
     },
     {
       "cell_type": "code",
+      "execution_count": 12,
       "id": "43154959",
       "metadata": {},
       "outputs": [
         "We pass:\n",
         "\n",
         "- `input_ids`: tokenized DNA window\n",
+        "- `species_ids`: species tokens (`human`)\n",
         "\n",
         "Outputs include:\n",
         "\n",
     },
     {
       "cell_type": "code",
+      "execution_count": 13,
       "id": "6765a9b9",
       "metadata": {},
       "outputs": [
         "batch = tokenizer([seq], add_special_tokens=False, return_tensors=\"pt\")\n",
         "input_ids = batch[\"input_ids\"].to(device)\n",
         "\n",
+        "# Condition tokens (e.g., species)\n",
+        "species = 'human'\n",
+        "species_ids = species_tokenizer([species] * len(batch['input_ids']), add_special_tokens=False, return_tensors='pt')\n",
+        "\n",
         "# Run inference\n",
         "out = model(\n",
         "    input_ids=input_ids,\n",
+        "    species_ids=species_ids['input_ids'],\n",
+        "    return_dict=True\n",
         ")\n",
         "\n",
         "# 7k human tracks over 37.5 % center region of the input sequence\n",
     },
     {
       "cell_type": "code",
+      "execution_count": 15,
       "id": "717539e2",
       "metadata": {},
       "outputs": [],
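
Note on the output region: the note added to this notebook says functional tracks and genomic element annotations are predicted only for the center 37.5% of the input sequence. The sketch below maps that region back to genomic coordinates for the chr19 window used in the notebook. It assumes the crop is symmetric around the window midpoint, which the diff does not state explicitly, and `center_region` is a hypothetical helper name rather than an NTv3 function.

```python
# Hypothetical helper, not part of the NTv3 code: genomic coordinates of the
# centered fraction of a window, assuming a symmetric crop.
def center_region(start: int, end: int, num: int = 3, den: int = 8) -> tuple[int, int]:
    length = end - start
    if length <= 0:
        raise ValueError("end must be > start")
    keep = length * num // den      # 3/8 = 37.5 % of the window
    offset = (length - keep) // 2   # bases trimmed on the left side
    return start + offset, start + offset + keep


start, end = 6_700_000, 6_831_072   # chr19 window from the notebook
assert (end - start) % 128 == 0     # 131_072 = 1024 * 128
crop_start, crop_end = center_region(start, end)
print(crop_start, crop_end, crop_end - crop_start)
```

Under that assumption, predictions for this 131,072-base window would cover 49,152 bases starting 40,960 bases into the window. When plotting tracks against annotations, these cropped coordinates are the ones to use on the x-axis, not the full input window.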