MuthuS97/PIPES-M

MuthuS97 committed on Jan 6

Commit 3b82921 (verified) · 1 Parent(s): 1e3a48a

Upload PIPES_M.ipynb

PIPES_M.ipynb +1211 -0

PIPES_M.ipynb ADDED

	@@ -0,0 +1,1211 @@

+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "gpuType": "T4"
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    },
+    "accelerator": "GPU"
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# PIPES-M: Protease Inhibitor Prediction Using Evolutionary Scale Modeling (ESM-2)\n",
+        "\n",
+        "## Overview\n",
+        "\n",
+        "This Google Colab notebook provides a user-friendly interface for inference with **PIPES-M**, a deep learning-based binary classifier designed to predict protease inhibitor (PI) activity from primary protein sequences.\n",
+        "\n",
+        "PIPES-M enables rapid screening of small secreted protease inhibitors (<250 amino acids) in large-scale genomic, transcriptomic, or proteomic datasets, where experimental validation is resource-intensive.\n",
+        "\n",
+        "The model assigns each input sequence to one of two classes:  \n",
+        "- **Positive (Potential PI)**: Predicted to exhibit protease inhibitor activity  \n",
+        "- **Negative (Non-PI)**: Predicted to lack protease inhibitor activity  \n",
+        "\n",
+        "Output includes:  \n",
+        "- Probability of the positive class (`prob_class_1`): ranges from 0 (low likelihood) to 1 (high likelihood of PI activity)  \n",
+        "- Confidence score: probability of the predicted class  \n",
+        "\n",
+        "## Model Architecture and Training\n",
+        "\n",
+        "PIPES-M is a fine-tuned sequence classification model built on the **ESM-2** protein language model:  \n",
+        "- Base model: `facebook/esm2_t30_150M_UR50D` (150 million parameters, 30 layers)  \n",
+        "- Pre-trained on UniRef50 via masked language modeling  \n",
+        "\n",
+        "Fine-tuning was performed on a high-quality curated dataset comprising:  \n",
+        "- Positive examples: known protease inhibitors (<250 AA) from the MEROPS database  \n",
+        "- Negative examples: non-inhibitors selected from UniProt using sequence similarity and Pfam domain analysis  \n",
+        "\n",
+        "Training used sequence-only input, requiring no structural data. The classification head leverages evolutionary and physicochemical features encoded by ESM-2.  \n",
+        "\n",
+        "Maximum sequence length is fixed at 250 residues; longer sequences are truncated from the N-terminus, appropriate for the typical size range of small secreted inhibitors.\n",
+        "\n",
+        "## Input Requirements\n",
+        "\n",
+        "- Multi-FASTA formatted file containing one or more protein sequences  \n",
+        "- Sequences must use standard single-letter amino acid codes  \n",
+        "- FASTA headers (lines beginning with `>`) are retained for identification  \n",
+        "\n",
+        "## Output Columns\n",
+        "\n",
+        "- `header`: Original FASTA identifier  \n",
+        "- `predicted_class`: \"Positive (Potential PI)\" or \"Negative (Non-PI)\"  \n",
+        "- `confidence`: Probability of the assigned class  \n",
+        "- `prob_class_1`: Raw probability of protease inhibitor activity  \n",
+        "- `prob_class_0`: Probability of the negative class  \n",
+        "\n",
+        "## Usage Notes\n",
+        "\n",
+        "- Intended for research and high-throughput screening  \n",
+        "- Positive predictions suggest potential PI activity and warrant experimental follow-up  \n",
+        "- Optimal performance is achieved on secreted or extracellular proteins, reflecting the composition of the training data  \n",
+        "- Predictions rely solely on the provided sequence; no homology search or multiple sequence alignment is performed  \n",
+        "\n",
+        "## Model Availability\n",
+        "\n",
+        "The fine-tuned PIPES-M model is publicly hosted on Hugging Face:  \n",
+        "https://huggingface.co/MuthuS97/PIPES-M\n",
+        "\n",
+        "## Citation\n",
+        "\n",
+        "When using PIPES-M in research, please reference the model repository and any associated forthcoming publication.\n",
+        "\n",
+        "---\n",
+        "\n",
+        "**Instructions**  \n",
+        "1. Enable GPU acceleration: Runtime → Change runtime type → Hardware accelerator → GPU (T4 recommended).  \n",
+        "2. Execute all cells in sequence (Runtime → Run all).  \n",
+        "3. Upload your multi-FASTA file in the designated section to obtain predictions."
+      ],
+      "metadata": {
+        "id": "HXIULYjtVADA"
+      }
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 13,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "nS8lo9EWRYQ5",
+        "outputId": "4e8008e9-7048-4377-a291-cbc2165293de"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Required packages installed successfully\n"
+          ]
+        }
+      ],
+      "source": [
+        "# @title 0. Install Required Packages\n",
+        "\n",
+        "!pip install --quiet transformers huggingface_hub\n",
+        "\n",
+        "print(\"Required packages installed successfully\")"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# @title 1. Initialization and Setup\n",
+        "\n",
+        "mount_drive = True  # @param {type:\"boolean\"}\n",
+        "if mount_drive:\n",
+        "    from google.colab import drive\n",
+        "    drive.mount('/content/drive')\n",
+        "    print(\"Google Drive mounted at /content/drive\")\n",
+        "\n",
+        "MAX_LEN = 250  # @param {type:\"integer\"}\n",
+        "BATCH_SIZE = 16  # @param {type:\"integer\"}\n",
+        "\n",
+        "import torch\n",
+        "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
+        "print(f\"Using device: {device}\")\n",
+        "\n",
+        "import pandas as pd\n",
+        "import numpy as np\n",
+        "from IPython.display import display, HTML\n",
+        "from google.colab import files\n",
+        "\n",
+        "print(\"Initialization complete\")"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "1-COdhW1Thl4",
+        "outputId": "f451fa6a-baa1-456d-81d1-a1b1b52d64e4"
+      },
+      "execution_count": 14,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n",
+            "Google Drive mounted at /content/drive\n",
+            "Using device: cuda\n",
+            "Initialization complete\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# @title 2. Load PIPES-M Model\n",
+        "\n",
+        "from transformers import AutoTokenizer, EsmForSequenceClassification\n",
+        "\n",
+        "MODEL_ID = \"MuthuS97/PIPES-M\"\n",
+        "\n",
+        "print(f\"Loading tokenizer and model from {MODEL_ID}\")\n",
+        "tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)\n",
+        "model = EsmForSequenceClassification.from_pretrained(MODEL_ID)\n",
+        "\n",
+        "model.to(device)\n",
+        "model.eval()\n",
+        "\n",
+        "print(\"Model loaded successfully\")"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "8FgPxVrQT_z_",
+        "outputId": "12fee169-e8d7-49f7-9812-5d7601aafa03"
+      },
+      "execution_count": 15,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Loading tokenizer and model from MuthuS97/PIPES-M\n",
+            "Model loaded successfully\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# @title 3. Upload Multi-FASTA File\n",
+        "\n",
+        "uploaded = files.upload()\n",
+        "\n",
+        "if not uploaded:\n",
+        "    raise ValueError(\"No file uploaded. Please provide a multi-FASTA file.\")\n",
+        "\n",
+        "fasta_filename = list(uploaded.keys())[0]\n",
+        "print(f\"Uploaded file: {fasta_filename}\")\n",
+        "\n",
+        "def parse_fasta(content):\n",
+        "    \"\"\"Parse multi-FASTA text into a DataFrame with header and sequence columns.\"\"\"\n",
+        "    headers = []\n",
+        "    sequences = []\n",
+        "    current_seq = []\n",
+        "    current_header = None\n",
+        "\n",
+        "    for line in content.splitlines():\n",
+        "        line = line.strip()\n",
+        "        if line.startswith(\">\"):\n",
+        "            # A new header closes the previous record, if there is one\n",
+        "            if current_header is not None:\n",
+        "                sequences.append(\"\".join(current_seq))\n",
+        "            current_seq = []\n",
+        "            current_header = line[1:].strip()\n",
+        "            headers.append(current_header)\n",
+        "        elif line and current_header is not None:\n",
+        "            # Sequence line: normalize case and drop internal spaces;\n",
+        "            # text before the first header is ignored\n",
+        "            current_seq.append(line.upper().replace(\" \", \"\"))\n",
+        "\n",
+        "    # Close the final record\n",
+        "    if current_header is not None:\n",
+        "        sequences.append(\"\".join(current_seq))\n",
+        "\n",
+        "    if len(headers) != len(sequences):\n",
+        "        raise ValueError(\"Parsing error: number of headers and sequences do not match\")\n",
+        "\n",
+        "    return pd.DataFrame({\"header\": headers, \"sequence\": sequences})\n",
+        "\n",
+        "with open(fasta_filename, \"r\") as f:\n",
+        "    fasta_content = f.read()\n",
+        "\n",
+        "df = parse_fasta(fasta_content)\n",
+        "print(f\"Loaded {len(df)} sequences\")\n",
+        "\n",
+        "long_seqs = df[df['sequence'].str.len() > MAX_LEN]\n",
+        "if len(long_seqs) > 0:\n",
+        "    print(f\"Warning: {len(long_seqs)} sequences exceed {MAX_LEN} residues and will be truncated\")\n",
+        "\n",
+        "display(df.head())"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 223
+        },
+        "id": "p_AfPGPNUQSU",
+        "outputId": "65cc14f7-943f-4a3c-bb46-47b52d427a74"
+      },
+      "execution_count": 16,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Saving rcsb_pdb_6TME.fasta to rcsb_pdb_6TME.fasta\n",
+            "Uploaded file: rcsb_pdb_6TME.fasta\n",
+            "Loaded 2 sequences\n",
+            "Warning: 1 sequences exceed 250 residues and will be truncated\n"
+          ]
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "                                              header  \\\n",
+              "0  6TME_1|Chains A, B|Pollen-specific leucine-ric...   \n",
+              "1  6TME_2|Chains C, D|Protein RALF-like 4|Arabido...   \n",
+              "\n",
+              "                                            sequence  \n",
+              "0  MELTDEEASFLTRRQLLALSENGDLPDDIEYEVDLDLKFANNRLKR...  \n",
+              "1  ARGRRYIGYDALKKNNVPCSRRGRSYYDCKKRRRNNPYRRGCSAIT...  "
+            ],
+            "text/html": [
+              "\n",
+              "  <div id=\"df-360f9a8b-05ae-47e8-af43-319d0dbf4606\" class=\"colab-df-container\">\n",
+              "    <div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>header</th>\n",
+              "      <th>sequence</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>6TME_1|Chains A, B|Pollen-specific leucine-ric...</td>\n",
+              "      <td>MELTDEEASFLTRRQLLALSENGDLPDDIEYEVDLDLKFANNRLKR...</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>6TME_2|Chains C, D|Protein RALF-like 4|Arabido...</td>\n",
+              "      <td>ARGRRYIGYDALKKNNVPCSRRGRSYYDCKKRRRNNPYRRGCSAIT...</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>\n",
+              "  </div>\n"
+            ],
+            "application/vnd.google.colaboratory.intrinsic+json": {
+              "type": "dataframe",
+              "summary": "{\n  \"name\": \"display(df\",\n  \"rows\": 2,\n  \"fields\": [\n    {\n      \"column\": \"header\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 2,\n        \"samples\": [\n          \"6TME_2|Chains C, D|Protein RALF-like 4|Arabidopsis thaliana (3702)\",\n          \"6TME_1|Chains A, B|Pollen-specific leucine-rich repeat extensin-like protein 1|Arabidopsis thaliana (3702)\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"sequence\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 2,\n        \"samples\": [\n          \"ARGRRYIGYDALKKNNVPCSRRGRSYYDCKKRRRNNPYRRGCSAITHCYR\",\n          \"MELTDEEASFLTRRQLLALSENGDLPDDIEYEVDLDLKFANNRLKRAYIALQAWKKAFYSDPFNTAANWVGPDVCSYKGVFCAPALDDPSVLVVAGIDLNHADIFGYLPPELGLLTDVALFHVNSNRFCGVIPKSLSKLTLMYEFDVSNNRFVGPFPTVALSWPSLKFLDIRYNDFEGKLPPEIFDKDLDAIFLNNNRFESTIPETIGKSTASVVTFAHNKFSGCIPKTIGQMKNLNEIVFIGNNLSGCLPNEIGSLNNVTVFDASSNGFVGSLPSTLSGLANVEQMDFSYNKFTGFVTDNICKLPKLSNFTFSYNFFNGEAQSCVPGSSQEKQFDDTSNCLQNRPNQKSAKECLPVVSRPVDCSKDKCAGG\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}"
+            }
+          },
+          "metadata": {}
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# @title 4. Run Inference\n",
+        "\n",
+        "from torch.utils.data import DataLoader, TensorDataset\n",
+        "\n",
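+        "# Tokenize all sequences up front; padding=True pads to the longest input,\n",
+        "# and inputs longer than MAX_LEN tokens are truncated.\n",
+        "print(\"Tokenizing sequences\")\n",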
+        "sequences = df['sequence'].tolist()\n",
+        "encoded = tokenizer(\n",
+        "    sequences,\n",
+        "    padding=True,\n",
+        "    truncation=True,\n",
+        "    max_length=MAX_LEN,\n",
+        "    return_tensors=\"pt\"\n",
+        ")\n",
+        "\n",
+        "dataset = TensorDataset(encoded['input_ids'], encoded['attention_mask'])\n",
+        "dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=False)\n",
+        "\n",
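+        "# Per-sequence class probabilities and argmax predictions, kept in input order\n",
+        "all_probs = []\n",
+        "all_preds = []\n",
+        "\n",
+        "print(\"Running inference\")\n",
+        "model.eval()  # make sure dropout is disabled (no-op if already in eval mode)\n",
+        "with torch.no_grad():  # gradients are not needed for inference\n",
+        "    for i, batch in enumerate(dataloader):\n",
+        "        input_ids, attention_mask = [b.to(device) for b in batch]\n",
+        "        outputs = model(input_ids=input_ids, attention_mask=attention_mask)\n",
+        "        logits = outputs.logits\n",
+        "        # Softmax over the class dimension gives [P(class 0), P(class 1)] per sequence\n",
+        "        probs = torch.softmax(logits, dim=1).cpu().numpy()\n",
+        "        preds = np.argmax(probs, axis=1)\n",
+        "        all_probs.extend(probs)\n",
+        "        all_preds.extend(preds)\n",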
+        "\n",
+        "        # Report progress every 10 batches and after the final batch\n",
+        "        if (i + 1) % 10 == 0 or (i + 1) == len(dataloader):\n",
+        "            processed = min((i + 1) * BATCH_SIZE, len(sequences))\n",
+        "            print(f\"Processed {processed} of {len(sequences)} sequences\")\n",
+        "\n",
+        "print(\"Inference completed\")"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "nwHd1DRVUn_e",
+        "outputId": "96ebfb56-ae1c-4254-8476-c0814b924b13"
+      },
+      "execution_count": 17,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Tokenizing sequences\n",
+            "Running inference\n",
+            "Processed 2 of 2 sequences\n",
+            "Inference completed\n"
+          ]
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# @title 5. Results and Download\n",
+        "\n",
+        "# Confidence is the probability assigned to the predicted (argmax) class\n",
+        "confidence = [p[pred] for p, pred in zip(all_probs, all_preds)]\n",
+        "df['predicted_class_id'] = all_preds\n",
+        "df['confidence'] = confidence\n",
+        "df['prob_class_0'] = [p[0] for p in all_probs]\n",
+        "df['prob_class_1'] = [p[1] for p in all_probs]\n",
+        "\n",
+        "# Map numeric class ids to human-readable labels\n",
+        "df['predicted_class'] = df['predicted_class_id'].map({\n",
+        "    0: \"Negative (Non-PI)\",\n",
+        "    1: \"Positive (Potential PI)\"\n",
+        "})\n",
+        "\n",
+        "display(HTML(\"<h3>Prediction Results (first 10 sequences)</h3>\"))\n",
+        "display(df[['header', 'predicted_class', 'confidence', 'prob_class_1']].head(10))\n",
+        "\n",
+        "print(\"\\nClass distribution\")\n",
+        "counts = df['predicted_class'].value_counts()\n",
+        "for label, count in counts.items():\n",
+        "    percentage = count / len(df) * 100\n",
+        "    print(f\"{label}: {count} sequences ({percentage:.1f}%)\")\n",
+        "\n",
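+        "# Save the full results table (all columns) in the Colab working directory\n",
+        "output_csv = \"PIPES-M_predictions.csv\"\n",
+        "df.to_csv(output_csv, index=False)\n",
+        "\n",
+        "# Also write a copy to Google Drive if it was mounted in an earlier step\n",
+        "if mount_drive:\n",
+        "    drive_path = \"/content/drive/MyDrive/PIPES-M_predictions.csv\"\n",
+        "    df.to_csv(drive_path, index=False)\n",
+        "    print(f\"\\nResults also saved to Google Drive: {drive_path}\")\n",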
+        "\n",
+        "print(f\"\\nResults saved as {output_csv}\")\n",
+        "files.download(output_csv)"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 278
+        },
+        "id": "A3fPg8TaUu2k",
+        "outputId": "bdd02de6-60a6-4236-d09b-e7af9319fc8e"
+      },
+      "execution_count": 18,
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<IPython.core.display.HTML object>"
+            ],
+            "text/html": [
+              "<h3>Prediction Results (first 10 sequences)</h3>"
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "                                              header          predicted_class  \\\n",
+              "0  6TME_1|Chains A, B|Pollen-specific leucine-ric...  Positive (Potential PI)   \n",
+              "1  6TME_2|Chains C, D|Protein RALF-like 4|Arabido...  Positive (Potential PI)   \n",
+              "\n",
+              "   confidence  prob_class_1  \n",
+              "0    0.947041      0.947041  \n",
+              "1    0.965963      0.965963  "
+            ],
+            "text/html": [
+              "\n",
+              "  <div id=\"df-10af0c1e-3834-4264-8a23-78bb419eb305\" class=\"colab-df-container\">\n",
+              "    <div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>header</th>\n",
+              "      <th>predicted_class</th>\n",
+              "      <th>confidence</th>\n",
+              "      <th>prob_class_1</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>6TME_1|Chains A, B|Pollen-specific leucine-ric...</td>\n",
+              "      <td>Positive (Potential PI)</td>\n",
+              "      <td>0.947041</td>\n",
+              "      <td>0.947041</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>6TME_2|Chains C, D|Protein RALF-like 4|Arabido...</td>\n",
+              "      <td>Positive (Potential PI)</td>\n",
+              "      <td>0.965963</td>\n",
+              "      <td>0.965963</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>\n",
+              "    <div class=\"colab-df-buttons\">\n",
+              "\n",
+              "  <div class=\"colab-df-container\">\n",
+              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-10af0c1e-3834-4264-8a23-78bb419eb305')\"\n",
+              "            title=\"Convert this dataframe to an interactive table.\"\n",
+              "            style=\"display:none;\">\n",
+              "\n",
+              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
+              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
+              "  </svg>\n",
+              "    </button>\n",
+              "\n",
+              "  <style>\n",
+              "    .colab-df-container {\n",
+              "      display:flex;\n",
+              "      gap: 12px;\n",
+              "    }\n",
+              "\n",
+              "    .colab-df-convert {\n",
+              "      background-color: #E8F0FE;\n",
+              "      border: none;\n",
+              "      border-radius: 50%;\n",
+              "      cursor: pointer;\n",
+              "      display: none;\n",
+              "      fill: #1967D2;\n",
+              "      height: 32px;\n",
+              "      padding: 0 0 0 0;\n",
+              "      width: 32px;\n",
+              "    }\n",
+              "\n",
+              "    .colab-df-convert:hover {\n",
+              "      background-color: #E2EBFA;\n",
+              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
+              "      fill: #174EA6;\n",
+              "    }\n",
+              "\n",
+              "    .colab-df-buttons div {\n",
+              "      margin-bottom: 4px;\n",
+              "    }\n",
+              "\n",
+              "    [theme=dark] .colab-df-convert {\n",
+              "      background-color: #3B4455;\n",
+              "      fill: #D2E3FC;\n",
+              "    }\n",
+              "\n",
+              "    [theme=dark] .colab-df-convert:hover {\n",
+              "      background-color: #434B5C;\n",
+              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
+              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
+              "      fill: #FFFFFF;\n",
+              "    }\n",
+              "  </style>\n",
+              "\n",
+              "    <script>\n",
+              "      const buttonEl =\n",
+              "        document.querySelector('#df-10af0c1e-3834-4264-8a23-78bb419eb305 button.colab-df-convert');\n",
+              "      buttonEl.style.display =\n",
+              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
+              "\n",
+              "      async function convertToInteractive(key) {\n",
+              "        const element = document.querySelector('#df-10af0c1e-3834-4264-8a23-78bb419eb305');\n",
+              "        const dataTable =\n",
+              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
+              "                                                    [key], {});\n",
+              "        if (!dataTable) return;\n",
+              "\n",
+              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
+              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
+              "          + ' to learn more about interactive tables.';\n",
+              "        element.innerHTML = '';\n",
+              "        dataTable['output_type'] = 'display_data';\n",
+              "        await google.colab.output.renderOutput(dataTable, element);\n",
+              "        const docLink = document.createElement('div');\n",
+              "        docLink.innerHTML = docLinkHtml;\n",
+              "        element.appendChild(docLink);\n",
+              "      }\n",
+              "    </script>\n",
+              "  </div>\n",
+              "\n",
+              "\n",
+              "    <div id=\"df-eb297cd7-50c1-4731-af63-06b835a7286f\">\n",
+              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-eb297cd7-50c1-4731-af63-06b835a7286f')\"\n",
+              "                title=\"Suggest charts\"\n",
+              "                style=\"display:none;\">\n",
+              "\n",
+              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
+              "     width=\"24px\">\n",
+              "    <g>\n",
+              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
+              "    </g>\n",
+              "</svg>\n",
+              "      </button>\n",
+              "\n",
+              "<style>\n",
+              "  .colab-df-quickchart {\n",
+              "      --bg-color: #E8F0FE;\n",
+              "      --fill-color: #1967D2;\n",
+              "      --hover-bg-color: #E2EBFA;\n",
+              "      --hover-fill-color: #174EA6;\n",
+              "      --disabled-fill-color: #AAA;\n",
+              "      --disabled-bg-color: #DDD;\n",
+              "  }\n",
+              "\n",
+              "  [theme=dark] .colab-df-quickchart {\n",
+              "      --bg-color: #3B4455;\n",
+              "      --fill-color: #D2E3FC;\n",
+              "      --hover-bg-color: #434B5C;\n",
+              "      --hover-fill-color: #FFFFFF;\n",
+              "      --disabled-bg-color: #3B4455;\n",
+              "      --disabled-fill-color: #666;\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-quickchart {\n",
+              "    background-color: var(--bg-color);\n",
+              "    border: none;\n",
+              "    border-radius: 50%;\n",
+              "    cursor: pointer;\n",
+              "    display: none;\n",
+              "    fill: var(--fill-color);\n",
+              "    height: 32px;\n",
+              "    padding: 0;\n",
+              "    width: 32px;\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-quickchart:hover {\n",
+              "    background-color: var(--hover-bg-color);\n",
+              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
+              "    fill: var(--button-hover-fill-color);\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-quickchart-complete:disabled,\n",
+              "  .colab-df-quickchart-complete:disabled:hover {\n",
+              "    background-color: var(--disabled-bg-color);\n",
+              "    fill: var(--disabled-fill-color);\n",
+              "    box-shadow: none;\n",
+              "  }\n",
+              "\n",
+              "  .colab-df-spinner {\n",
+              "    border: 2px solid var(--fill-color);\n",
+              "    border-color: transparent;\n",
+              "    border-bottom-color: var(--fill-color);\n",
+              "    animation:\n",
+              "      spin 1s steps(1) infinite;\n",
+              "  }\n",
+              "\n",
+              "  @keyframes spin {\n",
+              "    0% {\n",
+              "      border-color: transparent;\n",
+              "      border-bottom-color: var(--fill-color);\n",
+              "      border-left-color: var(--fill-color);\n",
+              "    }\n",
+              "    20% {\n",
+              "      border-color: transparent;\n",
+              "      border-left-color: var(--fill-color);\n",
+              "      border-top-color: var(--fill-color);\n",
+              "    }\n",
+              "    30% {\n",
+              "      border-color: transparent;\n",
+              "      border-left-color: var(--fill-color);\n",
+              "      border-top-color: var(--fill-color);\n",
+              "      border-right-color: var(--fill-color);\n",
+              "    }\n",
+              "    40% {\n",
+              "      border-color: transparent;\n",
+              "      border-right-color: var(--fill-color);\n",
+              "      border-top-color: var(--fill-color);\n",
+              "    }\n",
+              "    60% {\n",
+              "      border-color: transparent;\n",
+              "      border-right-color: var(--fill-color);\n",
+              "    }\n",
+              "    80% {\n",
+              "      border-color: transparent;\n",
+              "      border-right-color: var(--fill-color);\n",
+              "      border-bottom-color: var(--fill-color);\n",
+              "    }\n",
+              "    90% {\n",
+              "      border-color: transparent;\n",
+              "      border-bottom-color: var(--fill-color);\n",
+              "    }\n",
+              "  }\n",
+              "</style>\n",
+              "\n",
+              "      <script>\n",
+              "        async function quickchart(key) {\n",
+              "          const quickchartButtonEl =\n",
+              "            document.querySelector('#' + key + ' button');\n",
+              "          quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
+              "          quickchartButtonEl.classList.add('colab-df-spinner');\n",
+              "          try {\n",
+              "            const charts = await google.colab.kernel.invokeFunction(\n",
+              "                'suggestCharts', [key], {});\n",
+              "          } catch (error) {\n",
+              "            console.error('Error during call to suggestCharts:', error);\n",
+              "          }\n",
+              "          quickchartButtonEl.classList.remove('colab-df-spinner');\n",
+              "          quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
+              "        }\n",
+              "        (() => {\n",
+              "          let quickchartButtonEl =\n",
+              "            document.querySelector('#df-eb297cd7-50c1-4731-af63-06b835a7286f button');\n",
+              "          quickchartButtonEl.style.display =\n",
+              "            google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
+              "        })();\n",
+              "      </script>\n",
+              "    </div>\n",
+              "\n",
+              "    </div>\n",
+              "  </div>\n"
+            ],
+            "application/vnd.google.colaboratory.intrinsic+json": {
+              "type": "dataframe",
+              "summary": "{\n  \"name\": \"files\",\n  \"rows\": 2,\n  \"fields\": [\n    {\n      \"column\": \"header\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 2,\n        \"samples\": [\n          \"6TME_2|Chains C, D|Protein RALF-like 4|Arabidopsis thaliana (3702)\",\n          \"6TME_1|Chains A, B|Pollen-specific leucine-rich repeat extensin-like protein 1|Arabidopsis thaliana (3702)\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"predicted_class\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 1,\n        \"samples\": [\n          \"Positive (Potential PI)\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"confidence\",\n      \"properties\": {\n        \"dtype\": \"float32\",\n        \"num_unique_values\": 2,\n        \"samples\": [\n          0.9659631848335266\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"prob_class_1\",\n      \"properties\": {\n        \"dtype\": \"float32\",\n        \"num_unique_values\": 2,\n        \"samples\": [\n          0.9659631848335266\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "\n",
+            "Class distribution\n",
+            "Positive (Potential PI): 2 sequences (100.0%)\n",
+            "\n",
+            "Results also saved to Google Drive: /content/drive/MyDrive/PIPES-M_predictions.csv\n",
+            "\n",
+            "Results saved as PIPES-M_predictions.csv\n"
+          ]
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<IPython.core.display.Javascript object>"
+            ],
+            "application/javascript": [
+              "\n",
+              "    async function download(id, filename, size) {\n",
+              "      if (!google.colab.kernel.accessAllowed) {\n",
+              "        return;\n",
+              "      }\n",
+              "      const div = document.createElement('div');\n",
+              "      const label = document.createElement('label');\n",
+              "      label.textContent = `Downloading \"${filename}\": `;\n",
+              "      div.appendChild(label);\n",
+              "      const progress = document.createElement('progress');\n",
+              "      progress.max = size;\n",
+              "      div.appendChild(progress);\n",
+              "      document.body.appendChild(div);\n",
+              "\n",
+              "      const buffers = [];\n",
+              "      let downloaded = 0;\n",
+              "\n",
+              "      const channel = await google.colab.kernel.comms.open(id);\n",
+              "      // Send a message to notify the kernel that we're ready.\n",
+              "      channel.send({})\n",
+              "\n",
+              "      for await (const message of channel.messages) {\n",
+              "        // Send a message to notify the kernel that we're ready.\n",
+              "        channel.send({})\n",
+              "        if (message.buffers) {\n",
+              "          for (const buffer of message.buffers) {\n",
+              "            buffers.push(buffer);\n",
+              "            downloaded += buffer.byteLength;\n",
+              "            progress.value = downloaded;\n",
+              "          }\n",
+              "        }\n",
+              "      }\n",
+              "      const blob = new Blob(buffers, {type: 'application/binary'});\n",
+              "      const a = document.createElement('a');\n",
+              "      a.href = window.URL.createObjectURL(blob);\n",
+              "      a.download = filename;\n",
+              "      div.appendChild(a);\n",
+              "      a.click();\n",
+              "      div.remove();\n",
+              "    }\n",
+              "  "
+            ]
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "<IPython.core.display.Javascript object>"
+            ],
+            "application/javascript": [
+              "download(\"download_b408fcdf-a1a5-4daf-973f-d965a8b95af4\", \"PIPES-M_predictions.csv\", 807)"
+            ]
+          },
+          "metadata": {}
+        }
+      ]
+    }
+  ]
+}