MuthuS97 committed on
Commit
b406515
·
verified ·
1 Parent(s): b76b425

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -6,6 +6,23 @@ base_model:
6
  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)]
7
  (https://colab.research.google.com/github/MuthuS97/PIPES-M/blob/main/PIPES-M.ipynb)
8
 
9
  ---
10
  license: creativeml-openrail-m
11
  ---
 
6
  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)]
7
  (https://colab.research.google.com/github/MuthuS97/PIPES-M/blob/main/PIPES-M.ipynb)
8
 
9
+
10
+ **PIPES-M** is a deep learning-based binary classifier that predicts protease inhibitor (PI) activity from primary protein sequences.
11
+
12
+
13
+ PIPES-M is a fine-tuned sequence classification model built on the **ESM-2** protein language model:
14
+ - Base model: `facebook/esm2_t30_150M_UR50D` (150 million parameters, 30 layers)
15
+ - Pre-trained on UniRef50 via masked language modeling
16
+
17
+ Fine-tuning was performed on a high-quality curated dataset comprising:
18
+ - Positive examples: known protease inhibitors (<250 AA) from the MEROPS database
19
+ - Negative examples: non-inhibitors selected from UniProt using sequence similarity and Pfam domain analysis
20
+
21
+ Training used sequence-only input, requiring no structural data. The classification head leverages evolutionary and physicochemical features encoded by ESM-2.
22
+
23
+ The maximum sequence length is fixed at 250 residues. Longer sequences are truncated from the N-terminus; this limit suits the typical size range of small secreted inhibitors.
24
+
25
+
26
  ---
27
  license: creativeml-openrail-m
28
  ---
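
The added README text fixes the input length at 250 residues but does not spell out which end of a longer sequence is kept. The sketch below shows one way the length limit could be applied before tokenization. `prepare_sequence`, `MAX_LEN`, and the `keep` parameter are hypothetical names for illustration, not part of PIPES-M, and the default of keeping the N-terminal residues is an assumption rather than the author's confirmed behavior.

```python
MAX_LEN = 250  # residue limit stated in the README


def prepare_sequence(seq: str, max_len: int = MAX_LEN, keep: str = "n_terminal") -> str:
    """Normalize a protein sequence and trim it to at most max_len residues.

    keep="n_terminal" keeps the first max_len residues (assumed default);
    keep="c_terminal" keeps the last max_len residues instead.
    """
    if keep not in ("n_terminal", "c_terminal"):
        raise ValueError("keep must be 'n_terminal' or 'c_terminal'")

    # Remove whitespace and line breaks (e.g. from FASTA bodies), then uppercase.
    cleaned = "".join(seq.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")

    if len(cleaned) <= max_len:
        return cleaned
    if keep == "n_terminal":
        return cleaned[:max_len]
    return cleaned[-max_len:]


if __name__ == "__main__":
    example = "M" + "A" * 299  # 300-residue toy sequence
    trimmed = prepare_sequence(example)
    print(len(trimmed), trimmed[:5])
```

ESM-2 tokenizers typically add start and end tokens around the residues, so a tokenizer-level `max_length` may need to be slightly larger than the residue limit used here.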