Update README.md
README.md
@@ -25,7 +25,12 @@ ProtBERT-PI is a fine-tuned sequence classification model built on ProtBERT (Ber
Base model: Rostlab/prot_bert

Pre-trained on large corpora of protein sequences using masked language modeling.
Fine-tuning was performed on a curated dataset of known protease inhibitors and a negative set of non-inhibitor proteins.
Sequences are tokenized by inserting spaces between amino acids (standard for ProtBERT), enabling effective representation learning.
Maximum sequence length is configurable (default: 250 AA); longer sequences are truncated.
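The two preprocessing rules above (space-separated residues, truncation at 250 AA) can be sketched in plain Python; the helper name and example sequence are illustrative, not part of the released code.

```python
def preprocess_sequence(seq: str, max_len: int = 250) -> str:
    """Prepare a raw amino-acid string for ProtBERT-style tokenization.

    ProtBERT expects single-letter residues separated by spaces;
    sequences longer than max_len residues are truncated first.
    """
    seq = seq[:max_len]      # enforce the maximum sequence length
    return " ".join(seq)     # "MKVL" -> "M K V L"

print(preprocess_sequence("MKVLAAGIVTR"))  # M K V L A A G I V T R
```

The spaced string can then be fed to the ProtBERT tokenizer (e.g. `BertTokenizer.from_pretrained("Rostlab/prot_bert")`) as usual.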
Positive examples: known protease inhibitors (<250 AA) from the MEROPS database
Negative examples: non-inhibitors selected from UniProt using sequence similarity and Pfam domain analysis
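As a minimal sketch of how the positive/negative split above could be assembled into a labeled fine-tuning set (the toy sequences and the 0/1 label convention are assumptions, not the actual curated data):

```python
# Toy records standing in for already-retrieved sequences:
positives = ["MKVLAAGIVTR"]   # e.g. inhibitor entries from MEROPS
negatives = ["GAVLIPFWMST"]   # e.g. filtered non-inhibitors from UniProt

# Keep only sequences under 250 AA and attach binary labels
# (1 = protease inhibitor, 0 = non-inhibitor).
dataset = [(seq, 1) for seq in positives if len(seq) < 250] \
        + [(seq, 0) for seq in negatives if len(seq) < 250]

print(dataset)  # [('MKVLAAGIVTR', 1), ('GAVLIPFWMST', 0)]
```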
---
license: creativeml-openrail-m