PIPES-M is a deep learning-based binary classifier that predicts protease inhibitor (PI) activity from primary protein sequences.

PIPES-M is a fine-tuned sequence classification model built on the ESM-2 protein language model (EsmForSequenceClassification):

Base model: facebook/esm2_t30_150M_UR50D (150 million parameters, 30 layers)
Pre-trained on UniRef50 via masked language modeling
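
A minimal inference sketch follows. It rests on assumptions that should be checked against the repository's config.json: that the checkpoint ships a tokenizer and can be loaded under the repository name MuthuS97/PIPES-M, that the head emits two logits, and that index 1 is the inhibitor class. The helper names softmax and predict_pi_probability are hypothetical and not part of the model.

```python
import math

MODEL_ID = "MuthuS97/PIPES-M"  # repository name as listed on the model page
MAX_RESIDUES = 250             # maximum sequence length stated in this card


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_pi_probability(sequence, model_id=MODEL_ID, positive_index=1):
    """Return the predicted probability that `sequence` is a protease inhibitor.

    Assumption: two-logit head with `positive_index` as the inhibitor class.
    """
    # Heavy imports stay local so the helper above works without torch installed.
    import torch
    from transformers import AutoTokenizer, EsmForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = EsmForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Keep only the first 250 residues, matching the limit described in this card.
    inputs = tokenizer(sequence[:MAX_RESIDUES], return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return softmax(logits)[positive_index]
```

Calling predict_pi_probability on a sequence string downloads the weights on first use. For many sequences, loading the tokenizer and model once outside the function would be the more sensible design.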

Fine-tuning was performed on a high-quality curated dataset comprising:

Positive examples: known protease inhibitors (<250 AA) from the MEROPS and UniProt databases
Negative examples: non-inhibitors selected from UniProt using sequence similarity and Pfam domain analysis

Training used sequence-only input, requiring no structural data. The classification head leverages evolutionary and physicochemical features encoded by ESM-2.

Maximum sequence length is fixed at 250 residues; longer sequences are truncated to their first 250 residues, counted from the N-terminus. This limit suits the typical size range of small secreted inhibitors.
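
The truncation rule can be sketched as a small preprocessing helper. The function name truncate_to_n_terminus and the whitespace and case normalization are illustrative assumptions, not the authors' pipeline. Whether the 250-residue limit was applied before or after the tokenizer added its special tokens is also not stated here.

```python
MAX_RESIDUES = 250  # limit stated in this card


def truncate_to_n_terminus(sequence, max_residues=MAX_RESIDUES):
    """Keep the first `max_residues` residues, counted from the N-terminus.

    Whitespace and line breaks (as found in FASTA bodies) are removed and the
    sequence is upper-cased before slicing; both steps are assumptions.
    """
    cleaned = "".join(sequence.split()).upper()
    return cleaned[:max_residues]
```

Sequences of 250 residues or fewer pass through unchanged apart from the normalization; for longer ones, the C-terminal remainder is discarded.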

license: creativeml-openrail-m

Weights: Safetensors; model size: 0.1B params; tensor type: F32

Model tree for MuthuS97/PIPES-M: base model facebook/esm2_t30_150M_UR50D, fine-tuned to produce this model