MuthuS97 / PIPES-M


	---
	license: creativeml-openrail-m
	base_model:
	- facebook/esm2_t30_150M_UR50D
	---
	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1OoX9zDwdSD88UGXxlFctnq_UPcnkkWdp?usp=sharing)


	PIPES-M is a deep learning-based binary classifier that predicts protease inhibitor (PI) activity from primary protein sequences.


	PIPES-M is a fine-tuned sequence classification model built on the ESM-2 protein language model (`EsmForSequenceClassification`):
	- Base model: `facebook/esm2_t30_150M_UR50D` (150 million parameters, 30 layers)
	- Pre-trained on UniRef50 via masked language modeling

	Fine-tuning was performed on a high-quality curated dataset comprising:
	- Positive examples: known protease inhibitors (<250 AA) from the MEROPS and UniProt databases
	- Negative examples: non-inhibitors selected from UniProt using sequence similarity and Pfam domain analysis

	Training used sequence-only input, requiring no structural data. The classification head leverages evolutionary and physicochemical features encoded by ESM-2.
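Because the model takes sequence-only input, inference can be sketched with the standard `transformers` classes. This is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes the `MuthuS97/PIPES-M` repository loads with `EsmForSequenceClassification.from_pretrained` and that the base-model tokenizer matches the one used for fine-tuning. The helper name `predict_logits` is hypothetical, and the sketch does not apply the 250-residue limit.

```python
MODEL_ID = "MuthuS97/PIPES-M"                  # repository named on this page
TOKENIZER_ID = "facebook/esm2_t30_150M_UR50D"  # assumption: base-model tokenizer


def predict_logits(sequences, model_id=MODEL_ID, tokenizer_id=TOKENIZER_ID):
    """Return one list of raw class logits per input sequence (hypothetical helper)."""
    # Imported here so the heavy dependencies load only when inference runs.
    import torch
    from transformers import AutoTokenizer, EsmForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = EsmForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Sequence-only input: plain one-letter amino acid strings, padded per batch.
    batch = tokenizer(list(sequences), padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (batch, num_labels)
    return logits.tolist()
```

Calling `predict_logits(["MKTLLVAGAL"])` downloads the weights on first use. Which output index corresponds to the inhibitor class is not stated in this card, so check `model.config.id2label` before interpreting the scores.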

	Maximum sequence length is fixed at 250 residues. Longer sequences are truncated to their first 250 residues, counted from the N-terminus, which suits the typical size range of small secreted inhibitors.
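The truncation rule can be illustrated with a short, dependency-free sketch. The function name `truncate_to_limit` and the whitespace and case cleanup are assumptions made for illustration. The card also does not say whether the limit counts the tokenizer's special tokens, so this sketch counts residues only.

```python
MAX_RESIDUES = 250  # limit stated in this model card


def truncate_to_limit(sequence: str, max_residues: int = MAX_RESIDUES) -> str:
    """Keep the first `max_residues` residues, counted from the N-terminus."""
    # Assumption: remove whitespace and upper-case the input before truncating;
    # the card does not describe any input cleanup.
    cleaned = "".join(sequence.split()).upper()
    return cleaned[:max_residues]


short = "MKTLLVAGAL"
long_seq = "A" * 300 + "C" * 10

print(len(truncate_to_limit(short)))     # 10: shorter sequences pass through unchanged
print(len(truncate_to_limit(long_seq)))  # 250: only the N-terminal residues remain
```

Any C-terminal region beyond residue 250 is discarded, so predictions for longer proteins reflect only their N-terminal portion.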

