
	---
	license: mit
	language:
	- en
	base_model:
	- facebook/esm2_t33_650M_UR50D
	tags:
	- protein-classification
	- bioinformatics
	- anticancer
	- esm2
	- transformers
	- torch
	---

	# ANTICP3: Anticancer Protein Prediction

	This model is a fine-tuned version of [`facebook/esm2_t33_650M_UR50D`](https://huggingface.co/facebook/esm2_t33_650M_UR50D) for binary classification of proteins as anticancer (ACP) or non-anticancer from their primary sequence.

	> Developed by: [G. P. S. Raghava Lab, IIIT-Delhi](https://webs.iiitd.edu.in/raghava/)
	>
	> Model hosted by: [Dr. GPS Raghava's Group](https://huggingface.co/raghavagps-group/anticp3)

	---

	## Model Details

	| Feature | Description |
	|---------------|--------------------------------------------------------------|
	| Base Model | [`facebook/esm2_t33_650M_UR50D`](https://huggingface.co/facebook/esm2_t33_650M_UR50D) |
	| Fine-tuned On | Anticancer Protein Dataset |
	| Model Type | Binary Classification |
	| Labels | `0`: Non-Anticancer<br>`1`: Anticancer |
	| Framework | [Transformers](https://huggingface.co/docs/transformers) + PyTorch |
	| Format | `safetensors` |
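
The label convention in the table can be made concrete with a small sketch. It assumes the classifier emits two logits per sequence, ordered as `[Non-Anticancer, Anticancer]` to match the label ids above. The helper name `decode_logits`, the 0.5 threshold, and the example logits are illustrative assumptions, not part of the released model.

```python
import math

# Label ids as listed in the Model Details table.
ID2LABEL = {0: "Non-Anticancer", 1: "Anticancer"}


def decode_logits(logits, threshold=0.5):
    """Hypothetical helper: map two raw logits to (label, P(Anticancer))."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per label")
    # Numerically stable softmax over the two classes.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    p_anticancer = exps[1] / sum(exps)
    # The threshold is an assumed default; 0.5 corresponds to picking the larger logit.
    label_id = 1 if p_anticancer >= threshold else 0
    return ID2LABEL[label_id], p_anticancer


# Made-up logits, for illustration only.
label, prob = decode_logits([-1.2, 2.3])
print(label, round(prob, 3))
```

Raising `threshold` trades recall for precision when only high-confidence anticancer calls are wanted. Any such cutoff would need to be validated on held-out data.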

	---

	## Usage

	Use this model with the Hugging Face `transformers` library:

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	# Load tokenizer and fine-tuned model
	tokenizer = AutoTokenizer.from_pretrained("raghavagps-group/anticp3")
	model = AutoModelForSequenceClassification.from_pretrained("raghavagps-group/anticp3")

	# Example protein sequence
	sequence = "MANCVVGYIGERCQYRDLKWWELRGGGGSGGGGSAPAFSVSPASGLSDGQSVSVSVSGAAAGETYYIAQCAPVGGQDACNPATATSFTTDASGAASFSFVVRKSYTGSTPEGTPVGSVDCATAACNLGAGNSGLDLGHVALTFGGGGGSGGGGSDHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCKLEHHHHHH"

	# Tokenize and run inference
	inputs = tokenizer(sequence, return_tensors="pt", truncation=True)

	# Disable gradient tracking for inference
	with torch.no_grad():
	    logits = model(**inputs).logits
	    probs = torch.nn.functional.softmax(logits, dim=-1)
	    prediction = torch.argmax(probs, dim=-1).item()

	labels = {0: "Non-Anticancer", 1: "Anticancer"}
	print("Prediction:", labels[prediction])