# MTL Peptide Classifier (19 Tasks)

## Model Description

A Multi-Task Learning (MTL) peptide classifier trained jointly on 19 UniDL4BioPep peptide activity datasets, using a PDeepPP-inspired architecture with a frozen ESM-2 backbone.
## Performance

Averaged across the 19 tasks:

- Accuracy: 89.49%
- AUC: 94.15%
- PR-AUC: 92.85%
- MCC: 78.88%
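For reference, per-task metrics of this kind are typically computed with scikit-learn (listed as a dependency below). A minimal sketch on toy labels, not the repo's actual evaluation script:

```python
from sklearn.metrics import (
    accuracy_score,
    roc_auc_score,
    average_precision_score,
    matthews_corrcoef,
)

# Toy labels and predicted probabilities for one binary task
y_true = [0, 1, 1, 0, 1]
y_prob = [0.2, 0.8, 0.6, 0.4, 0.9]
y_pred = [int(p >= 0.5) for p in y_prob]

acc = accuracy_score(y_true, y_pred)          # accuracy on thresholded predictions
auc = roc_auc_score(y_true, y_prob)           # ROC AUC on raw probabilities
pr_auc = average_precision_score(y_true, y_prob)  # PR-AUC (average precision)
mcc = matthews_corrcoef(y_true, y_pred)       # Matthews correlation coefficient
```

Each metric is computed per task and then averaged across the 19 tasks to give the numbers above.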
## Architecture

- Shared Encoder: frozen ESM-2 (650M params) + learnable base embedding
- Feature Extraction: 4-layer Transformer + CNN (parallel branches)
- Task Heads: 19 binary classifiers (one per peptide activity)
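The layout above can be sketched in plain PyTorch. Everything here (the class name `TinyMTLSketch`, the tiny dimensions, mean pooling, and feature concatenation) is illustrative and is not the repo's actual `MTLPeptideClassifier`:

```python
import torch
import torch.nn as nn

class TinyMTLSketch(nn.Module):
    def __init__(self, embed_dim=32, num_tasks=3, num_layers=2):
        super().__init__()
        # Stand-in for the frozen ESM-2 backbone (weights never updated)
        self.backbone = nn.Embedding(33, embed_dim)
        self.backbone.weight.requires_grad = False
        # Learnable base embedding blended with the frozen one
        self.base_embed = nn.Embedding(33, embed_dim)
        # Parallel feature extractors: Transformer branch and CNN branch
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers)
        self.cnn = nn.Conv1d(embed_dim, embed_dim, kernel_size=3, padding=1)
        # One binary classification head per task
        self.heads = nn.ModuleDict(
            {f"task_{i}": nn.Linear(2 * embed_dim, 1) for i in range(num_tasks)}
        )

    def forward(self, tokens, esm_ratio=0.9):
        # Blend frozen and learnable embeddings (esm_ratio mirrors the card's 0.9)
        x = esm_ratio * self.backbone(tokens) + (1 - esm_ratio) * self.base_embed(tokens)
        t_feat = self.transformer(x).mean(dim=1)          # (batch, dim)
        c_feat = self.cnn(x.transpose(1, 2)).mean(dim=2)  # (batch, dim)
        shared = torch.cat([t_feat, c_feat], dim=-1)      # shared representation
        return {name: head(shared).squeeze(-1) for name, head in self.heads.items()}

model = TinyMTLSketch()
logits = model(torch.randint(0, 33, (2, 10)))  # one logit per task per sequence
```

The real model follows the same shape: only the heads and the small learnable pieces of the encoder carry gradients, while the ESM-2 backbone stays frozen.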
## The 19 Peptide Activity Tasks
- `ACE_inhibitory`: ACE inhibitory activity
- `DPPIV_inhibitory`: DPPIV inhibitory activity
- `Bitter`: Bitter taste peptides
- `Umami`: Umami taste peptides
- `Antimicrobial`: Antimicrobial activity
- `Antimalarial`: Antimalarial activity (main)
- `Antimalarial_alt`: Antimalarial activity (alternative)
- `Quorum_sensing`: Quorum sensing activity
- `Anticancer`: Anticancer activity (main)
- `Anticancer_alt`: Anticancer activity (alternative)
- `AntiMRSA`: Anti-MRSA strains activity
- `TTCA`: Therapeutic peptides for cancer
- `BBP`: Blood-brain barrier peptides
- `Anti_parasitic`: Anti-parasitic peptides
- `NeuroPred`: Neuroprotective peptides
- `Antibacterial`: Antibacterial peptides
- `Antifungal`: Antifungal peptides
- `Antiviral`: Antiviral peptides
- `Toxicity`: Toxicity prediction
## Usage

```python
import os

import torch
from huggingface_hub import hf_hub_download

# Download model files
checkpoint_dir = "MTL-Peptide-Classifier-19tasks"
os.makedirs(checkpoint_dir, exist_ok=True)
for file in ["checkpoint.pt", "heads.pt", "shared_backbone.pt"]:
    hf_hub_download(
        repo_id="tuankg1028/MTL-Peptide-Classifier",
        filename=file,
        local_dir=checkpoint_dir,
    )

# Load model (requires mtl_peptide_classifier.py)
from mtl_peptide_classifier import MTLPeptideClassifier, get_all_peptide_tasks
from transformers import EsmTokenizer

tokenizer = EsmTokenizer.from_pretrained("facebook/esm2_t33_650M_UR50D")

task_configs = get_all_peptide_tasks("datasets")
model = MTLPeptideClassifier(
    task_configs=task_configs,
    hidden_dim=1280,
    esm_ratio=0.9,
    num_transformer_layers=4,
    dropout=0.3,
)

# Load checkpoint weights for the shared backbone and per-task heads
device = "cuda" if torch.cuda.is_available() else "cpu"
backbone = torch.load(f"{checkpoint_dir}/shared_backbone.pt", map_location=device)
heads = torch.load(f"{checkpoint_dir}/heads.pt", map_location=device)

model.base_embed.load_state_dict(backbone["base_embed"])
model.transformer.load_state_dict(backbone["transformer"])
model.cnn.load_state_dict(backbone["cnn"])
model.layer_norm.load_state_dict(backbone["layer_norm"])
for name, head in model.heads.items():
    if name in heads:
        head.load_state_dict(heads[name])

model = model.to(device)
model.eval()
```
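Once the weights are loaded, each task head emits one logit per input sequence. A minimal sketch of turning hypothetical per-task logits into probabilities and binary calls (the logit values and the 0.5 threshold here are illustrative, not outputs of the real model):

```python
import torch

# Hypothetical per-task logits, standing in for a multi-task forward pass
logits = {"Bitter": torch.tensor([1.2]), "Umami": torch.tensor([-0.7])}

# Sigmoid maps each binary-classification logit to a probability
probs = {task: torch.sigmoid(v).item() for task, v in logits.items()}

# Threshold at 0.5 to get a positive/negative call per task
preds = {task: p >= 0.5 for task, p in probs.items()}
```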
## Requirements

```
torch>=2.0.0
transformers>=4.30.0
esm
numpy
pandas
scikit-learn
```
## Training Details
- Base Model: facebook/esm2_t33_650M_UR50D (frozen)
- Training Datasets: UniDL4BioPep benchmark
- Batch Size: 16
- Learning Rate: 1e-4
- Epochs: 50
- Dropout: 0.3
- Mixed Precision: Enabled
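As a rough illustration of the hyperparameters above (batch size 16, learning rate 1e-4), a single optimization step on a stand-in linear head. This is a sketch, not the actual training loop; it omits the mixed-precision autocast/scaler wrapper and the multi-task data loading:

```python
import torch
import torch.nn as nn

# Hypothetical tiny stand-in for one binary task head
head = nn.Linear(8, 1)
opt = torch.optim.Adam(head.parameters(), lr=1e-4)  # learning rate from the card
loss_fn = nn.BCEWithLogitsLoss()                    # binary classification loss

x = torch.randn(16, 8)                    # batch size 16, as in training
y = torch.randint(0, 2, (16,)).float()    # random binary labels

# One optimization step
opt.zero_grad()
loss = loss_fn(head(x).squeeze(-1), y)
loss.backward()
opt.step()
```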