Revised Peptide LGBM Model

This repository contains a PyTorch deep learning model trained to predict peptide properties from amino acid sequences.

Model Description

The model takes a tokenized amino acid sequence as input and outputs a probability score: the estimated likelihood that the peptide belongs to the positive class.

The architecture is defined in model/network.py and initialized using a YAML configuration file.

Input Representation

Sequences are tokenized using the following mapping:

Token	Description
PAD	Padding
UNK	Unknown
CLS	Start token
SEP	Separator
MASK	Mask token
L, A, G, V, E, S, I, K, R, D, T, P, N, Q, F, Y, M, H, C, W	The 20 standard amino acids (one-letter codes)

Within each batch, shorter sequences are padded with the PAD token to the length of the longest sequence in that batch.
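
As a rough illustration of the tokenization and per-batch padding described above, here is a minimal sketch in plain Python. It rests on several assumptions that are not confirmed by this repository: token IDs are assigned in the order the tokens appear in the table (PAD = 0, UNK = 1, and so on), padding is appended on the right, and no CLS or SEP tokens are added. The names VOCAB, tokenize, and pad_batch are hypothetical. The authoritative IDs are in tokenizer_mapping.json and may differ.

```python
# Hypothetical vocabulary: IDs follow the order of the token table.
# The real IDs are stored in tokenizer_mapping.json and may differ.
SPECIAL_TOKENS = ["PAD", "UNK", "CLS", "SEP", "MASK"]
AMINO_ACIDS = list("LAGVESIKRDTPNQFYMHCW")
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + AMINO_ACIDS)}


def tokenize(sequence):
    """Map each residue to an ID, falling back to UNK for unrecognized letters."""
    return [VOCAB.get(residue, VOCAB["UNK"]) for residue in sequence.upper()]


def pad_batch(sequences):
    """Tokenize each sequence and right-pad it to the longest one in the batch."""
    encoded = [tokenize(s) for s in sequences]
    max_len = max(len(ids) for ids in encoded)
    return [ids + [VOCAB["PAD"]] * (max_len - len(ids)) for ids in encoded]


# "X" is not in the vocabulary, so it maps to UNK under this sketch.
batch = pad_batch(["LAGVEST", "KR", "AXG"])
for row in batch:
    print(row)
```

Because padding is applied per batch, the padded length depends on the longest sequence present, so the same peptide can end up with different amounts of padding in different batches.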

Files

File	Description
model.pt	Trained model checkpoint
config.yaml	Model configuration
tokenizer_mapping.json	Amino acid token mapping
inference.py	Example inference script

Usage

Example inference:

from inference import predict

sequence = "LAGVEST"
probability = predict(sequence)

print(probability)
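
Since the model returns a probability rather than a hard class label, a caller will often want to threshold it. The helper below is a hypothetical sketch, not part of inference.py. It assumes the returned value is a float in [0, 1], and the 0.5 cutoff is an arbitrary default that would normally be tuned on validation data.

```python
def to_label(probability, threshold=0.5):
    """Return 1 (positive class) if the probability meets the threshold, else 0."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {probability}")
    return int(probability >= threshold)


# Example with made-up probabilities (not real model outputs).
labels = [to_label(p) for p in (0.12, 0.5, 0.93)]
print(labels)
```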
