gravelcompbio committed on
Commit 99dcca9 · verified · 1 Parent(s): 76ad3f4

Update README.md

Files changed (1)
  1. README.md +11 -13
README.md CHANGED
@@ -13,24 +13,22 @@ tags:
13
 
14
  <!-- Provide a quick summary of what the model is/does. -->
15
 
16
  Post-Translational Modifications (PTMs) are a fundamental mechanism for regulating cellular functions and
17
  increasing the functional diversity of the proteome. Despite the identification of hundreds of unique PTMs
18
  through mass-spectrometry (MS) studies, accurately predicting many PTM types based on sequence data alone
19
- remains a significant challenge. Existing PTM prediction models predominantly focus on either single PTM
20
- types or employ ensemble methods that combine multiple models to predict different PTM types. This fragmentation
21
- is largely driven by the vast imbalance in data availability across PTM types making it difficult to predict
22
- multiple PTM types with a single model. To address this limitation, we present the Contrastively Learned
23
- Attention-Based Stratified PTM Predictor (CLASPP), a unified PTM prediction model. CLASPP overcomes
24
- data imbalance challenges by leveraging unsupervised clustering-based under-sampling and incorporating a novel
25
- contrastive learning framework tailored to PTM data. Drawing inspiration from advancements in image and
26
- natural language processing, the CLASPP model employs a multi-stage training strategy and utilizes a
27
- high-quality curated training dataset to improve PTM prediction accuracy compared to existing multi-PTM prediction
28
- models. Existing PTM prediction models predominantly focus on either single PTM types or employ ensemble methods
29
  that combine multiple models to predict different PTM types. This fragmentation is largely driven by the
30
  vast imbalance in data availability across PTM types making it difficult to predict multiple PTM types
31
- with a single model. To address this limitation, we present the
32
- Contrastively Learned Attention-Based Stratified PTM Predictor (CLASPP), a unified PTM prediction model.
33
-
34
 
35
 
36
  <p align="center">
 
13
 
14
  <!-- Provide a quick summary of what the model is/does. -->
15
 
16
+
17
+ CLASPP is an ESM2-150M protein language model that predicts PTM events occurring on the substrate based
18
+ on the primary protein sequence. It covers 12 different PTM types, framed as multi-label
19
+ classification. The encoder is first trained on a supervised contrastive learning task, then the classification
20
+ head is fine-tuned on the multi-label classification task.
21
+
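The two-stage recipe in the added summary (supervised contrastive training of the encoder, then a multi-label head over 12 PTM types) can be sketched in plain Python. The README does not state the exact loss, so this sketch assumes the standard supervised contrastive (SupCon) formulation with one label per embedding. The names `supcon_loss` and `multilabel_probs`, the temperature, and the toy vectors are hypothetical and are not CLASPP's actual API or settings.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with at least one positive.

    For anchor i with positives P(i) (same label, excluding i itself):
        L_i = -1/|P(i)| * sum_{p in P(i)} log(exp(z_i.z_p / t) / sum_{a != i} exp(z_i.z_a / t))
    Assumption: one label per embedding; a multi-label variant would need
    a different definition of "positive".
    """
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def multilabel_probs(embedding, weights, biases):
    """Stage-two head: one independent sigmoid per PTM type (multi-label)."""
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    return [1.0 / (1.0 + math.exp(-logit)) for logit in logits]


if __name__ == "__main__":
    # Toy 2-D "embeddings": two tight clusters.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print("labels match clusters:", supcon_loss(toy, [0, 0, 1, 1]))
    print("labels cut across clusters:", supcon_loss(toy, [0, 1, 0, 1]))
```

The loss is small when same-label embeddings sit close together and large when they do not, which is the property the contrastive stage optimizes before the classification head is fine-tuned with independent per-type outputs.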
22
  Post-Translational Modifications (PTMs) are a fundamental mechanism for regulating cellular functions and
23
  increasing the functional diversity of the proteome. Despite the identification of hundreds of unique PTMs
24
  through mass-spectrometry (MS) studies, accurately predicting many PTM types based on sequence data alone
25
+ remains a significant challenge.
26
+
27
+ Existing PTM prediction models predominantly focus on either single PTM types or employ ensemble methods
28
  that combine multiple models to predict different PTM types. This fragmentation is largely driven by the
29
  vast imbalance in data availability across PTM types making it difficult to predict multiple PTM types
30
+ with a single model. To address this limitation, we present the Contrastively Learned Attention-Based
31
+ Stratified PTM Predictor (CLASPP), a unified PTM prediction model.
 
32
 
33
 
34
  <p align="center">