fondress committed
Commit 211d02d · verified · 1 Parent(s): db04b68

Update README.md

Files changed (1)
  1. README.md +11 -9
README.md CHANGED
@@ -20,19 +20,21 @@ The model is trained with a loss function that combines classification loss and
 
  ## Intended uses
 
- `PDeepPP` is designed for two primary tasks:
+ `PDeepPP` was developed and validated using PTM and BPS datasets, but its applications are not limited to these specific tasks. Leveraging its flexible architecture and robust feature extraction capabilities, `PDeepPP` can be applied to a wide range of protein sequence-related analysis tasks. Specifically, the model has been validated on the following datasets:
 
- 1. **PTM site prediction**: Identifying post-translational modification sites (e.g., phosphorylation) in protein sequences, focusing on serine (S), threonine (T), and tyrosine (Y) residues.
- 2. **Biologically active sequence analysis (BPS)**: Extracting biologically relevant regions from protein sequences for downstream analysis.
+ 1. **PTM datasets**: Used for predicting post-translational modification (PTM) sites (e.g., phosphorylation), focusing on serine (S), threonine (T), and tyrosine (Y) residues.
+ 2. **BPS datasets**: Used for analyzing biologically active regions of protein sequences (Biologically Active Protein Sequences, BPS) to support downstream analyses.
 
- The model processes protein sequences and outputs:
+ Although the model was trained and validated on PTM and BPS datasets, `PDeepPP`’s architecture enables users to generalize and extend its capabilities to other protein sequence analysis tasks, such as embedding generation, sequence classification, or task-specific analyses.
 
- - Embedded representations of the sequences, which can be used for various downstream tasks.
- - Predicted probabilities for PTM or other sequence-specific features.
+ ---
+
+ ### Key features
 
- ### Key features:
- - **PTM mode**: Focuses on sequences centered around specific residues (S, T, Y) to predict PTM activity.
- - **BPS mode**: Analyzes overlapping or non-overlapping subsequences of a protein for broader biological insights.
+ - **Dataset support**: `PDeepPP` is trained on PTM and BPS datasets, demonstrating its effectiveness in identifying specific sequence features (e.g., post-translational modification sites) and extracting biologically relevant regions.
+ - **Task flexibility**: The model is not limited to PTM and BPS tasks. Users can adapt `PDeepPP` to other protein sequence-based tasks by customizing input data and task objectives.
+ - **PTM mode**: Focuses on sequences centered around specific residues (S, T, Y) to analyze post-translational modification activity.
+ - **BPS mode**: Analyzes overlapping or non-overlapping subsequences of a protein to extract biologically meaningful features.
 
  ## How to use
 
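To illustrate the PTM and BPS modes named in the Key features list of the updated README, the following is a minimal sketch of how their inputs might be prepared. It is not `PDeepPP`'s actual preprocessing code: the function names `ptm_windows` and `bps_windows`, the default window sizes, and the padding character `X` are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Residues the README names as PTM candidates.
PTM_RESIDUES = {"S", "T", "Y"}


def ptm_windows(sequence: str, half_width: int = 16, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (position, window) pairs centered on each S/T/Y residue.

    half_width and pad are hypothetical defaults, not values from PDeepPP.
    """
    # Pad both ends so residues near the termini still sit at the window center.
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue in PTM_RESIDUES:
            # Residue i sits at index i + half_width in the padded string,
            # so the window covers padded[i : i + 2 * half_width + 1].
            windows.append((i, padded[i : i + 2 * half_width + 1]))
    return windows


def bps_windows(sequence: str, size: int = 33, stride: int = 33) -> List[str]:
    """Split a sequence into fixed-size subsequences.

    stride == size gives non-overlapping windows; stride < size gives overlap.
    Trailing residues that do not fill a whole window are dropped.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(sequence) <= size:
        return [sequence]
    return [sequence[start : start + size]
            for start in range(0, len(sequence) - size + 1, stride)]


if __name__ == "__main__":
    print(ptm_windows("MKSAT", half_width=2))
    print(bps_windows("ABCDEFGH", size=4, stride=2))
```

Each window produced this way would then be tokenized and passed to the model. The window length a given checkpoint expects should be taken from the README's "How to use" section rather than from the defaults above.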