Commit 307c191 (verified) by huangruihua · 1 parent: 5da7d43

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +13 -9
README.md CHANGED
@@ -9,16 +9,20 @@ tags:
  library_name: pytorch
  ---

- # EMFP: ESM-2 Micropeptide Functional Predictor
+ # EMFP: ESM-2 Micropeptide Predictor for Canonical Functional Proteins

- Deep learning model for predicting functional micropeptides. Fine-tuned ESM-2 (650M).
+ EMFP is designed to identify peptide sequences that may encode canonical functional proteins, as defined by molecular function annotations in the UniProt database. While many peptides can be bioactive, EMFP specifically focuses on distinguishing peptides with protein-like molecular functions from those with other or unknown mechanisms of action.
+
+ Fine-tuned ESM-2 (650M) model for predicting peptides encoding canonical functional proteins.

  ## Performance

  | Task | EMFP | Random Forest | ESM+MLP | ProtBERT+MLP |
  |------|------|---------------|---------|--------------|
  | Authenticity | **0.967** | 0.718 | 0.892 | 0.856 |
- | Functionality | **0.932** | 0.505 | 0.827 | 0.791 |
+ | Canonical Protein Function | **0.932** | 0.505 | 0.827 | 0.791 |
+
+ **Note**: "Canonical Protein Function" refers to peptides encoding proteins with molecular function annotations (enzyme activity, binding activity, etc.) as defined in UniProt.

  ## Usage

@@ -57,14 +61,14 @@ labels, strs, tokens = batch_converter(data)
  with torch.no_grad():
      logits = classifier(tokens)
      probs = torch.softmax(logits, dim=1)
- print(f"Functional: {probs[0, 1].item():.4f}")
+ print(f"Probability of encoding canonical functional protein: {probs[0, 1].item():.4f}")
  ```

  ## Model Details

- - Base: ESM-2 650M
- - Training: 26,626 sequences
- - Optimizer: AdamW, FP16
+ - Base: ESM-2 650M (`esm2_t33_650M_UR50D`)
+ - Training: 26,626 peptide sequences from UniProt with molecular function annotations
+ - Optimizer: AdamW, FP16 precision
  - Size: 7.4 GB

  ## Download
@@ -75,13 +79,13 @@ huggingface-cli download huangruihua/EMFP best_model.pt --local-dir ./

  ## GitHub

- Full code: https://github.com/huangruihua/EMFP
+ Full code: [https://github.com/huangruihua/EMFP](https://github.com/huangruihua/EMFP)

  ## Citation

  ```bibtex
  @software{emfp_2026,
- title={EMFP: ESM-2 Micropeptide Functional Predictor},
+ title={EMFP: ESM-2 Micropeptide Predictor for Canonical Functional Proteins},
  author={Huang, Rui-Hua},
  year={2026},
  url={https://github.com/huangruihua/EMFP}
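
Reviewer note on the changed `print` line in the Usage hunk: the value it reports is column 1 of a softmax over the classifier's two logits. The sketch below illustrates that step in plain Python so it runs without PyTorch or the model weights. The `softmax` helper is a stand-in for `torch.softmax`, and the logit values are invented for illustration; neither comes from the EMFP repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a single peptide (not real model output).
# Index 1 is the class the README prints as the positive prediction.
example_logits = [-1.2, 2.3]
probs = softmax(example_logits)

print(f"Probability of encoding canonical functional protein: {probs[1]:.4f}")
```

Because the two probabilities sum to 1, the printed number alone determines the prediction; any decision threshold applied to it is a choice left to the user, as the README does not state one.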