hbp5181 committed on
Commit 6c020ed · verified · 1 Parent(s): 0cbbf70

Update README.md

Files changed (1)
  1. README.md +13 -11
README.md CHANGED
@@ -13,24 +13,25 @@ tags:
13
  ---
14
 
15
 
 
16
  ESM2-GBT: Gradient Boosted Trees on ESM2 Embeddings
17
 
18
- 📌 Model Overview
19
  The ESM2-GBT model is a Gradient Boosted Trees (GBT) regressor trained on ESM2 embeddings from Meta’s ESM2 protein language model. It is designed for binding affinity predictive tasks.
20
 
21
 
22
- 🧪 Available Models:
23
 
24
  ACE2_RBD_ESM2-GBT.json
25
 
26
- Predicts binding affinity between ACE2 and RBD proteins.
27
 
28
  General_ESM2-GBT.json
29
 
30
  General-purpose GBT model trained on ESM2 embeddings.
31
 
32
 
33
- 🏗 Model Details
34
  • Base Model: ESM2
35
 
36
  • Architecture: Gradient Boosted Trees (CatBoostRegressor)
@@ -39,34 +40,34 @@ General-purpose GBT model trained on ESM2 embeddings.
39
 
40
  • Task: Regression
41
 
42
- 🧑‍💻 How to Use
43
 
44
- 1️⃣ Download Model from Hugging Face
45
  from huggingface_hub import hf_hub_download
46
 
47
  # Download ACE2 RBD model/General model
48
 
49
  model_path = hf_hub_download(repo_id="hbp5181/ESM2-GBT", filename="ACE2_RBD_ESM2-GBT.json")
50
 
51
- 2️⃣ Load Model in CatBoost
52
  from catboost import CatBoostRegressor
53
 
54
  model = CatBoostRegressor()
55
  model.load_model(model_path, format="json")
56
 
57
  # Predictions using your own dataset!
58
- 🔬 Training Details
59
 
60
  • Feature Extraction: ESM2 embeddings (33-layer transformer, 650M params)
61
  • Training Algorithm: CatBoost Gradient Boosting
62
  • Dataset: your own dataset
63
  • Evaluation Metrics: RMSE, R^2
64
 
65
- 📌 Applications
66
 
67
  • Binding affinity predictions
68
 
69
- 💡 Limitations & Considerations
70
 
71
  • The model is trained on ESM2 embeddings and is limited by the quality of those embeddings.
72
 
@@ -74,9 +75,10 @@ model.load_model(model_path, format="json")
74
 
75
  • Not a deep-learning model; instead, it leverages GBTs for fast, interpretable predictions.
76
 
77
- 📄 Citation
78
 
79
  👤 Maintainer: hbp5181@psu.edu
80
 
81
  📅 Last Updated: February 2025
82
 
 
 
13
  ---
14
 
15
 
16
+
17
  ESM2-GBT: Gradient Boosted Trees on ESM2 Embeddings
18
 
19
+ # Model Overview
20
  The ESM2-GBT model is a Gradient Boosted Trees (GBT) regressor trained on ESM2 embeddings from Meta’s ESM2 protein language model. It is designed for binding affinity predictive tasks.
21
 
22
 
23
+ # Available Models:
24
 
25
  ACE2_RBD_ESM2-GBT.json
26
 
27
+ Predicts binding affinity between ACE2 (human and animals) and RBD proteins.
28
 
29
  General_ESM2-GBT.json
30
 
31
  General-purpose GBT model trained on ESM2 embeddings.
32
 
33
 
34
+ # Model Details
35
  • Base Model: ESM2
36
 
37
  • Architecture: Gradient Boosted Trees (CatBoostRegressor)
 
40
 
41
  • Task: Regression
42
 
43
+ # How to Use
44
 
45
+ Download Model from Hugging Face
46
  from huggingface_hub import hf_hub_download
47
 
48
  # Download ACE2 RBD model/General model
49
 
50
  model_path = hf_hub_download(repo_id="hbp5181/ESM2-GBT", filename="ACE2_RBD_ESM2-GBT.json")
51
 
52
+ Load Model in CatBoost
53
  from catboost import CatBoostRegressor
54
 
55
  model = CatBoostRegressor()
56
  model.load_model(model_path, format="json")
57
 
58
  # Predictions using your own dataset!
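The README stops at loading the model and leaves prediction to the reader. Below is a minimal sketch of that last step, assuming each sequence has already been embedded with ESM2 and that features are formed by mean-pooling the per-residue embeddings. The README does not state how the training features were pooled or combined, so the pooling choice and the helper names `mean_pool` and `predict_one` are hypothetical. The feature vector passed to the model must match whatever layout was used in training.

```python
from statistics import fmean


def mean_pool(per_residue):
    """Average per-residue embedding rows into one fixed-length vector.

    ASSUMPTION: mean pooling is illustrative only; the README does not say
    how the training features were derived from ESM2 outputs.
    """
    if not per_residue:
        raise ValueError("need at least one residue embedding")
    width = len(per_residue[0])
    if any(len(row) != width for row in per_residue):
        raise ValueError("all residue embeddings must have the same width")
    # zip(*rows) walks the matrix column by column, one feature at a time.
    return [fmean(column) for column in zip(*per_residue)]


def predict_one(model, per_residue):
    """Pool one sequence's embeddings and return a single prediction.

    `model` is any object with a CatBoost-style predict(rows) method,
    e.g. the CatBoostRegressor loaded in the README.
    """
    features = mean_pool(per_residue)
    # predict expects a 2D input: one row per sample.
    return float(model.predict([features])[0])
```

With the loaded regressor, a call would look like `predict_one(model, embeddings)`, where `embeddings` is a list of per-residue vectors for one input.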
59
+ Training Details
60
 
61
  • Feature Extraction: ESM2 embeddings (33-layer transformer, 650M params)
62
  • Training Algorithm: CatBoost Gradient Boosting
63
  • Dataset: your own dataset
64
  • Evaluation Metrics: RMSE, R^2
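The two metrics named above can be computed directly from predictions and measured values. This is a minimal sketch using the standard definitions of RMSE and R^2. The function names `rmse` and `r2` are illustrative, and nothing here reproduces the authors' evaluation code or results.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot
```

A lower RMSE means smaller errors in the units of the target. An R^2 of 1 is a perfect fit, and 0 matches a model that always predicts the mean of the targets.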
65
 
66
+ # Applications
67
 
68
  • Binding affinity predictions
69
 
70
+ Limitations & Considerations
71
 
72
  • The model is trained on ESM2 embeddings and is limited by the quality of those embeddings.
73
 
 
75
 
76
  • Not a deep-learning model; instead, it leverages GBTs for fast, interpretable predictions.
77
 
78
+ # Citation
79
 
80
  👤 Maintainer: hbp5181@psu.edu
81
 
82
  📅 Last Updated: February 2025
83
 
84
+