Loganz97 committed on
Commit 660f0ac · verified · 1 Parent(s): 7de901b

Update README.md

  1. README.md +20 -3
README.md CHANGED
@@ -1,3 +1,20 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+ ### Model Overview
+
+ **OphPred** is a machine learning tool that predicts the optimal pH of enzyme activity directly from protein sequences. It combines the ESM-2 protein language model with k-nearest neighbors (KNN) and XGBoost algorithms to make predictions across various enzyme classes. The model was validated under several train-validation splitting strategies: random, homology-based, PFAM-based, and EC-based. OphPred is designed to be fast and efficient, which makes it suitable for high-throughput screening of large protein libraries.
+
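The homology-, PFAM-, and EC-based splits mentioned in the overview all keep related sequences on the same side of the train-validation boundary. The snippet below is a minimal sketch of that idea, not the OphPred code: the function `group_split`, its parameters, and the example labels are hypothetical, and it assumes each sequence already carries a group label (for example a cluster ID, PFAM family, or EC class).

```python
import random
from collections import defaultdict


def group_split(groups, val_fraction=0.2, seed=0):
    """Split sample indices so that no group spans train and validation.

    groups[i] is the (hypothetical) group label of sample i, for example
    a homology cluster ID, a PFAM family, or an EC class.
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    # Shuffle whole groups, not individual samples.
    group_ids = sorted(by_group)
    random.Random(seed).shuffle(group_ids)

    target = val_fraction * len(groups)
    train_idx, val_idx = [], []
    for label in group_ids:
        # Assign whole groups to validation until it reaches the target size.
        if len(val_idx) < target:
            val_idx.extend(by_group[label])
        else:
            train_idx.extend(by_group[label])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical EC-class labels for eight enzymes.
ec_classes = ["1.1", "1.1", "3.2", "3.2", "3.2", "2.7", "2.7", "4.1"]
train_idx, val_idx = group_split(ec_classes, val_fraction=0.25)
```

Because groups are assigned whole, the validation set can end up somewhat larger than `val_fraction` suggests. A random split, by contrast, would shuffle individual indices and could place near-identical sequences in both sets.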
+ ### Key Features
+ - **Input**: Protein sequences.
+ - **Output**: Predicted optimal pH range for enzyme activity.
+ - **Performance**: Mean absolute error (MAE) as low as 0.6 pH units and Spearman correlation up to 0.77 when enriched with additional data.
+ - **Use Cases**: Protein engineering, enzyme optimization in biotechnology, and exploring protein space for desired enzymatic properties.
+
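The two metrics quoted under Performance can be computed from paired measured and predicted pH values. The following is a minimal pure-Python sketch under stated assumptions: the helper names and the example numbers are hypothetical and do not come from the OphPred evaluation. Spearman correlation is computed as the Pearson correlation of average ranks, which handles ties.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(y_true, y_pred):
    """Spearman rank correlation as Pearson correlation of the ranks."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


# Hypothetical measured and predicted optimal pH values.
measured = [4.5, 7.0, 8.0, 6.0, 9.5]
predicted = [5.0, 6.5, 8.5, 6.5, 9.0]
mae = mean_absolute_error(measured, predicted)
rho = spearman(measured, predicted)
```

MAE is expressed in pH units, while Spearman correlation only measures how well the predicted ordering matches the measured ordering, so the two numbers capture different aspects of prediction quality.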
+ ### Citation
+ If you use this model, please cite the authors as follows:
+
+ Zaretckii, M.; Buslaev, P.; Kozlovskii, I.; Morozov, A.; Popov, P. Approaching Optimal pH Enzyme Prediction with Large Language Models. *ACS Synth. Biol.* **2024**, *10*, DOI: 10.1021/acssynbio.4c00465.
+
+ ### Further Reading
+ The full paper describing the development and validation of OphPred is available at [https://doi.org/10.1021/acssynbio.4c00465](https://doi.org/10.1021/acssynbio.4c00465).