| license: mit |
|
|
| base_model: westlake-repl/SaProt_35M_AF2 |
| |
| Model Description |
| |
| This model is fine-tuned to predict the effects of mutations on *Bacillus subtilis* α-amylase. It takes amino acid sequences as input and performs protein-level regression, predicting one quantitative enzyme activity value per variant so that the functional impact of different mutations can be compared. |
| |
| Task type: protein-level regression |
| |
| Model input type: Amino acid sequence |
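The base model, SaProt, uses a structure-aware vocabulary in which each residue is paired with a Foldseek structure token, and `#` stands in for a missing structure token. A minimal sketch of preparing an amino-acid-only input follows. It assumes this fine-tuned model expects that `#` convention. The helper name `to_sa_sequence` is hypothetical and not part of any library.

```python
# The 20 standard amino acids; anything else is rejected in this sketch.
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")


def to_sa_sequence(aa_seq: str) -> str:
    """Pair every residue with '#', the placeholder for an unknown structure token.

    Assumption: the model accepts SaProt-style structure-aware sequences in
    which '#' marks a masked structure token.
    """
    aa_seq = aa_seq.strip().upper()
    unknown = set(aa_seq) - VALID_AA
    if unknown:
        raise ValueError(f"Unexpected residue symbols: {sorted(unknown)}")
    return "".join(residue + "#" for residue in aa_seq)


print(to_sa_sequence("MKQ"))  # M#K#Q#
```

The resulting string would then be passed to the tokenizer of the base model. Whether the fine-tuned checkpoint was trained on inputs prepared this way should be confirmed against the training pipeline.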
| |
| Dataset |
| |
| The dataset is sourced from van der Flier et al. (2024), available at https://www.sciencedirect.com/science/article/pii/S2001037024002940. It contains 3706 rows, each a variant carrying 1 to 8 mutations. The full dataset is randomly split into training, validation, and test sets in an 8:1:1 ratio. The target label is absorbance, ranging from -0.001 to 0.211. A higher absorbance indicates greater starch degradation, which corresponds to stronger detergent activity of the amylase. |
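A random 8:1:1 split of 3706 rows can be sketched as below. The seed, the rounding rule, and the function name `split_811` are assumptions made for illustration. The split actually used for this model may differ in both membership and exact set sizes.

```python
import random


def split_811(n_rows: int, seed: int = 0):
    """Shuffle row indices and cut them into 80% / 10% / 10% partitions."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    # Integer arithmetic avoids floating-point rounding surprises;
    # any remainder rows fall into the test set.
    n_train = n_rows * 8 // 10
    n_valid = n_rows // 10

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]
    return train, valid, test


train_idx, valid_idx, test_idx = split_811(3706)
print(len(train_idx), len(valid_idx), len(test_idx))  # 2964 370 372
```

Because the split is random at the variant level, variants sharing some of the same mutations can appear in both training and test sets, which is worth keeping in mind when reading the test metric.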
| |
| Performance (on test set) |
| |
| Spearman correlation: 0.76 |
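Spearman correlation measures how well the predicted values preserve the rank order of the measured absorbances. It is the Pearson correlation of the two rank vectors. The sketch below is a dependency-free illustration with made-up numbers, not the evaluation code used for this model. In practice `scipy.stats.spearmanr` is the usual choice.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tie_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tie_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def spearman(predictions, targets):
    return pearson(average_ranks(predictions), average_ranks(targets))


# Illustrative values only: the last two predictions are in the wrong order.
measured = [0.01, 0.05, 0.10, 0.20]
predicted = [0.02, 0.04, 0.30, 0.15]
print(round(spearman(predicted, measured), 3))  # 0.8
```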
| |
| LoRA config |
| |
| r: 8 |
| |
| lora_dropout: 0.1 |
|
|
| lora_alpha: 16 |
| |
| target_modules: ["query", "intermediate.dense", "value", "output.dense", "key"] |
|
|
| modules_to_save: ["classifier"] |
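The field names in this LoRA config match the argument names of `LoraConfig` in the PEFT library, so the values can be collected as keyword arguments, as sketched below. That PEFT was used is an assumption. The helpers `lora_scaling` and `lora_extra_params` are hypothetical names, and the 480 x 480 layer shape is an assumed example rather than a value read from the model.

```python
# Values copied from the LoRA config above. The keys mirror PEFT's LoraConfig
# arguments (assumption), so LoraConfig(**lora_kwargs) should work if peft is installed.
lora_kwargs = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.1,
    "target_modules": ["query", "intermediate.dense", "value", "output.dense", "key"],
    "modules_to_save": ["classifier"],
}


def lora_scaling(r: int, lora_alpha: float) -> float:
    """Standard LoRA scales the low-rank update B @ A by lora_alpha / r."""
    return lora_alpha / r


def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


print(lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"]))  # 2.0
# Assumed layer shape, for illustration only.
print(lora_extra_params(480, 480, lora_kwargs["r"]))  # 7680
```

Listing `classifier` under `modules_to_save` means the regression head is trained in full and stored with the adapter, while the targeted attention and feed-forward layers receive only low-rank updates.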
|
|
| Training config |
|
|
| optimizer: |
|
|
| class: AdamW |
|
|
| betas: (0.9, 0.98) |
|
|
| weight_decay: 0.01 |
| |
| learning rate: 0.0005 |
| |
| epochs: 30 |
| |
| batch size: 64 |
| |
| precision: 16-mixed |
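As a rough guide to the training budget these settings imply, the sketch below gathers them in one dictionary and estimates the number of optimizer steps. It assumes a training set of 80% of the 3706 rows (rounded down), that the final partial batch of each epoch is kept, and that 64 is the effective batch size with no gradient accumulation. None of these details are stated in the config. The dictionary layout and the function name are hypothetical.

```python
import math

# Values copied from the training config above; the nesting is an assumption.
training_config = {
    "optimizer": {"class": "AdamW", "betas": (0.9, 0.98), "weight_decay": 0.01},
    "learning_rate": 0.0005,
    "epochs": 30,
    "batch_size": 64,
    "precision": "16-mixed",
}


def total_optimizer_steps(n_train: int, batch_size: int, epochs: int) -> int:
    """Steps per epoch (last partial batch kept) multiplied by the number of epochs."""
    steps_per_epoch = math.ceil(n_train / batch_size)
    return steps_per_epoch * epochs


n_train = 3706 * 8 // 10  # 2964 rows under the assumed 8:1:1 split
print(total_optimizer_steps(
    n_train,
    training_config["batch_size"],
    training_config["epochs"],
))  # 1410
```

The precision value `16-mixed` has the same form as the mixed-precision setting accepted by PyTorch Lightning's `Trainer`, which suggests, but does not confirm, that Lightning was the training framework.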