---
license: mit
base_model: westlake-repl/SaProt_35M_AF2
---

## Model Description

This model is fine-tuned to predict the mutation effects of *Bacillus subtilis* α-amylase. It takes amino acid sequences as input and performs protein-level regression to predict quantitative enzyme activity values, enabling accurate assessment of how mutations alter enzyme function.

- Task type: protein-level regression
- Model input type: Amino acid sequence

## Dataset

The dataset is sourced from van der Flier et al. (2024), available at https://www.sciencedirect.com/science/article/pii/S2001037024002940.

It contains a total of 3706 rows, with each sample carrying 1 to 8 mutation sites. The full dataset is randomly split into training, validation, and test sets following an 8:1:1 ratio.

The target label is absorbance, ranging from -0.001 to 0.211. A higher absorbance value indicates greater starch degradation, corresponding to stronger detergent activity of the amylase enzyme.

## Performance (on test set)

- Spearman correlation: 0.76

## LoRA config

- r: 8
- lora_dropout: 0.1
- lora_alpha: 16
- target_modules: ["query", "intermediate.dense", "value", "output.dense", "key"]
- modules_to_save: ["classifier"]

## Training config

- optimizer:
  - class: AdamW
  - betas: (0.9, 0.98)
  - weight_decay: 0.01
- learning rate: 0.0005
- epoch: 30
- batch size: 64
- precision: 16-mixed
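
As an illustration of the 8:1:1 random split described under Dataset and the Spearman correlation reported under Performance, here is a minimal standard-library sketch. It is not the authors' pipeline: the card does not state the random seed, the tooling, or how rounding was handled, so the seed, the choice to give the remainder rows to the test set, and the helper names `split_indices`, `average_ranks`, and `spearman` are all assumptions made for this example.

```python
import random


def split_indices(n_rows, ratios=(8, 1, 1), seed=0):
    """Shuffle row indices and cut them into train/valid/test by the given ratio.

    The seed and the rounding rule are assumptions; the card only states 8:1:1.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_rows * ratios[0] // total
    n_valid = n_rows * ratios[1] // total
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # any remainder rows end up in test
    return train, valid, test


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# 3706 is the row count stated in the Dataset section.
train, valid, test = split_indices(3706)
print(len(train), len(valid), len(test))
```

With this rounding rule, 3706 rows yield 2964 training, 370 validation, and 372 test rows. To evaluate a model, `spearman(predicted, measured)` would be called on the predicted and measured absorbance values of the test rows. Spearman correlation depends only on rank order, so it measures how well the model orders variants by activity, not how close the predicted values are to the measured ones.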