---
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft
---
# Base model: [westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)
# Model Card for Model ID
This model is trained on a single-site deep mutational scanning (DMS) dataset and
can be used to predict the fitness score of mutant amino acid sequences of the protein [RASH_HUMAN](https://www.uniprot.org/uniprotkb/P01112/entry) (GTPase HRas).
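
As a rough illustration of what a single-site mutant sequence looks like, the sketch below applies one substitution to a sequence. The `apply_mutation` helper, the `G12V`-style notation, and the short fragment are assumptions made for this example; the fragment is not the full RASH_HUMAN sequence, and the dataset may encode mutations differently.

```python
import re


def apply_mutation(wildtype: str, mutation: str) -> str:
    """Return the mutant sequence for one substitution written like 'G12V'.

    Hypothetical helper: the notation (wild-type residue, 1-based position,
    mutant residue) is an assumption, not something the model card specifies.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation string: {mutation!r}")
    wt_aa, position, mut_aa = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(wildtype):
        raise ValueError(f"position {position} is outside the sequence")
    if wildtype[index] != wt_aa:
        raise ValueError(
            f"expected {wt_aa} at position {position}, found {wildtype[index]}"
        )
    return wildtype[:index] + mut_aa + wildtype[index + 1:]


# Short illustrative fragment only, not the full protein sequence.
fragment = "MTEYKLVVVGAGGVGKSALT"
mutant = apply_mutation(fragment, "G12V")
print(mutant)
```

The helper checks the wild-type residue before substituting, which catches off-by-one errors in position numbering early.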
## Protein Function
This protein is involved in the activation of Ras protein signal transduction.
Ras proteins bind GDP/GTP and possess intrinsic GTPase activity.
### Task type
protein level regression
### Dataset description
The dataset is from [Deep generative models of genetic variation capture the effects of mutations](https://www.nature.com/articles/s41592-018-0138-4).
It can also be found on SaprotHub as the [SaprotHub dataset](https://huggingface.co/datasets/SaProtHub/DMS_RASH_HUMAN).
Each label is the fitness score of a mutant amino acid sequence. Scores are unbounded in both directions; 0 is the value of the wild type,
and larger values mean higher fitness.
### Model input type
Amino acid sequence
### Performance
0.64 Spearman's ρ
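
For readers unfamiliar with the metric, here is a minimal pure-Python sketch of Spearman's ρ: the Pearson correlation of the ranks of predicted and measured scores. The function names and the toy numbers are assumptions for illustration only; this is not the evaluation code used to obtain the figure above.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(predicted, measured):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(measured))


# Toy values, made up for this example (not taken from the dataset).
predicted = [0.1, -0.5, 0.3, -1.2]
measured = [0.2, -0.1, 0.05, -2.0]
rho = spearman_rho(predicted, measured)
print(round(rho, 3))
```

Because only ranks matter, the metric rewards ordering mutants correctly by fitness rather than matching the absolute score values.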
### LoRA config
lora_dropout: 0.0
lora_alpha: 16
target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]
modules_to_save: ["classifier"]
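
The values above could be passed to the `peft` library roughly as follows. This is a sketch under assumptions: `lora_kwargs` and `build_lora_config` are hypothetical names, and the LoRA rank `r` is not listed in this card, so it is left unset here and `peft` would fall back to its own default, which may differ from what was used in training.

```python
# Values copied from the "LoRA config" section of this card.
lora_kwargs = {
    "lora_dropout": 0.0,
    "lora_alpha": 16,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
    # NOTE: the rank `r` is not given in the card and is deliberately omitted.
}


def build_lora_config(kwargs):
    """Hypothetical helper: build a peft LoraConfig from the values above."""
    from peft import LoraConfig  # imported lazily so the dict is usable without peft

    return LoraConfig(**kwargs)


print(sorted(lora_kwargs))
```

Calling `build_lora_config(lora_kwargs)` requires `peft` to be installed; the resulting config would then typically be applied to the base model with `peft.get_peft_model`.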
### Training config
optimizer class: AdamW
betas: (0.9, 0.98)
weight_decay: 0.01
learning rate: 1e-4
epoch: 50
batch size: 256
precision: 16-mixed
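
A minimal sketch of how these settings might map onto PyTorch follows. The names `optimizer_kwargs`, `trainer_settings`, and `build_optimizer` are hypothetical, and the assumption that `16-mixed` refers to a PyTorch Lightning style precision string is ours; the card does not state which training framework was used.

```python
# Optimizer hyperparameters copied from the "Training config" section.
optimizer_kwargs = {
    "lr": 1e-4,
    "betas": (0.9, 0.98),
    "weight_decay": 0.01,
}

# Remaining settings; the key names are illustrative, not a real API.
trainer_settings = {
    "max_epochs": 50,
    "batch_size": 256,
    "precision": "16-mixed",  # assumed to be a Lightning-style precision string
}


def build_optimizer(parameters):
    """Hypothetical helper: create AdamW over the trainable parameters."""
    import torch  # imported lazily so the settings are readable without torch

    return torch.optim.AdamW(parameters, **optimizer_kwargs)


print(optimizer_kwargs, trainer_settings)
```

With LoRA, `parameters` would normally be only the trainable adapter and classifier weights, for example `(p for p in model.parameters() if p.requires_grad)`.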