JuIm committed · verified
Commit f7e6d72 · 1 Parent(s): cc531e9

Update README.md

Files changed (1)
  1. README.md +1 -31
README.md CHANGED
@@ -12,37 +12,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # ProGemma
 
- This model is a fine-tuned version of [JuIm/ProGemma](https://huggingface.co/JuIm/ProGemma) on an unknown dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.001
- - train_batch_size: 1
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.4
- - training_steps: 7000
-
- ### Training results
-
-
+ This is a custom configuration (275M parameters) of a Gemma 2 LLM that is being pre-trained on a corpus of amino acid sequences, with the goal of generating de novo amino acid sequences in a zero-shot fashion. As of August 14, 2024, the model has been trained on 55% of the dataset; a new version will be pushed as training reaches each new checkpoint. Preliminary evaluation shows perplexity, HHblits E-value, pLDDT, pTM, and ipTM scores on par with ProtGPT2; a full evaluation will be done once training is complete. The model can be accessed with the transformers.pipeline function; for the tokenizer, use "JuIm/Amino-Acid-Sequence-Tokenizer".
 
  ### Framework versions
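
The README section removed by this commit listed the training hyperparameters. For reference only, a minimal sketch of how those values might map onto `transformers.TrainingArguments`; this mapping and the output directory name are assumptions, not the training script actually used:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the removed hyperparameter list;
# "progemma-pretrain" is a placeholder output directory.
training_args = TrainingArguments(
    output_dir="progemma-pretrain",
    learning_rate=1e-3,              # learning_rate: 0.001
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    adam_beta1=0.9,                  # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # and epsilon=1e-08
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    warmup_ratio=0.4,                # lr_scheduler_warmup_ratio: 0.4
    max_steps=7000,                  # training_steps: 7000
)
```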
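
The new description says the model can be accessed with the `transformers.pipeline` function together with the "JuIm/Amino-Acid-Sequence-Tokenizer" tokenizer. A minimal sketch of that usage, assuming the model id `JuIm/ProGemma` (the repository this commit belongs to) and illustrative generation settings:

```python
from transformers import AutoTokenizer, pipeline

# Tokenizer named in the README; the model id is assumed to be the repo itself.
tokenizer = AutoTokenizer.from_pretrained("JuIm/Amino-Acid-Sequence-Tokenizer")
generator = pipeline("text-generation", model="JuIm/ProGemma", tokenizer=tokenizer)

# Zero-shot de novo sequence generation; the "M" prompt and sampling
# parameters are illustrative, not values recommended by the authors.
outputs = generator("M", max_new_tokens=200, do_sample=True, top_k=50)
print(outputs[0]["generated_text"])
```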