JuIm committed (verified) · Commit b2976d4 · Parent(s): 447fa8c

End of training

README.md CHANGED
@@ -12,29 +12,37 @@ should probably proofread and complete it, then remove this comment. -->
 
 # ProGemma2
 
- This is a custom configuration of Google’s Gemma 2 LLM (335M parameters) that is being pre-trained on amino acid sequences of 512 AA or fewer in length. Periodic updates are made to this page as training reaches new checkpoints.
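For illustration, the sketch below shows how a small custom Gemma 2 configuration might be instantiated with the Transformers library. Only the 512-token context and the small amino-acid vocabulary (implied by the token IDs used in the generation code further down) are grounded in this card; every width and depth value is a hypothetical guess, not the actual ProGemma2 configuration.

```python
from transformers import Gemma2Config, Gemma2ForCausalLM

# Hypothetical sketch of a small Gemma 2 config for protein sequences.
# Only the 512-token context and the tiny amino-acid vocabulary are grounded
# in this card; the width/depth values are illustrative guesses and will not
# reproduce the stated 335M parameter count exactly.
config = Gemma2Config(
    vocab_size=23,                 # 20 amino acids + bos/eos/pad (IDs 20/21/22)
    max_position_embeddings=512,   # sequences of 512 AA or fewer
    hidden_size=1024,              # illustrative
    intermediate_size=4096,        # illustrative
    num_hidden_layers=18,          # illustrative
    num_attention_heads=8,         # illustrative
    num_key_value_heads=4,         # illustrative
    head_dim=128,                  # illustrative
    bos_token_id=20,
    eos_token_id=21,
    pad_token_id=22,
)
model = Gemma2ForCausalLM(config)
print(f"{model.num_parameters():,} parameters")
```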
 
- The purpose of this model is to investigate how ProGemma and ProtGPT (a GPT-2 architecture) differ in sequence generation.
 
- Controlled generation is not a capability of this model; adding it would significantly improve generation, since in principle a sequence that performs a given function or resides in a particular cellular location could then be generated.
 
- In sequence generation, a top_k of 950 appears to work well, as it prevents repetition; the same value is also used for ProtGPT.
 
- Below is code using the Transformers library to generate sequences with ProGemma.
 
- from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
 
- model = AutoModelForCausalLM.from_pretrained("JuIm/ProGemma")
 
- tokenizer = AutoTokenizer.from_pretrained("JuIm/Amino-Acid-Sequence-Tokenizer")
 
- progemma = pipeline("text-generation", model=model, tokenizer=tokenizer)
 
- sequence = progemma("<bos>", top_k=950, max_length=100, num_return_sequences=1, do_sample=True, repetition_penalty=1.2, eos_token_id=21, pad_token_id=22, bos_token_id=20)
 
- s = sequence[0]['generated_text']
 
- print(s)
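As a usage note, here is a minimal sketch, building on the snippet above, of sampling several candidate sequences in one call and stripping special tokens before downstream use. The checkpoints, sampling settings, and token IDs come from the README; the assumption that special tokens appear as literal strings such as `<bos>` in `generated_text` is mine.

```python
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

# Same checkpoints as in the README above.
model = AutoModelForCausalLM.from_pretrained("JuIm/ProGemma")
tokenizer = AutoTokenizer.from_pretrained("JuIm/Amino-Acid-Sequence-Tokenizer")
progemma = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Sample a batch of candidates with the sampling settings recommended above.
outputs = progemma(
    "<bos>",
    top_k=950,
    max_length=100,
    num_return_sequences=5,   # several candidates per call
    do_sample=True,
    repetition_penalty=1.2,
    eos_token_id=21,
    pad_token_id=22,
    bos_token_id=20,
)

for out in outputs:
    # Drop any literal special-token strings before downstream use
    # (assumption: special tokens appear verbatim in generated_text).
    seq = out["generated_text"]
    for tok in ("<bos>", "<eos>", "<pad>"):
        seq = seq.replace(tok, "")
    print(seq.strip())
```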
 
 ### Framework versions
 
 # ProGemma2
 
+ This model is a fine-tuned version of [JuIm/ProGemma2](https://huggingface.co/JuIm/ProGemma2) on an unknown dataset.
 
+ ## Model description
 
+ More information needed
 
+ ## Intended uses & limitations
 
+ More information needed
 
+ ## Training and evaluation data
 
+ More information needed
 
+ ## Training procedure
 
+ ### Training hyperparameters
 
+ The following hyperparameters were used during training (a reproduction sketch follows the list):
+ - learning_rate: 0.001
+ - train_batch_size: 2
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.4
+ - training_steps: 3500
+
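The sketch below is a rough, hypothetical mapping of the hyperparameters listed above onto `transformers.TrainingArguments`; it is not the author's training script, and the output directory plus the omitted `Trainer`/dataset wiring are placeholders.

```python
from transformers import TrainingArguments

# Minimal sketch mapping the listed hyperparameters onto TrainingArguments.
# "progemma2-output" is a hypothetical directory; the dataset and Trainer
# wiring are omitted because the card does not specify them.
training_args = TrainingArguments(
    output_dir="progemma2-output",   # hypothetical
    learning_rate=1e-3,              # learning_rate: 0.001
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    adam_beta1=0.9,                  # optimizer: Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # epsilon: 1e-08
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    warmup_ratio=0.4,                # lr_scheduler_warmup_ratio: 0.4
    max_steps=3500,                  # training_steps: 3500
)
```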
 
+ ### Training results
 
 ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:fbb3f41d3d48678d86d8a32b153009d262228a0b5979137841eb1f3b6d372e5e
+ oid sha256:524819aa1c4139d2aef89e1a15459a7fd5a19fba72abd7151aed2ef75ea93b49
 size 1342562152
runs/Aug30_17-36-11_627593dcc452/events.out.tfevents.1725039375.627593dcc452.214.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06c57b4fe3aa29c3b8646ee9610235a1a0a8eaeaeb8bfc7b670d2ee72481906c
+ size 743330
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:c255ba09d53656284876d42a0b5f56a46cea78f516624bc40992b52c50d49f78
+ oid sha256:f82693a5bddefae6d93466903dd3b93b7a2eae558a42057b5c4581f02fb4525a
 size 5112