JuIm committed · verified
Commit f7e6d72 · 1 Parent(s): cc531e9

Update README.md

Files changed (1)
  1. README.md +1 -31
README.md CHANGED
@@ -12,37 +12,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # ProGemma
 
- This model is a fine-tuned version of [JuIm/ProGemma](https://huggingface.co/JuIm/ProGemma) on an unknown dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.001
- - train_batch_size: 1
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.4
- - training_steps: 7000
-
- ### Training results
-
-
+ This is a custom configuration (275M parameters) of a Gemma 2 LLM that is being pre-trained on a corpus of amino acid sequences, with the goal of generating de novo amino acid sequences in a zero-shot fashion. As of August 14, 2024, the model has been trained on 55% of the dataset; a new version will be pushed as training reaches each new checkpoint. Preliminary evaluation shows perplexity, HHblits E-value, pLDDT, pTM, and ipTM scores on par with ProtGPT2; a full evaluation will be done once training is complete. The model can be accessed with the transformers.pipeline function; for the tokenizer, use "JuIm/Amino-Acid-Sequence-Tokenizer".
 
  ### Framework versions
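
The README section removed by this commit listed the training hyperparameters. For reference only, a minimal sketch of how those values might map onto `transformers.TrainingArguments`; this mapping and the output directory name are assumptions, not the training script actually used:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the removed hyperparameter list;
# "progemma-pretrain" is a placeholder output directory.
training_args = TrainingArguments(
    output_dir="progemma-pretrain",
    learning_rate=1e-3,              # learning_rate: 0.001
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    seed=42,                         # seed: 42
    adam_beta1=0.9,                  # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # and epsilon=1e-08
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    warmup_ratio=0.4,                # lr_scheduler_warmup_ratio: 0.4
    max_steps=7000,                  # training_steps: 7000
)
```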
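
The new description says the model can be accessed with the `transformers.pipeline` function together with the "JuIm/Amino-Acid-Sequence-Tokenizer" tokenizer. A minimal sketch of that usage, assuming the model id `JuIm/ProGemma` (the repository this commit belongs to) and illustrative generation settings:

```python
from transformers import AutoTokenizer, pipeline

# Tokenizer named in the README; the model id is assumed to be the repo itself.
tokenizer = AutoTokenizer.from_pretrained("JuIm/Amino-Acid-Sequence-Tokenizer")
generator = pipeline("text-generation", model="JuIm/ProGemma", tokenizer=tokenizer)

# Zero-shot de novo sequence generation; the "M" prompt and sampling
# parameters are illustrative, not values recommended by the authors.
outputs = generator("M", max_new_tokens=200, do_sample=True, top_k=50)
print(outputs[0]["generated_text"])
```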