Gemma-2b-MultiCap / README.md
sofyc's picture
commit
1c7f25b verified
metadata
base_model: google/gemma-2-2b-it
library_name: peft
license: gemma
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: Gemma-2b-MultiCap
    results: []

Gemma-2b-MultiCap

This model is a fine-tuned version of google/gemma-2-2b-it on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5983

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 600

Training results

Training Loss Epoch Step Validation Loss
0.8045 0.0564 50 0.8067
0.7271 0.1128 100 0.6777
0.688 0.1692 150 0.6309
0.6268 0.2256 200 0.6176
0.572 0.2820 250 0.6118
0.5864 0.3384 300 0.6065
0.5528 0.3948 350 0.6030
0.5396 0.4512 400 0.6015
0.5726 0.5076 450 0.6005
0.5655 0.5640 500 0.5997
0.5712 0.6204 550 0.5988
0.5213 0.6768 600 0.5983

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • Pytorch 2.4.0+cu124
  • Datasets 2.21.0
  • Tokenizers 0.19.1