SKN24_3rd_2Team

This model is a fine-tuned version of google/gemma-3-12b-it on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0774

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Use OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.03
  • num_epochs: 2

Training results

Training Loss Epoch Step Validation Loss
1.2075 0.1752 300 1.3187
1.3937 0.3504 600 1.2513
1.2273 0.5257 900 1.1996
1.2854 0.7009 1200 1.1728
1.1348 0.8761 1500 1.1459
0.9602 1.0508 1800 1.1295
1.0161 1.2260 2100 1.1074
0.9347 1.4013 2400 1.0975
0.8680 1.5765 2700 1.0838
0.8655 1.7517 3000 1.0784
0.8469 1.9269 3300 1.0778
0.8745 2.0 3426 1.0774

Framework versions

  • PEFT 0.18.1
  • Transformers 5.5.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.8.4
  • Tokenizers 0.22.2
Downloads last month
6
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for YHPark0208/SKN24_3rd_2Team

Adapter
(368)
this model