---
library_name: transformers
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-135M
tags:
  - smol-course
  - module_1
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: SmolLM2-FT-MyDataset
    results: []
---

# SmolLM2-FT-MyDataset

This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 0.9266
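
The snippet below is a minimal sketch of loading the checkpoint for text generation with `transformers`; the repo id `Jasonnn13/SmolLM2-FT-MyDataset` and the example prompt are assumptions, not part of the original card.

```python
# Minimal inference sketch (assumed repo id and prompt; adjust as needed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Jasonnn13/SmolLM2-FT-MyDataset"  # assumption: the actual repo path may differ

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Write a short explanation of fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```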

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 1000
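
A hedged sketch of how these hyperparameters could be wired into TRL's `SFTTrainer` is shown below. The dataset, output directory, and evaluation cadence are assumptions (the card does not name the training data), and some keyword names vary slightly across TRL releases.

```python
# Sketch only: reproduces the listed hyperparameters, not the exact training run.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

model_name = "HuggingFaceTB/SmolLM2-135M"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Placeholder dataset; the card does not say what data was used.
dataset = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")

args = SFTConfig(
    output_dir="./SmolLM2-FT-MyDataset",   # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=1000,
    eval_strategy="steps",
    eval_steps=50,                          # assumed from the 50-step eval logs below
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,                    # "processing_class" in newer TRL releases
)
trainer.train()
```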

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.0161        | 0.0982 | 50   | 1.0429          |
| 0.9943        | 0.1965 | 100  | 1.0086          |
| 0.953         | 0.2947 | 150  | 0.9906          |
| 0.9561        | 0.3929 | 200  | 0.9802          |
| 0.9585        | 0.4912 | 250  | 0.9699          |
| 0.9394        | 0.5894 | 300  | 0.9597          |
| 0.9284        | 0.6876 | 350  | 0.9535          |
| 0.9237        | 0.7859 | 400  | 0.9462          |
| 0.8945        | 0.8841 | 450  | 0.9402          |
| 0.9043        | 0.9823 | 500  | 0.9348          |
| 0.7479        | 1.0806 | 550  | 0.9421          |
| 0.805         | 1.1788 | 600  | 0.9388          |
| 0.7359        | 1.2770 | 650  | 0.9382          |
| 0.7437        | 1.3752 | 700  | 0.9365          |
| 0.7778        | 1.4735 | 750  | 0.9336          |
| 0.7678        | 1.5717 | 800  | 0.9317          |
| 0.7452        | 1.6699 | 850  | 0.9301          |
| 0.7373        | 1.7682 | 900  | 0.9279          |
| 0.738         | 1.8664 | 950  | 0.9269          |
| 0.7513        | 1.9646 | 1000 | 0.9266          |

### Framework versions

- Transformers 4.44.2
- PyTorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.19.1
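
To check a local environment against the versions reported above, a small convenience snippet (not part of the original card):

```python
# Print installed library versions to compare with the ones listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)
print("PyTorch:", torch.__version__)
print("Datasets:", datasets.__version__)
print("Tokenizers:", tokenizers.__version__)
```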