fr_childes_30

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.1358
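
Assuming the reported loss is the mean per-token cross-entropy of a causal language model (the card does not state the architecture or objective), it can be converted to perplexity with exp(loss); a quick check:

```python
import math

eval_loss = 4.1358  # final validation loss reported above
print(f"perplexity ≈ {math.exp(eval_loss):.1f}")  # ≈ 62.5
```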

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 30
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
  • mixed_precision_training: Native AMP
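
For reference, these settings map onto Hugging Face's transformers.TrainingArguments roughly as follows. This is a sketch, not the card's actual training script; output_dir is a placeholder, and the Adam betas/epsilon listed above are already the TrainingArguments defaults, so they are not repeated here.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="fr_childes_30",      # placeholder, not from the card
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=30,
    gradient_accumulation_steps=2,   # total train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    warmup_steps=40_000,
    max_steps=100_000,
    fp16=True,                       # "Native AMP" mixed-precision training
)
```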

Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| No log        | 2.5   | 2000   | 6.5797          |
| 6.5322        | 5.0   | 4000   | 4.9212          |
| 6.5322        | 7.5   | 6000   | 4.4626          |
| 4.3003        | 10.0  | 8000   | 4.1895          |
| 4.3003        | 12.5  | 10000  | 4.0062          |
| 3.8146        | 15.0  | 12000  | 3.8694          |
| 3.8146        | 17.5  | 14000  | 3.7580          |
| 3.5419        | 20.0  | 16000  | 3.6665          |
| 3.5419        | 22.5  | 18000  | 3.5861          |
| 3.34          | 25.0  | 20000  | 3.5180          |
| 3.34          | 27.5  | 22000  | 3.4594          |
| 3.173         | 30.0  | 24000  | 3.4096          |
| 3.173         | 32.5  | 26000  | 3.3716          |
| 3.0285        | 35.0  | 28000  | 3.3410          |
| 3.0285        | 37.5  | 30000  | 3.3266          |
| 2.9006        | 40.0  | 32000  | 3.3082          |
| 2.9006        | 42.5  | 34000  | 3.3108          |
| 2.7827        | 45.0  | 36000  | 3.3069          |
| 2.7827        | 47.5  | 38000  | 3.3296          |
| 2.6704        | 50.0  | 40000  | 3.3364          |
| 2.6704        | 52.5  | 42000  | 3.3557          |
| 2.5513        | 55.0  | 44000  | 3.3838          |
| 2.5513        | 57.5  | 46000  | 3.4278          |
| 2.4284        | 60.0  | 48000  | 3.4365          |
| 2.4284        | 62.5  | 50000  | 3.4943          |
| 2.3193        | 65.0  | 52000  | 3.5162          |
| 2.3193        | 67.5  | 54000  | 3.5718          |
| 2.222         | 70.0  | 56000  | 3.5986          |
| 2.222         | 72.5  | 58000  | 3.6408          |
| 2.1359        | 75.0  | 60000  | 3.6674          |
| 2.1359        | 77.5  | 62000  | 3.7169          |
| 2.0591        | 80.0  | 64000  | 3.7447          |
| 2.0591        | 82.5  | 66000  | 3.7906          |
| 1.99          | 85.0  | 68000  | 3.8129          |
| 1.99          | 87.5  | 70000  | 3.8578          |
| 1.9296        | 90.0  | 72000  | 3.8741          |
| 1.9296        | 92.5  | 74000  | 3.9088          |
| 1.8742        | 95.0  | 76000  | 3.9329          |
| 1.8742        | 97.5  | 78000  | 3.9674          |
| 1.825         | 100.0 | 80000  | 3.9837          |
| 1.825         | 102.5 | 82000  | 4.0117          |
| 1.7804        | 105.0 | 84000  | 4.0291          |
| 1.7804        | 107.5 | 86000  | 4.0558          |
| 1.7408        | 110.0 | 88000  | 4.0708          |
| 1.7408        | 112.5 | 90000  | 4.0871          |
| 1.7057        | 115.0 | 92000  | 4.1012          |
| 1.7057        | 117.5 | 94000  | 4.1155          |
| 1.6754        | 120.0 | 96000  | 4.1225          |
| 1.6754        | 122.5 | 98000  | 4.1347          |
| 1.6504        | 125.0 | 100000 | 4.1358          |
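
The curve above shows a classic overfitting pattern: validation loss bottoms out at 3.3069 around step 36,000 (epoch 45) and climbs steadily afterwards while training loss keeps falling, so a checkpoint from the middle of the run would likely generalize better than the final one. A quick sanity check on a few (step, validation loss) pairs copied from the table:

```python
import math

# (step, validation loss) pairs copied from the table above (abridged).
val_losses = [(32000, 3.3082), (34000, 3.3108), (36000, 3.3069),
              (38000, 3.3296), (100000, 4.1358)]

best_step, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_step, best_loss)                      # 36000 3.3069
print(f"best ppl  ≈ {math.exp(best_loss):.1f}")  # ≈ 27.3
print(f"final ppl ≈ {math.exp(4.1358):.1f}")     # ≈ 62.5
```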

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1

Model size

  • 12.7M parameters (F32, Safetensors)
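
If the checkpoint is a standard Transformers causal language model (the card does not say; the hub repo id below is likewise a guess based on the title), it could be loaded and its parameter count verified like this:

```python
from transformers import AutoModelForCausalLM

# "fr_childes_30" is the card title; the actual hub id may include a
# user/organization prefix, and the architecture is an assumption.
model = AutoModelForCausalLM.from_pretrained("fr_childes_30")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # expect ≈ 12.7M
```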