tiny_bert_bc_rand_5_v1

This model was fine-tuned on the Hartunka/processed_book_corpus-rand-5 dataset; the base checkpoint is not specified. It achieves the following results on the evaluation set (a perplexity conversion is sketched after the list):

  • Loss: 3.1055
  • Accuracy: 0.6822
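
If the reported loss is the mean token-level cross-entropy (an assumption; the card does not state how the loss is defined), it corresponds to a perplexity of roughly exp(3.1055) ≈ 22.3:

```python
# Hedged conversion: perplexity from the reported evaluation loss,
# assuming it is an average negative log-likelihood per (masked) token.
import math

eval_loss = 3.1055
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 22.3
```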

Model description

More information needed

Intended uses & limitations

More information needed
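
As an illustration only: assuming the checkpoint is a BERT-style masked language model published on the Hugging Face Hub (the repo id below is an assumed placeholder, not confirmed by this card), it could be exercised with the fill-mask pipeline:

```python
# Hedged illustration: masked-token prediction with this checkpoint.
# "Hartunka/tiny_bert_bc_rand_5_v1" is an assumed repo id; replace with the real one.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="Hartunka/tiny_bert_bc_rand_5_v1")
print(fill_mask("The book was very [MASK]."))
```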

Training and evaluation data

More information needed
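
The only detail given elsewhere in the card is the dataset name, Hartunka/processed_book_corpus-rand-5. As a hedged sketch (split names and column layout are not stated here and should be checked against the dataset card), it can be inspected with datasets.load_dataset:

```python
# Hedged sketch: loading the dataset named in this card for inspection.
# Split names and columns are unknown here; printing the DatasetDict shows them.
from datasets import load_dataset

ds = load_dataset("Hartunka/processed_book_corpus-rand-5")
print(ds)  # available splits, columns, and row counts
```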

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 96
  • eval_batch_size: 96
  • seed: 10
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 25
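
The training script itself is not part of this card. As a hedged sketch only, the values above map onto transformers.TrainingArguments roughly as follows (output_dir and anything not listed above are assumptions):

```python
# Hedged sketch: TrainingArguments mirroring the hyperparameters listed above.
# output_dir is hypothetical; any setting not named in the card is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tiny_bert_bc_rand_5_v1",
    learning_rate=1e-4,
    per_device_train_batch_size=96,
    per_device_eval_batch_size=96,
    seed=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=25,
)
```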

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 7.2824 | 0.4215 | 10000 | 7.1207 | 0.1583 |
| 7.1705 | 0.8431 | 20000 | 6.9795 | 0.1726 |
| 4.6634 | 1.2646 | 30000 | 4.2812 | 0.5084 |
| 4.2844 | 1.6861 | 40000 | 3.9252 | 0.5556 |
| 4.0555 | 2.1077 | 50000 | 3.7210 | 0.5845 |
| 3.9068 | 2.5292 | 60000 | 3.5822 | 0.6047 |
| 3.8047 | 2.9507 | 70000 | 3.4940 | 0.6183 |
| 3.734 | 3.3723 | 80000 | 3.4309 | 0.6281 |
| 3.678 | 3.7938 | 90000 | 3.3727 | 0.6365 |
| 3.6347 | 4.2153 | 100000 | 3.3363 | 0.6426 |
| 3.6013 | 4.6369 | 110000 | 3.3058 | 0.6471 |
| 3.5724 | 5.0584 | 120000 | 3.2759 | 0.6515 |
| 3.5458 | 5.4799 | 130000 | 3.2562 | 0.6551 |
| 3.53 | 5.9014 | 140000 | 3.2334 | 0.6588 |
| 3.5054 | 6.3230 | 150000 | 3.2178 | 0.6610 |
| 3.4971 | 6.7445 | 160000 | 3.2043 | 0.6632 |
| 3.4724 | 7.1660 | 170000 | 3.1913 | 0.6659 |
| 3.4615 | 7.5876 | 180000 | 3.1805 | 0.6671 |
| 3.4516 | 8.0091 | 190000 | 3.1677 | 0.6690 |
| 3.4405 | 8.4306 | 200000 | 3.1585 | 0.6710 |
| 3.4284 | 8.8522 | 210000 | 3.1508 | 0.6722 |
| 3.4212 | 9.2737 | 220000 | 3.1426 | 0.6731 |
| 3.4078 | 9.6952 | 230000 | 3.1356 | 0.6746 |
| 3.3963 | 10.1168 | 240000 | 3.1310 | 0.6762 |
| 3.3982 | 10.5383 | 250000 | 3.1214 | 0.6771 |
| 3.3879 | 10.9598 | 260000 | 3.1157 | 0.6781 |
| 3.3778 | 11.3814 | 270000 | 3.1170 | 0.6780 |
| 3.3824 | 11.8029 | 280000 | 3.1097 | 0.6796 |
| 3.3619 | 12.2244 | 290000 | 3.1111 | 0.6801 |
| 3.3641 | 12.6460 | 300000 | 3.1050 | 0.6809 |
| 3.3471 | 13.0675 | 310000 | 3.1109 | 0.6812 |
| 3.3512 | 13.4890 | 320000 | 3.1056 | 0.6817 |
| 3.3523 | 13.9106 | 330000 | 3.1034 | 0.6820 |
| 3.3385 | 14.3321 | 340000 | 3.1033 | 0.6824 |
| 3.3393 | 14.7536 | 350000 | 3.1061 | 0.6828 |
| 3.3183 | 15.1751 | 360000 | 3.1164 | 0.6829 |
| 3.3261 | 15.5967 | 370000 | 3.1082 | 0.6833 |
| 3.2993 | 16.0182 | 380000 | 3.1230 | 0.6837 |
| 3.3079 | 16.4397 | 390000 | 3.1179 | 0.6836 |
| 3.3071 | 16.8613 | 400000 | 3.1099 | 0.6837 |
| 3.2867 | 17.2828 | 410000 | 3.1292 | 0.6843 |
| 3.2879 | 17.7043 | 420000 | 3.1274 | 0.6841 |
| 3.2591 | 18.1259 | 430000 | 3.1492 | 0.6841 |
| 3.265 | 18.5474 | 440000 | 3.1469 | 0.6839 |
| 3.2726 | 18.9689 | 450000 | 3.1432 | 0.6846 |
| 3.2429 | 19.3905 | 460000 | 3.1709 | 0.6847 |
| 3.2518 | 19.8120 | 470000 | 3.1598 | 0.6846 |
| 3.2136 | 20.2335 | 480000 | 3.1932 | 0.6846 |
| 3.2214 | 20.6551 | 490000 | 3.1829 | 0.6848 |
| 3.1855 | 21.0766 | 500000 | 3.1981 | 0.6848 |
| 3.1918 | 21.4981 | 510000 | 3.2118 | 0.6851 |
| 3.2026 | 21.9197 | 520000 | 3.1942 | 0.6851 |
| 3.1785 | 22.3412 | 530000 | 3.2237 | 0.6851 |
| 3.1744 | 22.7627 | 540000 | 3.2269 | 0.6853 |
| 3.1528 | 23.1843 | 550000 | 3.2419 | 0.6853 |
| 3.155 | 23.6058 | 560000 | 3.2405 | 0.6858 |
| 3.1321 | 24.0273 | 570000 | 3.2574 | 0.6859 |
| 3.1351 | 24.4488 | 580000 | 3.2498 | 0.6862 |
| 3.133 | 24.8704 | 590000 | 3.2519 | 0.6859 |

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.19.1

Model size

  • 33.3M params (Safetensors, F32)

Evaluation results

  • Accuracy on Hartunka/processed_book_corpus-rand-5 (self-reported): 0.682