koelectra-small-klue-mrc

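This model is a fine-tuned version of monologg/koelectra-small-discriminator. The fine-tuning dataset is not recorded. It achieves the following result on the evaluation set:

  • Loss: 6.8862 (final checkpoint at epoch 200, step 2000; the lowest validation loss in the training log is 4.6946 at epoch 12)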

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 200
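
As a rough illustration of the settings listed above, the sketch below writes them as keyword arguments in the naming used by `transformers.TrainingArguments` and computes the learning rate that a linear schedule would produce at a given step. Several things here are assumptions rather than facts from this card: that training ran on a single device (so `train_batch_size` maps to `per_device_train_batch_size`), that no warmup steps were used (none are listed), and that the schedule spans the 2000 steps shown in the training results. The names `TRAINING_KWARGS` and `linear_lr` are hypothetical helpers for this example only.

```python
# Hyperparameters from this card, written with TrainingArguments-style names.
# Assumption: single device, so the card's batch sizes are per-device values.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 200,
}

TOTAL_STEPS = 2000   # last step reported in the training results
WARMUP_STEPS = 0     # assumption: the card does not list any warmup


def linear_lr(step: int,
              base_lr: float = TRAINING_KWARGS["learning_rate"],
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 500, 1000, 1500, 2000):
        print(f"step {step:4d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate starts at 2e-05, is halved by step 1000 (epoch 100), and reaches zero at step 2000.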

Training results

Training Loss Epoch Step Validation Loss
No log 1.0 10 5.1798
No log 2.0 20 5.1096
No log 3.0 30 5.0458
No log 4.0 40 4.9836
No log 5.0 50 4.9277
No log 6.0 60 4.8793
No log 7.0 70 4.8308
No log 8.0 80 4.7832
No log 9.0 90 4.7425
No log 10.0 100 4.7286
No log 11.0 110 4.7068
No log 12.0 120 4.6946
No log 13.0 130 4.7491
No log 14.0 140 4.7647
No log 15.0 150 4.8638
No log 16.0 160 4.8820
No log 17.0 170 5.0045
No log 18.0 180 5.0747
No log 19.0 190 5.2426
No log 20.0 200 5.1869
No log 21.0 210 5.2337
No log 22.0 220 5.2947
No log 23.0 230 5.3827
No log 24.0 240 5.5095
No log 25.0 250 5.5749
No log 26.0 260 5.5239
No log 27.0 270 5.5983
No log 28.0 280 5.6924
No log 29.0 290 5.7221
No log 30.0 300 5.7887
No log 31.0 310 5.7415
No log 32.0 320 5.8169
No log 33.0 330 5.8756
No log 34.0 340 5.9136
No log 35.0 350 5.8803
No log 36.0 360 5.9336
No log 37.0 370 5.8968
No log 38.0 380 5.9618
No log 39.0 390 5.9325
No log 40.0 400 5.9454
No log 41.0 410 5.9459
No log 42.0 420 6.0392
No log 43.0 430 6.0107
No log 44.0 440 6.0393
No log 45.0 450 5.9902
No log 46.0 460 6.0447
No log 47.0 470 6.0219
No log 48.0 480 6.0782
No log 49.0 490 6.0608
2.724 50.0 500 6.0360
2.724 51.0 510 6.1162
2.724 52.0 520 6.0687
2.724 53.0 530 6.0891
2.724 54.0 540 6.0275
2.724 55.0 550 6.1184
2.724 56.0 560 6.1191
2.724 57.0 570 6.0886
2.724 58.0 580 6.1525
2.724 59.0 590 6.0594
2.724 60.0 600 6.1628
2.724 61.0 610 6.1402
2.724 62.0 620 6.1084
2.724 63.0 630 6.1631
2.724 64.0 640 6.1453
2.724 65.0 650 6.0526
2.724 66.0 660 6.1930
2.724 67.0 670 6.1401
2.724 68.0 680 6.1311
2.724 69.0 690 6.1469
2.724 70.0 700 6.1566
2.724 71.0 710 6.1668
2.724 72.0 720 6.1641
2.724 73.0 730 6.2553
2.724 74.0 740 6.2870
2.724 75.0 750 6.1782
2.724 76.0 760 6.2170
2.724 77.0 770 6.2451
2.724 78.0 780 6.1456
2.724 79.0 790 6.4069
2.724 80.0 800 6.1756
2.724 81.0 810 6.3466
2.724 82.0 820 6.2901
2.724 83.0 830 6.3088
2.724 84.0 840 6.3833
2.724 85.0 850 6.2229
2.724 86.0 860 6.2957
2.724 87.0 870 6.4170
2.724 88.0 880 6.3442
2.724 89.0 890 6.3048
2.724 90.0 900 6.3033
2.724 91.0 910 6.4667
2.724 92.0 920 6.3638
2.724 93.0 930 6.3499
2.724 94.0 940 6.4794
2.724 95.0 950 6.5237
2.724 96.0 960 6.4191
2.724 97.0 970 6.4382
2.724 98.0 980 6.5362
2.724 99.0 990 6.5435
0.97 100.0 1000 6.5632
0.97 101.0 1010 6.4792
0.97 102.0 1020 6.5099
0.97 103.0 1030 6.4869
0.97 104.0 1040 6.4560
0.97 105.0 1050 6.5565
0.97 106.0 1060 6.5755
0.97 107.0 1070 6.5358
0.97 108.0 1080 6.5162
0.97 109.0 1090 6.5295
0.97 110.0 1100 6.5194
0.97 111.0 1110 6.5133
0.97 112.0 1120 6.5235
0.97 113.0 1130 6.5547
0.97 114.0 1140 6.6247
0.97 115.0 1150 6.6352
0.97 116.0 1160 6.7224
0.97 117.0 1170 6.6779
0.97 118.0 1180 6.7190
0.97 119.0 1190 6.6232
0.97 120.0 1200 6.6254
0.97 121.0 1210 6.6257
0.97 122.0 1220 6.5869
0.97 123.0 1230 6.6805
0.97 124.0 1240 6.7540
0.97 125.0 1250 6.7129
0.97 126.0 1260 6.7146
0.97 127.0 1270 6.7396
0.97 128.0 1280 6.5899
0.97 129.0 1290 6.6859
0.97 130.0 1300 6.7992
0.97 131.0 1310 6.7338
0.97 132.0 1320 6.7206
0.97 133.0 1330 6.6792
0.97 134.0 1340 6.6909
0.97 135.0 1350 6.7726
0.97 136.0 1360 6.8457
0.97 137.0 1370 6.8488
0.97 138.0 1380 6.7220
0.97 139.0 1390 6.6795
0.97 140.0 1400 6.7525
0.97 141.0 1410 6.8977
0.97 142.0 1420 6.8635
0.97 143.0 1430 6.8075
0.97 144.0 1440 6.7010
0.97 145.0 1450 6.7365
0.97 146.0 1460 6.7015
0.97 147.0 1470 6.8077
0.97 148.0 1480 6.7510
0.97 149.0 1490 6.6576
0.5908 150.0 1500 6.6435
0.5908 151.0 1510 6.6683
0.5908 152.0 1520 6.8315
0.5908 153.0 1530 6.8130
0.5908 154.0 1540 6.7403
0.5908 155.0 1550 6.6926
0.5908 156.0 1560 6.7279
0.5908 157.0 1570 6.7882
0.5908 158.0 1580 6.8302
0.5908 159.0 1590 6.7385
0.5908 160.0 1600 6.7678
0.5908 161.0 1610 6.7250
0.5908 162.0 1620 6.7033
0.5908 163.0 1630 6.7048
0.5908 164.0 1640 6.7476
0.5908 165.0 1650 6.7287
0.5908 166.0 1660 6.7813
0.5908 167.0 1670 6.8700
0.5908 168.0 1680 6.9388
0.5908 169.0 1690 6.9385
0.5908 170.0 1700 6.9081
0.5908 171.0 1710 6.8223
0.5908 172.0 1720 6.7192
0.5908 173.0 1730 6.7037
0.5908 174.0 1740 6.7607
0.5908 175.0 1750 6.8238
0.5908 176.0 1760 6.8259
0.5908 177.0 1770 6.8644
0.5908 178.0 1780 6.8524
0.5908 179.0 1790 6.8385
0.5908 180.0 1800 6.8229
0.5908 181.0 1810 6.7635
0.5908 182.0 1820 6.7910
0.5908 183.0 1830 6.8497
0.5908 184.0 1840 6.8718
0.5908 185.0 1850 6.8951
0.5908 186.0 1860 6.8877
0.5908 187.0 1870 6.9173
0.5908 188.0 1880 6.9286
0.5908 189.0 1890 6.9086
0.5908 190.0 1900 6.9076
0.5908 191.0 1910 6.9121
0.5908 192.0 1920 6.8990
0.5908 193.0 1930 6.9119
0.5908 194.0 1940 6.9147
0.5908 195.0 1950 6.9102
0.5908 196.0 1960 6.9094
0.5908 197.0 1970 6.8977
0.5908 198.0 1980 6.8941
0.5908 199.0 1990 6.8879
0.4772 200.0 2000 6.8862
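
The table shows validation loss falling until epoch 12 and rising afterwards while training loss keeps dropping, which is the usual signature of overfitting. The sketch below is one way to read such a log: it picks the epoch with the lowest validation loss and simulates a simple patience-based early-stopping rule. The rows are copied from the first 16 epochs of the table; the function names and the patience value of 3 are hypothetical choices for illustration, not settings used to train this model.

```python
from typing import List, Optional, Tuple

# (epoch, validation_loss) rows copied from the first 16 epochs of the table.
VAL_LOG: List[Tuple[int, float]] = [
    (1, 5.1798), (2, 5.1096), (3, 5.0458), (4, 4.9836),
    (5, 4.9277), (6, 4.8793), (7, 4.8308), (8, 4.7832),
    (9, 4.7425), (10, 4.7286), (11, 4.7068), (12, 4.6946),
    (13, 4.7491), (14, 4.7647), (15, 4.8638), (16, 4.8820),
]


def best_epoch(log: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the (epoch, loss) row with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def early_stop_epoch(log: List[Tuple[int, float]], patience: int) -> Optional[int]:
    """Return the epoch at which training would stop, or None if it never triggers.

    Training stops once `patience` consecutive evaluations fail to improve
    on the best validation loss seen so far.
    """
    best = float("inf")
    stale = 0  # evaluations since the last improvement
    for epoch, loss in log:
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


if __name__ == "__main__":
    print("best:", best_epoch(VAL_LOG))
    print("stop at epoch:", early_stop_epoch(VAL_LOG, patience=3))
```

On these rows the lowest validation loss is at epoch 12, and a patience of 3 would halt training at epoch 15. When training with the Trainer API, `transformers.EarlyStoppingCallback` offers comparable behavior.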

Framework versions

  • Transformers 4.53.2
  • Pytorch 2.6.0+cu124
  • Datasets 4.0.0
  • Tokenizers 0.21.2

Model size: 13.7M params (tensor type F32, Safetensors format)