---
library_name: transformers
license: apache-2.0
base_model: x2bee/KoModernBERT-base-mlm-v03-retry-ckp02
tags:
  - generated_from_trainer
model-index:
  - name: KMB_SimCSE_test
    results: []
---

# KMB_SimCSE_test

This model is a fine-tuned version of [x2bee/KoModernBERT-base-mlm-v03-retry-ckp02](https://huggingface.co/x2bee/KoModernBERT-base-mlm-v03-retry-ckp02) on an unknown dataset. It achieves the following results on the evaluation set (an illustrative sketch of how these correlation metrics are computed follows the list):

- Loss: 0.0355
- Pearson Cosine: 0.8274
- Spearman Cosine: 0.8298
- Pearson Manhattan: 0.8125
- Spearman Manhattan: 0.8227
- Pearson Euclidean: 0.8113
- Spearman Euclidean: 0.8215
- Pearson Dot: 0.7647
- Spearman Dot: 0.7648
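
Each Pearson/Spearman pair above correlates a similarity score derived from the embeddings (cosine, Manhattan, Euclidean, or dot product) with gold similarity labels over evaluation sentence pairs. A minimal illustration with made-up numbers, since the card reports only the aggregate correlations:

```python
from scipy.stats import pearsonr, spearmanr

# Illustrative only: gold STS-style labels vs. the model's cosine similarities.
# These values are invented; the card publishes only aggregate correlations.
gold_scores = [4.2, 0.5, 3.1, 1.8, 2.6]
cosine_sims = [0.83, 0.12, 0.71, 0.40, 0.58]

print("Pearson cosine :", pearsonr(cosine_sims, gold_scores)[0])
print("Spearman cosine:", spearmanr(cosine_sims, gold_scores)[0])
```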

## Model description

More information needed

## Intended uses & limitations

More information needed
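
Until the authors fill this section in, here is a minimal usage sketch for sentence embeddings, assuming the checkpoint loads as a plain encoder via `transformers` and that mean pooling over the last hidden state is appropriate (the hub path, pooling strategy, and example sentences are assumptions, not confirmed by this card):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "x2bee/KMB_SimCSE_test"  # assumed hub path; adjust to the real repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

def embed(sentences):
    """Mean-pool the last hidden state over non-padding tokens (assumed pooling)."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state         # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # (B, H)

emb = embed(["한 남자가 기타를 친다.", "남자가 악기를 연주한다."])
print(F.cosine_similarity(emb[0:1], emb[1:2]).item())     # higher = more similar
```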

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2.0
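
For reference, a sketch of the equivalent 🤗 `TrainingArguments` (the output directory is a placeholder, and the 100-step evaluation cadence is inferred from the results table below rather than stated in the list):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="KMB_SimCSE_test",    # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=8,   # effective batch size 128 per the card
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=2.0,
    eval_strategy="steps",
    eval_steps=100,                  # inferred from the results table
)
```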

### Training results

| Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:---------------:|:-----------------:|:------------------:|:-----------------:|:------------------:|:-----------:|:------------:|
| 0.596         | 0.0469 | 100  | 0.0828 | 0.7829 | 0.7827 | 0.7914 | 0.7948 | 0.7911 | 0.7954 | 0.6840 | 0.6773 |
| 0.4303        | 0.0937 | 200  | 0.0730 | 0.7867 | 0.7920 | 0.7977 | 0.8026 | 0.7981 | 0.8036 | 0.7104 | 0.7050 |
| 0.409         | 0.1406 | 300  | 0.0649 | 0.8013 | 0.8024 | 0.8094 | 0.8136 | 0.8093 | 0.8141 | 0.7203 | 0.7128 |
| 0.3979        | 0.1874 | 400  | 0.0562 | 0.8114 | 0.8115 | 0.8150 | 0.8187 | 0.8144 | 0.8188 | 0.7383 | 0.7353 |
| 0.4072        | 0.2343 | 500  | 0.0635 | 0.8095 | 0.8150 | 0.8159 | 0.8223 | 0.8155 | 0.8221 | 0.7243 | 0.7169 |
| 0.3625        | 0.2812 | 600  | 0.0590 | 0.8091 | 0.8144 | 0.8111 | 0.8165 | 0.8107 | 0.8163 | 0.7392 | 0.7378 |
| 0.3796        | 0.3280 | 700  | 0.0638 | 0.8146 | 0.8208 | 0.8168 | 0.8234 | 0.8164 | 0.8232 | 0.7567 | 0.7549 |
| 0.3474        | 0.3749 | 800  | 0.0496 | 0.8119 | 0.8182 | 0.8252 | 0.8302 | 0.8249 | 0.8303 | 0.7310 | 0.7270 |
| 0.3159        | 0.4217 | 900  | 0.0567 | 0.8164 | 0.8209 | 0.8233 | 0.8286 | 0.8229 | 0.8285 | 0.7461 | 0.7445 |
| 0.3132        | 0.4686 | 1000 | 0.0541 | 0.8214 | 0.8266 | 0.8214 | 0.8282 | 0.8205 | 0.8277 | 0.7568 | 0.7562 |
| 0.3258        | 0.5155 | 1100 | 0.0605 | 0.8104 | 0.8165 | 0.8166 | 0.8232 | 0.8162 | 0.8231 | 0.7357 | 0.7310 |
| 0.3566        | 0.5623 | 1200 | 0.0541 | 0.8126 | 0.8195 | 0.8142 | 0.8205 | 0.8132 | 0.8195 | 0.7469 | 0.7424 |
| 0.2999        | 0.6092 | 1300 | 0.0474 | 0.8244 | 0.8290 | 0.8228 | 0.8289 | 0.8216 | 0.8284 | 0.7661 | 0.7629 |
| 0.2793        | 0.6560 | 1400 | 0.0471 | 0.8212 | 0.8265 | 0.8201 | 0.8264 | 0.8187 | 0.8256 | 0.7625 | 0.7615 |
| 0.3287        | 0.7029 | 1500 | 0.0523 | 0.8238 | 0.8296 | 0.8193 | 0.8276 | 0.8181 | 0.8266 | 0.7419 | 0.7435 |
| 0.3227        | 0.7498 | 1600 | 0.0504 | 0.8223 | 0.8279 | 0.8180 | 0.8252 | 0.8172 | 0.8244 | 0.7568 | 0.7556 |
| 0.3217        | 0.7966 | 1700 | 0.0516 | 0.8194 | 0.8249 | 0.8182 | 0.8243 | 0.8169 | 0.8233 | 0.7497 | 0.7466 |
| 0.2344        | 0.8435 | 1800 | 0.0449 | 0.8292 | 0.8331 | 0.8188 | 0.8258 | 0.8174 | 0.8244 | 0.7711 | 0.7723 |
| 0.2974        | 0.8903 | 1900 | 0.0502 | 0.8223 | 0.8270 | 0.8133 | 0.8208 | 0.8125 | 0.8199 | 0.7658 | 0.7662 |
| 0.3285        | 0.9372 | 2000 | 0.0574 | 0.8144 | 0.8209 | 0.8112 | 0.8191 | 0.8105 | 0.8178 | 0.7339 | 0.7302 |
| 0.2791        | 0.9841 | 2100 | 0.0479 | 0.8211 | 0.8250 | 0.8175 | 0.8237 | 0.8165 | 0.8229 | 0.7503 | 0.7507 |
| 0.1703        | 1.0309 | 2200 | 0.0359 | 0.8254 | 0.8256 | 0.8156 | 0.8203 | 0.8143 | 0.8194 | 0.7736 | 0.7731 |
| 0.1991        | 1.0778 | 2300 | 0.0362 | 0.8266 | 0.8265 | 0.8119 | 0.8186 | 0.8107 | 0.8177 | 0.7657 | 0.7682 |
| 0.2088        | 1.1246 | 2400 | 0.0379 | 0.8224 | 0.8243 | 0.8158 | 0.8232 | 0.8148 | 0.8222 | 0.7539 | 0.7536 |
| 0.2007        | 1.1715 | 2500 | 0.0336 | 0.8289 | 0.8304 | 0.8124 | 0.8206 | 0.8108 | 0.8195 | 0.7759 | 0.7778 |
| 0.1828        | 1.2184 | 2600 | 0.0356 | 0.8246 | 0.8266 | 0.8162 | 0.8217 | 0.8154 | 0.8215 | 0.7684 | 0.7674 |
| 0.2069        | 1.2652 | 2700 | 0.0368 | 0.8171 | 0.8196 | 0.8128 | 0.8187 | 0.8116 | 0.8179 | 0.7549 | 0.7544 |
| 0.1957        | 1.3121 | 2800 | 0.0398 | 0.8185 | 0.8216 | 0.8168 | 0.8240 | 0.8160 | 0.8234 | 0.7474 | 0.7459 |
| 0.1917        | 1.3590 | 2900 | 0.0355 | 0.8240 | 0.8256 | 0.8125 | 0.8199 | 0.8108 | 0.8186 | 0.7592 | 0.7607 |
| 0.1944        | 1.4058 | 3000 | 0.0355 | 0.8271 | 0.8292 | 0.8163 | 0.8243 | 0.8148 | 0.8230 | 0.7621 | 0.7643 |
| 0.1777        | 1.4527 | 3100 | 0.0360 | 0.8219 | 0.8227 | 0.8169 | 0.8232 | 0.8154 | 0.8221 | 0.7545 | 0.7557 |
| 0.1816        | 1.4995 | 3200 | 0.0364 | 0.8213 | 0.8228 | 0.8185 | 0.8247 | 0.8169 | 0.8237 | 0.7616 | 0.7590 |
| 0.229         | 1.5464 | 3300 | 0.0396 | 0.8169 | 0.8199 | 0.8177 | 0.8241 | 0.8165 | 0.8235 | 0.7529 | 0.7498 |
| 0.1742        | 1.5933 | 3400 | 0.0345 | 0.8245 | 0.8252 | 0.8185 | 0.8253 | 0.8169 | 0.8243 | 0.7647 | 0.7634 |
| 0.1606        | 1.6401 | 3500 | 0.0345 | 0.8219 | 0.8230 | 0.8146 | 0.8223 | 0.8128 | 0.8213 | 0.7629 | 0.7622 |
| 0.1982        | 1.6870 | 3600 | 0.0380 | 0.8220 | 0.8233 | 0.8196 | 0.8257 | 0.8182 | 0.8249 | 0.7552 | 0.7535 |
| 0.1824        | 1.7338 | 3700 | 0.0352 | 0.8246 | 0.8252 | 0.8181 | 0.8242 | 0.8166 | 0.8233 | 0.7567 | 0.7554 |
| 0.2009        | 1.7807 | 3800 | 0.0358 | 0.8270 | 0.8278 | 0.8105 | 0.8181 | 0.8090 | 0.8164 | 0.7669 | 0.7655 |
| 0.1899        | 1.8276 | 3900 | 0.0385 | 0.8240 | 0.8252 | 0.8133 | 0.8202 | 0.8111 | 0.8180 | 0.7418 | 0.7383 |
| 0.1858        | 1.8744 | 4000 | 0.0337 | 0.8281 | 0.8274 | 0.8122 | 0.8198 | 0.8102 | 0.8180 | 0.7620 | 0.7590 |
| 0.1679        | 1.9213 | 4100 | 0.0349 | 0.8238 | 0.8249 | 0.8109 | 0.8200 | 0.8097 | 0.8187 | 0.7561 | 0.7551 |
| 0.1699        | 1.9681 | 4200 | 0.0355 | 0.8274 | 0.8298 | 0.8125 | 0.8227 | 0.8113 | 0.8215 | 0.7647 | 0.7648 |

### Framework versions

- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0