---
library_name: transformers
license: apache-2.0
base_model: x2bee/KoModernBERT-base-mlm
tags:
- generated_from_trainer
model-index:
- name: KMB_SimCSE
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# KMB_SimCSE

This model is a fine-tuned version of [x2bee/KoModernBERT-base-mlm](https://huggingface.co/x2bee/KoModernBERT-base-mlm) on an unknown dataset.
It achieves the following results on the evaluation set at the final logged evaluation (step 21250, epoch 9.9578):
- Loss: 0.0387
- Pearson Cosine: 0.7824
- Spearman Cosine: 0.7845
- Pearson Manhattan: 0.7335
- Spearman Manhattan: 0.7460
- Pearson Euclidean: 0.7337
- Spearman Euclidean: 0.7463
- Pearson Dot: 0.6362
- Spearman Dot: 0.6532
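
The card does not say how these metrics were computed. A minimal sketch, assuming they follow the common convention for sentence-similarity evaluation (the correlation between gold similarity labels and a similarity score computed on each pair of sentence embeddings), is shown below for the cosine case. The toy vectors and gold labels are made up for illustration and are not taken from the evaluation set.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical embedding pairs and gold similarity labels (illustration only).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.2]),
]
gold = [4.8, 3.0, 1.5, 0.2]

predicted = [cosine(u, v) for u, v in pairs]
pearson_cosine = pearson(predicted, gold)
spearman_cosine = spearman(predicted, gold)
```

Under the same assumption, the Manhattan, Euclidean, and Dot variants would swap `cosine` for a negated distance or a plain dot product and leave the correlation step unchanged.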

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
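
How the batch-size and scheduler settings above fit together can be sketched in plain Python. This is an illustration under stated assumptions, not the training code: the listed values multiply out as 16 × 8 = 128, so the process count defaults to 1 here, and the total of 21340 optimizer steps is an estimate inferred from the results table (about 2134 steps per epoch over 10 epochs). The function names are hypothetical.

```python
def effective_batch_size(per_step_batch, accumulation_steps, num_processes=1):
    """Examples consumed per optimizer update."""
    return per_step_batch * accumulation_steps * num_processes


def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 1e-05                   # learning_rate from the list above
TOTAL_STEPS = 21340               # assumption: estimated from the results table
WARMUP_STEPS = TOTAL_STEPS // 10  # lr_scheduler_warmup_ratio of 0.1

total_batch = effective_batch_size(16, 8)  # train_batch_size x gradient_accumulation_steps
lr_at_start = linear_schedule_lr(0, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
lr_at_peak = linear_schedule_lr(WARMUP_STEPS, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
lr_at_end = linear_schedule_lr(TOTAL_STEPS, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
```

With these assumed step counts, the learning rate would peak at 1e-05 around step 2134 and decay linearly to zero by the end of training.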

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|:-------------:|:------:|:-----:|:---------------:|:--------------:|:---------------:|:-----------------:|:------------------:|:-----------------:|:------------------:|:-----------:|:------------:|
| 1.0084        | 0.1172 | 250   | 0.1579          | 0.6838         | 0.6994          | 0.6615            | 0.6693             | 0.6621            | 0.6694             | 0.3480      | 0.3442       |
| 0.7072        | 0.2343 | 500   | 0.1364          | 0.7226         | 0.7375          | 0.7207            | 0.7263             | 0.7214            | 0.7271             | 0.4002      | 0.3910       |
| 0.6207        | 0.3515 | 750   | 0.1194          | 0.7371         | 0.7509          | 0.7295            | 0.7398             | 0.7300            | 0.7401             | 0.4517      | 0.4462       |
| 0.5767        | 0.4686 | 1000  | 0.1147          | 0.7508         | 0.7636          | 0.7395            | 0.7502             | 0.7400            | 0.7511             | 0.5170      | 0.5181       |
| 0.5026        | 0.5858 | 1250  | 0.1047          | 0.7507         | 0.7635          | 0.7455            | 0.7558             | 0.7459            | 0.7564             | 0.5487      | 0.5531       |
| 0.5192        | 0.7029 | 1500  | 0.1166          | 0.7522         | 0.7673          | 0.7487            | 0.7591             | 0.7489            | 0.7594             | 0.5055      | 0.5053       |
| 0.5046        | 0.8201 | 1750  | 0.1110          | 0.7555         | 0.7675          | 0.7582            | 0.7675             | 0.7581            | 0.7672             | 0.5303      | 0.5391       |
| 0.5055        | 0.9372 | 2000  | 0.1062          | 0.7546         | 0.7726          | 0.7501            | 0.7650             | 0.7502            | 0.7651             | 0.5638      | 0.5710       |
| 0.4177        | 1.0544 | 2250  | 0.0942          | 0.7577         | 0.7709          | 0.7511            | 0.7635             | 0.7510            | 0.7633             | 0.5577      | 0.5635       |
| 0.4136        | 1.1715 | 2500  | 0.0915          | 0.7612         | 0.7727          | 0.7584            | 0.7696             | 0.7586            | 0.7696             | 0.5554      | 0.5595       |
| 0.4425        | 1.2887 | 2750  | 0.0928          | 0.7605         | 0.7726          | 0.7461            | 0.7591             | 0.7463            | 0.7592             | 0.5498      | 0.5512       |
| 0.3708        | 1.4058 | 3000  | 0.0819          | 0.7670         | 0.7783          | 0.7478            | 0.7634             | 0.7481            | 0.7637             | 0.5834      | 0.5847       |
| 0.3934        | 1.5230 | 3250  | 0.0848          | 0.7709         | 0.7814          | 0.7539            | 0.7692             | 0.7542            | 0.7689             | 0.5655      | 0.5668       |
| 0.3203        | 1.6401 | 3500  | 0.0781          | 0.7706         | 0.7810          | 0.7529            | 0.7689             | 0.7531            | 0.7691             | 0.5871      | 0.5891       |
| 0.4052        | 1.7573 | 3750  | 0.0824          | 0.7705         | 0.7816          | 0.7628            | 0.7771             | 0.7628            | 0.7771             | 0.5909      | 0.5989       |
| 0.3723        | 1.8744 | 4000  | 0.0819          | 0.7720         | 0.7840          | 0.7515            | 0.7679             | 0.7520            | 0.7685             | 0.5711      | 0.5713       |
| 0.3645        | 1.9916 | 4250  | 0.0802          | 0.7676         | 0.7804          | 0.7560            | 0.7704             | 0.7560            | 0.7703             | 0.5685      | 0.5701       |
| 0.3007        | 2.1087 | 4500  | 0.0662          | 0.7682         | 0.7799          | 0.7572            | 0.7721             | 0.7574            | 0.7721             | 0.5973      | 0.5981       |
| 0.2397        | 2.2259 | 4750  | 0.0617          | 0.7693         | 0.7782          | 0.7501            | 0.7655             | 0.7502            | 0.7652             | 0.5855      | 0.5898       |
| 0.28          | 2.3430 | 5000  | 0.0645          | 0.7654         | 0.7760          | 0.7567            | 0.7705             | 0.7569            | 0.7705             | 0.5925      | 0.5970       |
| 0.2631        | 2.4602 | 5250  | 0.0639          | 0.7712         | 0.7798          | 0.7561            | 0.7705             | 0.7562            | 0.7705             | 0.5715      | 0.5731       |
| 0.2488        | 2.5773 | 5500  | 0.0636          | 0.7736         | 0.7838          | 0.7537            | 0.7687             | 0.7538            | 0.7685             | 0.5835      | 0.5861       |
| 0.2557        | 2.6945 | 5750  | 0.0614          | 0.7739         | 0.7830          | 0.7570            | 0.7716             | 0.7571            | 0.7717             | 0.6008      | 0.6041       |
| 0.2699        | 2.8116 | 6000  | 0.0636          | 0.7722         | 0.7795          | 0.7570            | 0.7699             | 0.7572            | 0.7701             | 0.5844      | 0.5864       |
| 0.2794        | 2.9288 | 6250  | 0.0639          | 0.7704         | 0.7800          | 0.7582            | 0.7745             | 0.7581            | 0.7746             | 0.5817      | 0.5793       |
| 0.1778        | 3.0459 | 6500  | 0.0526          | 0.7738         | 0.7811          | 0.7574            | 0.7739             | 0.7573            | 0.7739             | 0.6193      | 0.6255       |
| 0.1791        | 3.1631 | 6750  | 0.0519          | 0.7728         | 0.7783          | 0.7540            | 0.7704             | 0.7538            | 0.7700             | 0.6116      | 0.6182       |
| 0.201         | 3.2802 | 7000  | 0.0511          | 0.7755         | 0.7825          | 0.7506            | 0.7671             | 0.7503            | 0.7670             | 0.6039      | 0.6071       |
| 0.225         | 3.3974 | 7250  | 0.0513          | 0.7684         | 0.7749          | 0.7515            | 0.7689             | 0.7514            | 0.7692             | 0.5867      | 0.5894       |
| 0.1748        | 3.5145 | 7500  | 0.0502          | 0.7752         | 0.7801          | 0.7459            | 0.7630             | 0.7461            | 0.7636             | 0.5877      | 0.5949       |
| 0.2045        | 3.6317 | 7750  | 0.0512          | 0.7787         | 0.7856          | 0.7457            | 0.7636             | 0.7460            | 0.7642             | 0.6113      | 0.6156       |
| 0.1821        | 3.7488 | 8000  | 0.0502          | 0.7782         | 0.7842          | 0.7543            | 0.7707             | 0.7545            | 0.7710             | 0.6045      | 0.6069       |
| 0.1783        | 3.8660 | 8250  | 0.0491          | 0.7772         | 0.7829          | 0.7455            | 0.7630             | 0.7459            | 0.7637             | 0.5915      | 0.5984       |
| 0.2055        | 3.9831 | 8500  | 0.0504          | 0.7776         | 0.7832          | 0.7476            | 0.7658             | 0.7480            | 0.7662             | 0.5959      | 0.6017       |
| 0.1345        | 4.1003 | 8750  | 0.0467          | 0.7762         | 0.7802          | 0.7429            | 0.7606             | 0.7435            | 0.7611             | 0.6206      | 0.6303       |
| 0.1506        | 4.2174 | 9000  | 0.0477          | 0.7711         | 0.7759          | 0.7466            | 0.7625             | 0.7473            | 0.7631             | 0.5978      | 0.6025       |
| 0.1565        | 4.3346 | 9250  | 0.0477          | 0.7717         | 0.7768          | 0.7481            | 0.7641             | 0.7486            | 0.7645             | 0.6026      | 0.6102       |
| 0.1577        | 4.4517 | 9500  | 0.0442          | 0.7794         | 0.7824          | 0.7439            | 0.7627             | 0.7444            | 0.7630             | 0.6182      | 0.6291       |
| 0.1463        | 4.5689 | 9750  | 0.0456          | 0.7764         | 0.7821          | 0.7401            | 0.7602             | 0.7405            | 0.7604             | 0.5941      | 0.5991       |
| 0.16          | 4.6860 | 10000 | 0.0460          | 0.7749         | 0.7793          | 0.7495            | 0.7658             | 0.7498            | 0.7660             | 0.6140      | 0.6192       |
| 0.148         | 4.8032 | 10250 | 0.0436          | 0.7817         | 0.7855          | 0.7421            | 0.7596             | 0.7425            | 0.7601             | 0.6171      | 0.6239       |
| 0.1382        | 4.9203 | 10500 | 0.0446          | 0.7824         | 0.7872          | 0.7437            | 0.7620             | 0.7443            | 0.7625             | 0.6330      | 0.6424       |
| 0.1109        | 5.0375 | 10750 | 0.0426          | 0.7796         | 0.7846          | 0.7431            | 0.7600             | 0.7434            | 0.7602             | 0.6195      | 0.6249       |
| 0.1009        | 5.1546 | 11000 | 0.0431          | 0.7807         | 0.7835          | 0.7423            | 0.7591             | 0.7428            | 0.7591             | 0.6237      | 0.6377       |
| 0.1082        | 5.2718 | 11250 | 0.0438          | 0.7774         | 0.7818          | 0.7430            | 0.7591             | 0.7433            | 0.7593             | 0.6039      | 0.6129       |
| 0.1138        | 5.3889 | 11500 | 0.0415          | 0.7829         | 0.7870          | 0.7405            | 0.7560             | 0.7410            | 0.7561             | 0.6347      | 0.6464       |
| 0.1015        | 5.5061 | 11750 | 0.0420          | 0.7778         | 0.7811          | 0.7437            | 0.7592             | 0.7435            | 0.7589             | 0.6249      | 0.6370       |
| 0.1153        | 5.6232 | 12000 | 0.0448          | 0.7730         | 0.7784          | 0.7451            | 0.7598             | 0.7453            | 0.7596             | 0.6141      | 0.6214       |
| 0.1269        | 5.7404 | 12250 | 0.0420          | 0.7802         | 0.7840          | 0.7413            | 0.7562             | 0.7417            | 0.7564             | 0.6217      | 0.6311       |
| 0.0888        | 5.8575 | 12500 | 0.0414          | 0.7805         | 0.7841          | 0.7408            | 0.7567             | 0.7412            | 0.7568             | 0.6245      | 0.6365       |
| 0.1202        | 5.9747 | 12750 | 0.0431          | 0.7793         | 0.7835          | 0.7412            | 0.7572             | 0.7414            | 0.7575             | 0.6261      | 0.6405       |
| 0.0941        | 6.0918 | 13000 | 0.0399          | 0.7838         | 0.7873          | 0.7388            | 0.7527             | 0.7391            | 0.7530             | 0.6493      | 0.6642       |
| 0.081         | 6.2090 | 13250 | 0.0405          | 0.7814         | 0.7854          | 0.7353            | 0.7513             | 0.7355            | 0.7514             | 0.6356      | 0.6478       |
| 0.0807        | 6.3261 | 13500 | 0.0401          | 0.7838         | 0.7879          | 0.7339            | 0.7510             | 0.7344            | 0.7513             | 0.6450      | 0.6615       |
| 0.0863        | 6.4433 | 13750 | 0.0405          | 0.7814         | 0.7841          | 0.7404            | 0.7587             | 0.7408            | 0.7589             | 0.6324      | 0.6479       |
| 0.0948        | 6.5604 | 14000 | 0.0397          | 0.7830         | 0.7866          | 0.7410            | 0.7578             | 0.7415            | 0.7579             | 0.6308      | 0.6460       |
| 0.0919        | 6.6776 | 14250 | 0.0409          | 0.7820         | 0.7858          | 0.7402            | 0.7545             | 0.7403            | 0.7544             | 0.6341      | 0.6459       |
| 0.0784        | 6.7948 | 14500 | 0.0408          | 0.7794         | 0.7839          | 0.7308            | 0.7495             | 0.7312            | 0.7494             | 0.6306      | 0.6427       |
| 0.0821        | 6.9119 | 14750 | 0.0406          | 0.7789         | 0.7822          | 0.7265            | 0.7446             | 0.7270            | 0.7446             | 0.6377      | 0.6567       |
| 0.0792        | 7.0291 | 15000 | 0.0401          | 0.7800         | 0.7833          | 0.7398            | 0.7569             | 0.7405            | 0.7572             | 0.6338      | 0.6467       |
| 0.0698        | 7.1462 | 15250 | 0.0396          | 0.7822         | 0.7855          | 0.7341            | 0.7507             | 0.7346            | 0.7509             | 0.6381      | 0.6552       |
| 0.0699        | 7.2634 | 15500 | 0.0392          | 0.7820         | 0.7851          | 0.7322            | 0.7502             | 0.7325            | 0.7502             | 0.6466      | 0.6629       |
| 0.0739        | 7.3805 | 15750 | 0.0389          | 0.7865         | 0.7886          | 0.7323            | 0.7491             | 0.7328            | 0.7495             | 0.6412      | 0.6589       |
| 0.0745        | 7.4977 | 16000 | 0.0397          | 0.7794         | 0.7827          | 0.7366            | 0.7524             | 0.7373            | 0.7524             | 0.6380      | 0.6504       |
| 0.0779        | 7.6148 | 16250 | 0.0391          | 0.7826         | 0.7846          | 0.7326            | 0.7462             | 0.7333            | 0.7467             | 0.6372      | 0.6532       |
| 0.078         | 7.7320 | 16500 | 0.0397          | 0.7810         | 0.7826          | 0.7299            | 0.7461             | 0.7300            | 0.7457             | 0.6364      | 0.6555       |
| 0.0699        | 7.8491 | 16750 | 0.0405          | 0.7811         | 0.7837          | 0.7308            | 0.7468             | 0.7312            | 0.7470             | 0.6315      | 0.6426       |
| 0.0735        | 7.9663 | 17000 | 0.0394          | 0.7804         | 0.7823          | 0.7320            | 0.7455             | 0.7326            | 0.7462             | 0.6468      | 0.6607       |
| 0.0682        | 8.0834 | 17250 | 0.0386          | 0.7845         | 0.7869          | 0.7306            | 0.7447             | 0.7311            | 0.7449             | 0.6431      | 0.6613       |
| 0.0526        | 8.2006 | 17500 | 0.0389          | 0.7824         | 0.7832          | 0.7272            | 0.7431             | 0.7275            | 0.7431             | 0.6370      | 0.6539       |
| 0.0558        | 8.3177 | 17750 | 0.0385          | 0.7856         | 0.7865          | 0.7370            | 0.7513             | 0.7376            | 0.7518             | 0.6517      | 0.6679       |
| 0.0633        | 8.4349 | 18000 | 0.0392          | 0.7822         | 0.7845          | 0.7388            | 0.7537             | 0.7395            | 0.7542             | 0.6512      | 0.6664       |
| 0.0568        | 8.5520 | 18250 | 0.0389          | 0.7826         | 0.7831          | 0.7358            | 0.7510             | 0.7362            | 0.7509             | 0.6378      | 0.6536       |
| 0.0645        | 8.6692 | 18500 | 0.0377          | 0.7888         | 0.7892          | 0.7315            | 0.7495             | 0.7319            | 0.7499             | 0.6514      | 0.6704       |
| 0.0563        | 8.7863 | 18750 | 0.0376          | 0.7870         | 0.7878          | 0.7285            | 0.7451             | 0.7289            | 0.7454             | 0.6393      | 0.6606       |
| 0.0669        | 8.9035 | 19000 | 0.0383          | 0.7850         | 0.7866          | 0.7238            | 0.7433             | 0.7244            | 0.7437             | 0.6359      | 0.6571       |
| 0.0436        | 9.0206 | 19250 | 0.0377          | 0.7855         | 0.7856          | 0.7289            | 0.7462             | 0.7293            | 0.7465             | 0.6489      | 0.6696       |
| 0.047         | 9.1378 | 19500 | 0.0377          | 0.7870         | 0.7882          | 0.7249            | 0.7414             | 0.7254            | 0.7413             | 0.6459      | 0.6694       |
| 0.0482        | 9.2549 | 19750 | 0.0377          | 0.7863         | 0.7871          | 0.7296            | 0.7442             | 0.7306            | 0.7449             | 0.6498      | 0.6690       |
| 0.0529        | 9.3721 | 20000 | 0.0377          | 0.7873         | 0.7888          | 0.7285            | 0.7423             | 0.7290            | 0.7426             | 0.6490      | 0.6690       |
| 0.0429        | 9.4892 | 20250 | 0.0378          | 0.7868         | 0.7883          | 0.7286            | 0.7426             | 0.7292            | 0.7431             | 0.6503      | 0.6684       |
| 0.0534        | 9.6064 | 20500 | 0.0380          | 0.7861         | 0.7881          | 0.7300            | 0.7443             | 0.7305            | 0.7451             | 0.6446      | 0.6635       |
| 0.0531        | 9.7235 | 20750 | 0.0375          | 0.7886         | 0.7894          | 0.7350            | 0.7492             | 0.7356            | 0.7498             | 0.6442      | 0.6634       |
| 0.0464        | 9.8407 | 21000 | 0.0380          | 0.7861         | 0.7871          | 0.7314            | 0.7464             | 0.7320            | 0.7468             | 0.6415      | 0.6600       |
| 0.0406        | 9.9578 | 21250 | 0.0387          | 0.7824         | 0.7845          | 0.7335            | 0.7460             | 0.7337            | 0.7463             | 0.6362      | 0.6532       |
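
A log like this can be scanned programmatically to compare checkpoints. The sketch below parses a Markdown table and picks the best row for a chosen metric. The helper names are hypothetical, and the three sample rows are copied from the table above with only a subset of its columns.

```python
def parse_markdown_table(text):
    """Parse a pipe-delimited Markdown table into a list of dicts of floats."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the alignment row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append({name: float(value) for name, value in zip(header, cells)})
    return rows


def best_row(rows, metric, higher_is_better=True):
    """Return the row with the best value for the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])


sample = """
| Training Loss | Epoch  | Step  | Validation Loss | Spearman Cosine |
|:-------------:|:------:|:-----:|:---------------:|:---------------:|
| 0.0563        | 8.7863 | 18750 | 0.0376          | 0.7878          |
| 0.0531        | 9.7235 | 20750 | 0.0375          | 0.7894          |
| 0.0406        | 9.9578 | 21250 | 0.0387          | 0.7845          |
"""

rows = parse_markdown_table(sample)
best_by_spearman = best_row(rows, "Spearman Cosine")
best_by_loss = best_row(rows, "Validation Loss", higher_is_better=False)
```

Among these sample rows, step 20750 has both the lowest validation loss and the highest Spearman cosine, slightly ahead of the final step.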


### Framework versions

- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0