---
library_name: transformers
license: apache-2.0
base_model: CocoRoF/KoModernBERT-chp-11
tags:
- generated_from_trainer
model-index:
- name: KMB_SimCSE_test
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# KMB_SimCSE_test

This model is a fine-tuned version of [CocoRoF/KoModernBERT-chp-11](https://huggingface.co/CocoRoF/KoModernBERT-chp-11) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0438
- Pearson Cosine: 0.7947
- Spearman Cosine: 0.7992
- Pearson Manhattan: 0.7493
- Spearman Manhattan: 0.7655
- Pearson Euclidean: 0.7507
- Spearman Euclidean: 0.7666
- Pearson Dot: 0.6408
- Spearman Dot: 0.6472
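
The card does not say how these correlations were computed. A minimal sketch of one common convention for semantic textual similarity evaluation is shown below, assuming each sentence pair is scored with cosine similarity, negative Manhattan distance, negative Euclidean distance, or dot product, and the scores are then correlated with gold labels. The function names (`sts_report`, `ranks`, and so on) are hypothetical and are not taken from the training code.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def neg_manhattan(u, v):
    # Distances are negated so that a larger score means "more similar".
    return -sum(abs(a - b) for a, b in zip(u, v))

def neg_euclidean(u, v):
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    # 1-based ranks; tied values share the average of their positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))

def sts_report(emb_a, emb_b, gold):
    """Correlate four similarity scores with gold labels (assumed protocol)."""
    scorers = {
        "cosine": cosine,
        "manhattan": neg_manhattan,
        "euclidean": neg_euclidean,
        "dot": dot,
    }
    report = {}
    for name, scorer in scorers.items():
        predicted = [scorer(u, v) for u, v in zip(emb_a, emb_b)]
        report[f"pearson_{name}"] = pearson(predicted, gold)
        report[f"spearman_{name}"] = spearman(predicted, gold)
    return report

# Toy embeddings and gold scores, invented purely for illustration.
toy_a = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
toy_b = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
toy_gold = [5.0, 2.5, 0.0]
toy_report = sts_report(toy_a, toy_b, toy_gold)
```

Under this reading, the gap between the cosine and dot-product scores in the list above would reflect the dot product's sensitivity to embedding norm, which cosine similarity normalizes away.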

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4.0
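
The relationship between these values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Trainer's own code. The device count defaults to 1 because 16 × 8 already equals the reported 128 and the card lists no device count. The total step count of 8,536 is an estimate inferred from the results table (step 8500 at epoch 3.9831 gives roughly 2,134 steps per epoch). The schedule follows the usual linear warmup-then-decay shape, and the function names are hypothetical.

```python
import math

def effective_batch_size(per_device, grad_accum, num_devices=1):
    # num_devices=1 is an assumption; the card does not list a device count.
    return per_device * grad_accum * num_devices

def linear_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Estimated from the results table, not stated in the card.
ESTIMATED_TOTAL_STEPS = 8536

total_batch = effective_batch_size(per_device=16, grad_accum=8)
warmup_end = math.ceil(ESTIMATED_TOTAL_STEPS * 0.1)
lr_at_start = linear_lr(0, ESTIMATED_TOTAL_STEPS)
lr_at_peak = linear_lr(warmup_end, ESTIMATED_TOTAL_STEPS)
lr_at_end = linear_lr(ESTIMATED_TOTAL_STEPS, ESTIMATED_TOTAL_STEPS)
```

With these estimates, the learning rate would reach its peak of 2e-05 after roughly the first tenth of training and decay to zero by the final step.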

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:---------------:|:-----------------:|:------------------:|:-----------------:|:------------------:|:-----------:|:------------:|
| 0.761         | 0.1172 | 250  | 0.1397          | 0.7191         | 0.7366          | 0.7129            | 0.7205             | 0.7135            | 0.7210             | 0.4342      | 0.4302       |
| 0.6275        | 0.2343 | 500  | 0.1240          | 0.7535         | 0.7638          | 0.7442            | 0.7505             | 0.7442            | 0.7506             | 0.4527      | 0.4533       |
| 0.5326        | 0.3515 | 750  | 0.1149          | 0.7540         | 0.7698          | 0.7320            | 0.7461             | 0.7327            | 0.7466             | 0.4786      | 0.4737       |
| 0.4917        | 0.4686 | 1000 | 0.1028          | 0.7630         | 0.7778          | 0.7395            | 0.7532             | 0.7395            | 0.7531             | 0.5428      | 0.5404       |
| 0.4451        | 0.5858 | 1250 | 0.0959          | 0.7634         | 0.7803          | 0.7505            | 0.7649             | 0.7508            | 0.7652             | 0.5909      | 0.5929       |
| 0.4682        | 0.7029 | 1500 | 0.1057          | 0.7687         | 0.7855          | 0.7541            | 0.7681             | 0.7545            | 0.7685             | 0.5271      | 0.5190       |
| 0.4489        | 0.8201 | 1750 | 0.0994          | 0.7658         | 0.7800          | 0.7505            | 0.7624             | 0.7514            | 0.7627             | 0.5765      | 0.5760       |
| 0.4696        | 0.9372 | 2000 | 0.1055          | 0.7618         | 0.7835          | 0.7514            | 0.7669             | 0.7526            | 0.7675             | 0.5910      | 0.5835       |
| 0.3474        | 1.0544 | 2250 | 0.0818          | 0.7663         | 0.7777          | 0.7527            | 0.7636             | 0.7536            | 0.7642             | 0.5774      | 0.5748       |
| 0.319         | 1.1715 | 2500 | 0.0752          | 0.7753         | 0.7858          | 0.7589            | 0.7692             | 0.7592            | 0.7692             | 0.5929      | 0.5919       |
| 0.3682        | 1.2887 | 2750 | 0.0767          | 0.7736         | 0.7851          | 0.7556            | 0.7667             | 0.7564            | 0.7671             | 0.5784      | 0.5785       |
| 0.3033        | 1.4058 | 3000 | 0.0716          | 0.7836         | 0.7962          | 0.7590            | 0.7723             | 0.7600            | 0.7727             | 0.5987      | 0.5976       |
| 0.3247        | 1.5230 | 3250 | 0.0768          | 0.7779         | 0.7911          | 0.7613            | 0.7731             | 0.7621            | 0.7735             | 0.5638      | 0.5623       |
| 0.26          | 1.6401 | 3500 | 0.0686          | 0.7792         | 0.7902          | 0.7615            | 0.7733             | 0.7623            | 0.7734             | 0.6004      | 0.5998       |
| 0.3216        | 1.7573 | 3750 | 0.0707          | 0.7851         | 0.7950          | 0.7668            | 0.7787             | 0.7677            | 0.7791             | 0.6098      | 0.6136       |
| 0.3166        | 1.8744 | 4000 | 0.0719          | 0.7799         | 0.7911          | 0.7550            | 0.7693             | 0.7563            | 0.7701             | 0.5737      | 0.5754       |
| 0.315         | 1.9916 | 4250 | 0.0710          | 0.7818         | 0.7925          | 0.7657            | 0.7780             | 0.7672            | 0.7790             | 0.5918      | 0.5930       |
| 0.2117        | 2.1087 | 4500 | 0.0545          | 0.7772         | 0.7890          | 0.7551            | 0.7702             | 0.7567            | 0.7712             | 0.6059      | 0.6096       |
| 0.1725        | 2.2259 | 4750 | 0.0544          | 0.7780         | 0.7868          | 0.7593            | 0.7714             | 0.7605            | 0.7721             | 0.6065      | 0.6128       |
| 0.1985        | 2.3430 | 5000 | 0.0540          | 0.7818         | 0.7916          | 0.7621            | 0.7733             | 0.7626            | 0.7734             | 0.6017      | 0.6078       |
| 0.1871        | 2.4602 | 5250 | 0.0527          | 0.7830         | 0.7898          | 0.7576            | 0.7718             | 0.7587            | 0.7724             | 0.5843      | 0.5894       |
| 0.17          | 2.5773 | 5500 | 0.0521          | 0.7877         | 0.7959          | 0.7621            | 0.7746             | 0.7633            | 0.7753             | 0.6240      | 0.6246       |
| 0.174         | 2.6945 | 5750 | 0.0528          | 0.7876         | 0.7949          | 0.7594            | 0.7713             | 0.7603            | 0.7716             | 0.6196      | 0.6234       |
| 0.1896        | 2.8116 | 6000 | 0.0506          | 0.7848         | 0.7891          | 0.7595            | 0.7712             | 0.7606            | 0.7718             | 0.6052      | 0.6083       |
| 0.1897        | 2.9288 | 6250 | 0.0549          | 0.7819         | 0.7902          | 0.7521            | 0.7664             | 0.7533            | 0.7667             | 0.5957      | 0.5981       |
| 0.105         | 3.0459 | 6500 | 0.0450          | 0.7887         | 0.7931          | 0.7516            | 0.7669             | 0.7527            | 0.7675             | 0.6385      | 0.6450       |
| 0.1055        | 3.1631 | 6750 | 0.0460          | 0.7875         | 0.7927          | 0.7515            | 0.7652             | 0.7525            | 0.7657             | 0.6256      | 0.6332       |
| 0.1145        | 3.2802 | 7000 | 0.0453          | 0.7925         | 0.7977          | 0.7548            | 0.7671             | 0.7559            | 0.7678             | 0.6316      | 0.6408       |
| 0.1252        | 3.3974 | 7250 | 0.0470          | 0.7889         | 0.7947          | 0.7561            | 0.7683             | 0.7571            | 0.7693             | 0.6257      | 0.6283       |
| 0.1058        | 3.5145 | 7500 | 0.0446          | 0.7913         | 0.7958          | 0.7572            | 0.7714             | 0.7578            | 0.7715             | 0.6221      | 0.6338       |
| 0.1144        | 3.6317 | 7750 | 0.0433          | 0.7939         | 0.7989          | 0.7534            | 0.7673             | 0.7542            | 0.7677             | 0.6519      | 0.6583       |
| 0.0971        | 3.7488 | 8000 | 0.0438          | 0.7952         | 0.7993          | 0.7537            | 0.7675             | 0.7547            | 0.7679             | 0.6345      | 0.6383       |
| 0.1107        | 3.8660 | 8250 | 0.0432          | 0.7953         | 0.7992          | 0.7507            | 0.7673             | 0.7518            | 0.7675             | 0.6355      | 0.6411       |
| 0.1232        | 3.9831 | 8500 | 0.0438          | 0.7947         | 0.7992          | 0.7493            | 0.7655             | 0.7507            | 0.7666             | 0.6408      | 0.6472       |


### Framework versions

- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0