CocoRoF committed on
Commit 5dba4b2 · verified · 1 Parent(s): 221e5a3

CocoRoF/ModernBERT-SimCSE-multitask_v04

Files changed (3)
  1. 2_Dense/model.safetensors +1 -1
  2. README.md +76 -76
  3. model.safetensors +1 -1
2_Dense/model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:952a6c22e6fd47eb3c9872be6da5ff1152332bd8f6c51082eed8e3eb73962f49
+ oid sha256:31635e07aba0bf9ff1e49bb5cec91388f57ad0a789dbc32c0b7987315304f442
  size 2362528
README.md CHANGED
@@ -6,7 +6,7 @@ tags:
  - generated_from_trainer
  - dataset_size:5749
  - loss:CosineSimilarityLoss
- base_model: CocoRoF/ModernBERT-SimCSE_v02
+ base_model: CocoRoF/ModernBERT-SimCSE_v04
  widget:
  - source_sentence: 우리는 움직이는 동행 우주 정지 좌표계에 비례하여 이동하고 있습니다 ... 약 371km / s에서 별자리 leo
  쪽으로. "
@@ -48,7 +48,7 @@ metrics:
  - pearson_max
  - spearman_max
  model-index:
- - name: SentenceTransformer based on CocoRoF/ModernBERT-SimCSE_v02
+ - name: SentenceTransformer based on CocoRoF/ModernBERT-SimCSE_v04
  results:
  - task:
  type: semantic-similarity
@@ -58,46 +58,46 @@ model-index:
  type: sts_dev
  metrics:
  - type: pearson_cosine
- value: 0.8223949445074785
+ value: 0.7846905549925053
  name: Pearson Cosine
  - type: spearman_cosine
- value: 0.8220107207834706
+ value: 0.7871247667333137
  name: Spearman Cosine
  - type: pearson_euclidean
- value: 0.7785831525283676
+ value: 0.7258848709796941
  name: Pearson Euclidean
  - type: spearman_euclidean
- value: 0.7815628643916452
+ value: 0.7208562515791448
  name: Spearman Euclidean
  - type: pearson_manhattan
- value: 0.7809119630672191
+ value: 0.7251869665655273
  name: Pearson Manhattan
  - type: spearman_manhattan
- value: 0.7846536514745763
+ value: 0.7202883259106225
  name: Spearman Manhattan
  - type: pearson_dot
- value: 0.7543765794886113
+ value: 0.62098630425604
  name: Pearson Dot
  - type: spearman_dot
- value: 0.7434525191412167
+ value: 0.6254562421139086
  name: Spearman Dot
  - type: pearson_max
- value: 0.8223949445074785
+ value: 0.7846905549925053
  name: Pearson Max
  - type: spearman_max
- value: 0.8220107207834706
+ value: 0.7871247667333137
  name: Spearman Max
  ---

- # SentenceTransformer based on CocoRoF/ModernBERT-SimCSE_v02
+ # SentenceTransformer based on CocoRoF/ModernBERT-SimCSE_v04

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [CocoRoF/ModernBERT-SimCSE_v02](https://huggingface.co/CocoRoF/ModernBERT-SimCSE_v02). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [CocoRoF/ModernBERT-SimCSE_v04](https://huggingface.co/CocoRoF/ModernBERT-SimCSE_v04). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [CocoRoF/ModernBERT-SimCSE_v02](https://huggingface.co/CocoRoF/ModernBERT-SimCSE_v02) <!-- at revision de4148c764893843e15a4e0b241fe308147a9aaa -->
+ - **Base model:** [CocoRoF/ModernBERT-SimCSE_v04](https://huggingface.co/CocoRoF/ModernBERT-SimCSE_v04) <!-- at revision 7d23b869258e5c726c0f536bccac7e873d510d66 -->
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
@@ -136,7 +136,7 @@ Then you can load this model and run inference.
  from sentence_transformers import SentenceTransformer

  # Download from the 🤗 Hub
- model = SentenceTransformer("CocoRoF/ModernBERT-SimCSE-multitask_v03")
+ model = SentenceTransformer("CocoRoF/ModernBERT-SimCSE-multitask_v04")
  # Run inference
  sentences = [
  '버스가 바쁜 길을 따라 운전한다.',
@@ -186,18 +186,18 @@ You can finetune this model on your own dataset.
  * Dataset: `sts_dev`
  * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

- | Metric | Value |
- |:-------------------|:----------|
- | pearson_cosine | 0.8224 |
- | spearman_cosine | 0.822 |
- | pearson_euclidean | 0.7786 |
- | spearman_euclidean | 0.7816 |
- | pearson_manhattan | 0.7809 |
- | spearman_manhattan | 0.7847 |
- | pearson_dot | 0.7544 |
- | spearman_dot | 0.7435 |
- | pearson_max | 0.8224 |
- | **spearman_max** | **0.822** |
+ | Metric | Value |
+ |:-------------------|:-----------|
+ | pearson_cosine | 0.7847 |
+ | spearman_cosine | 0.7871 |
+ | pearson_euclidean | 0.7259 |
+ | spearman_euclidean | 0.7209 |
+ | pearson_manhattan | 0.7252 |
+ | spearman_manhattan | 0.7203 |
+ | pearson_dot | 0.621 |
+ | spearman_dot | 0.6255 |
+ | pearson_max | 0.7847 |
+ | **spearman_max** | **0.7871** |

  <!--
  ## Bias, Risks and Limitations
@@ -224,7 +224,7 @@ You can finetune this model on your own dataset.
  | | sentence1 | sentence2 | score |
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
  | type | string | string | float |
- | details | <ul><li>min: 7 tokens</li><li>mean: 13.52 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 13.41 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
+ | details | <ul><li>min: 7 tokens</li><li>mean: 12.69 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 12.56 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
  * Samples:
  | sentence1 | sentence2 | score |
  |:------------------------------------|:------------------------------------------|:------------------|
@@ -249,7 +249,7 @@ You can finetune this model on your own dataset.
  | | sentence1 | sentence2 | score |
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
  | type | string | string | float |
- | details | <ul><li>min: 7 tokens</li><li>mean: 20.38 tokens</li><li>max: 52 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 20.52 tokens</li><li>max: 54 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 18.89 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.92 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
  * Samples:
  | sentence1 | sentence2 | score |
  |:-------------------------------------|:------------------------------------|:------------------|
@@ -275,7 +275,7 @@ You can finetune this model on your own dataset.
  - `num_train_epochs`: 10.0
  - `warmup_ratio`: 0.1
  - `push_to_hub`: True
- - `hub_model_id`: CocoRoF/ModernBERT-SimCSE-multitask_v03
+ - `hub_model_id`: CocoRoF/ModernBERT-SimCSE-multitask_v04
  - `hub_strategy`: checkpoint
  - `batch_sampler`: no_duplicates

@@ -362,7 +362,7 @@ You can finetune this model on your own dataset.
  - `use_legacy_prediction_loop`: False
  - `push_to_hub`: True
  - `resume_from_checkpoint`: None
- - `hub_model_id`: CocoRoF/ModernBERT-SimCSE-multitask_v03
+ - `hub_model_id`: CocoRoF/ModernBERT-SimCSE-multitask_v04
  - `hub_strategy`: checkpoint
  - `hub_private_repo`: None
  - `hub_always_push`: False
@@ -403,50 +403,50 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | Validation Loss | sts_dev_spearman_max |
  |:------:|:----:|:-------------:|:---------------:|:--------------------:|
- | 0.2228 | 10 | 0.0283 | - | - |
- | 0.4457 | 20 | 0.0344 | - | - |
- | 0.6685 | 30 | 0.0305 | 0.0310 | 0.7939 |
- | 0.8914 | 40 | 0.0489 | - | - |
- | 1.1337 | 50 | 0.0382 | - | - |
- | 1.3565 | 60 | 0.0271 | 0.0293 | 0.7994 |
- | 1.5794 | 70 | 0.0344 | - | - |
- | 1.8022 | 80 | 0.0382 | - | - |
- | 2.0446 | 90 | 0.0419 | 0.0280 | 0.8059 |
- | 2.2674 | 100 | 0.0244 | - | - |
- | 2.4903 | 110 | 0.0307 | - | - |
- | 2.7131 | 120 | 0.0291 | 0.0269 | 0.8108 |
- | 2.9359 | 130 | 0.038 | - | - |
- | 3.1783 | 140 | 0.0269 | - | - |
- | 3.4011 | 150 | 0.0268 | 0.0262 | 0.8155 |
- | 3.6240 | 160 | 0.0246 | - | - |
- | 3.8468 | 170 | 0.0313 | - | - |
- | 4.0891 | 180 | 0.0303 | 0.0259 | 0.8185 |
- | 4.3120 | 190 | 0.0198 | - | - |
- | 4.5348 | 200 | 0.0257 | - | - |
- | 4.7577 | 210 | 0.0242 | 0.0255 | 0.8202 |
- | 4.9805 | 220 | 0.0293 | - | - |
- | 5.2228 | 230 | 0.0193 | - | - |
- | 5.4457 | 240 | 0.0222 | 0.0254 | 0.8222 |
- | 5.6685 | 250 | 0.0184 | - | - |
- | 5.8914 | 260 | 0.0243 | - | - |
- | 6.1337 | 270 | 0.0204 | 0.0254 | 0.8235 |
- | 6.3565 | 280 | 0.0147 | - | - |
- | 6.5794 | 290 | 0.0196 | - | - |
- | 6.8022 | 300 | 0.0176 | 0.0253 | 0.8227 |
- | 7.0446 | 310 | 0.0202 | - | - |
- | 7.2674 | 320 | 0.0123 | - | - |
- | 7.4903 | 330 | 0.0151 | 0.0254 | 0.8236 |
- | 7.7131 | 340 | 0.0132 | - | - |
- | 7.9359 | 350 | 0.0158 | - | - |
- | 8.1783 | 360 | 0.0118 | 0.0256 | 0.8240 |
- | 8.4011 | 370 | 0.0115 | - | - |
- | 8.6240 | 380 | 0.0105 | - | - |
- | 8.8468 | 390 | 0.0111 | 0.0256 | 0.8215 |
- | 9.0891 | 400 | 0.011 | - | - |
- | 9.3120 | 410 | 0.0076 | - | - |
- | 9.5348 | 420 | 0.0091 | 0.0256 | 0.8220 |
- | 9.7577 | 430 | 0.0075 | - | - |
- | 9.9805 | 440 | 0.0093 | - | - |
+ | 0.2228 | 10 | 0.0285 | - | - |
+ | 0.4457 | 20 | 0.0396 | - | - |
+ | 0.6685 | 30 | 0.0396 | 0.0376 | 0.7647 |
+ | 0.8914 | 40 | 0.0594 | - | - |
+ | 1.1337 | 50 | 0.0438 | - | - |
+ | 1.3565 | 60 | 0.0302 | 0.0358 | 0.7723 |
+ | 1.5794 | 70 | 0.0398 | - | - |
+ | 1.8022 | 80 | 0.0457 | - | - |
+ | 2.0446 | 90 | 0.0464 | 0.0347 | 0.7805 |
+ | 2.2674 | 100 | 0.026 | - | - |
+ | 2.4903 | 110 | 0.0331 | - | - |
+ | 2.7131 | 120 | 0.0318 | 0.0329 | 0.7837 |
+ | 2.9359 | 130 | 0.0399 | - | - |
+ | 3.1783 | 140 | 0.0264 | - | - |
+ | 3.4011 | 150 | 0.0268 | 0.0332 | 0.7884 |
+ | 3.6240 | 160 | 0.0241 | - | - |
+ | 3.8468 | 170 | 0.0309 | - | - |
+ | 4.0891 | 180 | 0.0263 | 0.0326 | 0.7918 |
+ | 4.3120 | 190 | 0.0164 | - | - |
+ | 4.5348 | 200 | 0.0226 | - | - |
+ | 4.7577 | 210 | 0.0196 | 0.0314 | 0.7896 |
+ | 4.9805 | 220 | 0.0217 | - | - |
+ | 5.2228 | 230 | 0.0134 | - | - |
+ | 5.4457 | 240 | 0.0157 | 0.0320 | 0.7911 |
+ | 5.6685 | 250 | 0.0136 | - | - |
+ | 5.8914 | 260 | 0.0143 | - | - |
+ | 6.1337 | 270 | 0.0114 | 0.0322 | 0.7907 |
+ | 6.3565 | 280 | 0.0077 | - | - |
+ | 6.5794 | 290 | 0.0116 | - | - |
+ | 6.8022 | 300 | 0.0087 | 0.0313 | 0.7868 |
+ | 7.0446 | 310 | 0.0088 | - | - |
+ | 7.2674 | 320 | 0.0048 | - | - |
+ | 7.4903 | 330 | 0.0068 | 0.0317 | 0.7895 |
+ | 7.7131 | 340 | 0.006 | - | - |
+ | 7.9359 | 350 | 0.0051 | - | - |
+ | 8.1783 | 360 | 0.0039 | 0.0323 | 0.7882 |
+ | 8.4011 | 370 | 0.0036 | - | - |
+ | 8.6240 | 380 | 0.0045 | - | - |
+ | 8.8468 | 390 | 0.0032 | 0.0317 | 0.7841 |
+ | 9.0891 | 400 | 0.0031 | - | - |
+ | 9.3120 | 410 | 0.0021 | - | - |
+ | 9.5348 | 420 | 0.0029 | 0.0323 | 0.7871 |
+ | 9.7577 | 430 | 0.0023 | - | - |
+ | 9.9805 | 440 | 0.0027 | - | - |


  ### Framework Versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4d9b65e72c69ee7ad20852d629dd9265d4a591df173662edc0ed58bcefc3cbeb
+ oid sha256:0869c16bd8ae16b638ef0de4e504f3e8f3a1c215f6ed1b812d8aa22835f41aff
  size 610640632
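
The two `.safetensors` entries in this commit are Git LFS pointer files: three text lines giving the spec version, a `sha256` object id, and the size in bytes. Only the object id changed for each file; the sizes are identical before and after. A minimal sketch of how a downloaded weight file could be checked against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical (they are not part of Git LFS or the Hub tooling), and the local file path is whatever you downloaded.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>", e.g. "sha256:0869c1...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text for model.safetensors as it stands after this commit.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0869c16bd8ae16b638ef0de4e504f3e8f3a1c215f6ed1b812d8aa22835f41aff
size 610640632
"""

pointer = parse_lfs_pointer(NEW_POINTER)
```

With a local copy of the weights, `matches_pointer("model.safetensors", pointer)` would indicate whether the file corresponds to this commit rather than to the parent commit 221e5a3.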