CocoRoF/ModernBERT-SimCSE-multitask_v04
- 2_Dense/model.safetensors +1 -1
- README.md +76 -76
- model.safetensors +1 -1
2_Dense/model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:31635e07aba0bf9ff1e49bb5cec91388f57ad0a789dbc32c0b7987315304f442
 size 2362528
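The pointer above records only a SHA-256 `oid` and a byte `size`, so a downloaded copy of the file can be checked against it locally. The following is a minimal sketch, assuming the weights have already been fetched to disk; the helper names `parse_lfs_pointer` and `matches_pointer` are illustrative and are not part of Git LFS or `huggingface_hub`.

```python
import hashlib
from pathlib import Path

# New pointer contents for 2_Dense/model.safetensors, copied from this commit.
NEW_POINTER = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:31635e07aba0bf9ff1e49bb5cec91388f57ad0a789dbc32c0b7987315304f442\n"
    "size 2362528\n"
)


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer_text: str) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    expected_oid = fields["oid"].split(":", 1)[1]  # drop the 'sha256:' prefix
    expected_size = int(fields["size"])

    # Cheap check first: a size mismatch means the hash cannot match.
    if path.stat().st_size != expected_size:
        return False

    # Hash in 1 MiB chunks so large weight files are not loaded into memory.
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid
```

With a local copy in place, `matches_pointer(Path("2_Dense/model.safetensors"), NEW_POINTER)` would report whether that copy corresponds to this commit; the path is a hypothetical download location.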
README.md
CHANGED
@@ -6,7 +6,7 @@ tags:
 - generated_from_trainer
 - dataset_size:5749
 - loss:CosineSimilarityLoss
-base_model: CocoRoF/ModernBERT-
+base_model: CocoRoF/ModernBERT-SimCSE_v04
 widget:
 - source_sentence: 우리는 움직이는 동행 우주 정지 좌표계에 비례하여 이동하고 있습니다 ... 약 371km / s에서 별자리 leo
     쪽으로. "
@@ -48,7 +48,7 @@ metrics:
 - pearson_max
 - spearman_max
 model-index:
-- name: SentenceTransformer based on CocoRoF/ModernBERT-
+- name: SentenceTransformer based on CocoRoF/ModernBERT-SimCSE_v04
   results:
   - task:
       type: semantic-similarity
@@ -58,46 +58,46 @@ model-index:
       type: sts_dev
     metrics:
     - type: pearson_cosine
-      value: 0.
+      value: 0.7846905549925053
       name: Pearson Cosine
     - type: spearman_cosine
-      value: 0.
+      value: 0.7871247667333137
       name: Spearman Cosine
     - type: pearson_euclidean
-      value: 0.
+      value: 0.7258848709796941
       name: Pearson Euclidean
     - type: spearman_euclidean
-      value: 0.
+      value: 0.7208562515791448
       name: Spearman Euclidean
     - type: pearson_manhattan
-      value: 0.
+      value: 0.7251869665655273
       name: Pearson Manhattan
     - type: spearman_manhattan
-      value: 0.
+      value: 0.7202883259106225
       name: Spearman Manhattan
     - type: pearson_dot
-      value: 0.
+      value: 0.62098630425604
       name: Pearson Dot
     - type: spearman_dot
-      value: 0.
+      value: 0.6254562421139086
       name: Spearman Dot
     - type: pearson_max
-      value: 0.
+      value: 0.7846905549925053
       name: Pearson Max
     - type: spearman_max
-      value: 0.
+      value: 0.7871247667333137
       name: Spearman Max
 ---
 
-# SentenceTransformer based on CocoRoF/ModernBERT-
+# SentenceTransformer based on CocoRoF/ModernBERT-SimCSE_v04
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [CocoRoF/ModernBERT-
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [CocoRoF/ModernBERT-SimCSE_v04](https://huggingface.co/CocoRoF/ModernBERT-SimCSE_v04). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
 ### Model Description
 - **Model Type:** Sentence Transformer
-- **Base model:** [CocoRoF/ModernBERT-
+- **Base model:** [CocoRoF/ModernBERT-SimCSE_v04](https://huggingface.co/CocoRoF/ModernBERT-SimCSE_v04) <!-- at revision 7d23b869258e5c726c0f536bccac7e873d510d66 -->
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 768 dimensions
 - **Similarity Function:** Cosine Similarity
@@ -136,7 +136,7 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 
 # Download from the 🤗 Hub
-model = SentenceTransformer("CocoRoF/ModernBERT-SimCSE-
+model = SentenceTransformer("CocoRoF/ModernBERT-SimCSE-multitask_v04")
 # Run inference
 sentences = [
     '버스가 바쁜 길을 따라 운전한다.',
@@ -186,18 +186,18 @@ You can finetune this model on your own dataset.
 * Dataset: `sts_dev`
 * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
 
-| Metric | Value
-
-| pearson_cosine | 0.
-| spearman_cosine | 0.
-| pearson_euclidean | 0.
-| spearman_euclidean | 0.
-| pearson_manhattan | 0.
-| spearman_manhattan | 0.
-| pearson_dot | 0.
-| spearman_dot | 0.
-| pearson_max | 0.
-| **spearman_max** | **0.
+| Metric | Value |
+|:-------------------|:-----------|
+| pearson_cosine | 0.7847 |
+| spearman_cosine | 0.7871 |
+| pearson_euclidean | 0.7259 |
+| spearman_euclidean | 0.7209 |
+| pearson_manhattan | 0.7252 |
+| spearman_manhattan | 0.7203 |
+| pearson_dot | 0.621 |
+| spearman_dot | 0.6255 |
+| pearson_max | 0.7847 |
+| **spearman_max** | **0.7871** |
 
 <!--
 ## Bias, Risks and Limitations
@@ -224,7 +224,7 @@ You can finetune this model on your own dataset.
 | | sentence1 | sentence2 | score |
 |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
 | type | string | string | float |
-| details | <ul><li>min: 7 tokens</li><li>mean:
+| details | <ul><li>min: 7 tokens</li><li>mean: 12.69 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 12.56 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
 * Samples:
 | sentence1 | sentence2 | score |
 |:------------------------------------|:------------------------------------------|:------------------|
@@ -249,7 +249,7 @@ You can finetune this model on your own dataset.
 | | sentence1 | sentence2 | score |
 |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
 | type | string | string | float |
-| details | <ul><li>min:
+| details | <ul><li>min: 6 tokens</li><li>mean: 18.89 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.92 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
 * Samples:
 | sentence1 | sentence2 | score |
 |:-------------------------------------|:------------------------------------|:------------------|
@@ -275,7 +275,7 @@ You can finetune this model on your own dataset.
 - `num_train_epochs`: 10.0
 - `warmup_ratio`: 0.1
 - `push_to_hub`: True
-- `hub_model_id`: CocoRoF/ModernBERT-SimCSE-
+- `hub_model_id`: CocoRoF/ModernBERT-SimCSE-multitask_v04
 - `hub_strategy`: checkpoint
 - `batch_sampler`: no_duplicates
 
@@ -362,7 +362,7 @@ You can finetune this model on your own dataset.
 - `use_legacy_prediction_loop`: False
 - `push_to_hub`: True
 - `resume_from_checkpoint`: None
-- `hub_model_id`: CocoRoF/ModernBERT-SimCSE-
+- `hub_model_id`: CocoRoF/ModernBERT-SimCSE-multitask_v04
 - `hub_strategy`: checkpoint
 - `hub_private_repo`: None
 - `hub_always_push`: False
@@ -403,50 +403,50 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | Training Loss | Validation Loss | sts_dev_spearman_max |
 |:------:|:----:|:-------------:|:---------------:|:--------------------:|
-| 0.2228 | 10 | 0.
-| 0.4457 | 20 | 0.
-| 0.6685 | 30 | 0.
-| 0.8914 | 40 | 0.
-| 1.1337 | 50 | 0.
-| 1.3565 | 60 | 0.
-| 1.5794 | 70 | 0.
-| 1.8022 | 80 | 0.
-| 2.0446 | 90 | 0.
-| 2.2674 | 100 | 0.
-| 2.4903 | 110 | 0.
-| 2.7131 | 120 | 0.
-| 2.9359 | 130 | 0.
-| 3.1783 | 140 | 0.
-| 3.4011 | 150 | 0.0268 | 0.
-| 3.6240 | 160 | 0.
-| 3.8468 | 170 | 0.
-| 4.0891 | 180 | 0.
-| 4.3120 | 190 | 0.
-| 4.5348 | 200 | 0.
-| 4.7577 | 210 | 0.
-| 4.9805 | 220 | 0.
-| 5.2228 | 230 | 0.
-| 5.4457 | 240 | 0.
-| 5.6685 | 250 | 0.
-| 5.8914 | 260 | 0.
-| 6.1337 | 270 | 0.
-| 6.3565 | 280 | 0.
-| 6.5794 | 290 | 0.
-| 6.8022 | 300 | 0.
-| 7.0446 | 310 | 0.
-| 7.2674 | 320 | 0.
-| 7.4903 | 330 | 0.
-| 7.7131 | 340 | 0.
-| 7.9359 | 350 | 0.
-| 8.1783 | 360 | 0.
-| 8.4011 | 370 | 0.
-| 8.6240 | 380 | 0.
-| 8.8468 | 390 | 0.
-| 9.0891 | 400 | 0.
-| 9.3120 | 410 | 0.
-| 9.5348 | 420 | 0.
-| 9.7577 | 430 | 0.
-| 9.9805 | 440 | 0.
+| 0.2228 | 10 | 0.0285 | - | - |
+| 0.4457 | 20 | 0.0396 | - | - |
+| 0.6685 | 30 | 0.0396 | 0.0376 | 0.7647 |
+| 0.8914 | 40 | 0.0594 | - | - |
+| 1.1337 | 50 | 0.0438 | - | - |
+| 1.3565 | 60 | 0.0302 | 0.0358 | 0.7723 |
+| 1.5794 | 70 | 0.0398 | - | - |
+| 1.8022 | 80 | 0.0457 | - | - |
+| 2.0446 | 90 | 0.0464 | 0.0347 | 0.7805 |
+| 2.2674 | 100 | 0.026 | - | - |
+| 2.4903 | 110 | 0.0331 | - | - |
+| 2.7131 | 120 | 0.0318 | 0.0329 | 0.7837 |
+| 2.9359 | 130 | 0.0399 | - | - |
+| 3.1783 | 140 | 0.0264 | - | - |
+| 3.4011 | 150 | 0.0268 | 0.0332 | 0.7884 |
+| 3.6240 | 160 | 0.0241 | - | - |
+| 3.8468 | 170 | 0.0309 | - | - |
+| 4.0891 | 180 | 0.0263 | 0.0326 | 0.7918 |
+| 4.3120 | 190 | 0.0164 | - | - |
+| 4.5348 | 200 | 0.0226 | - | - |
+| 4.7577 | 210 | 0.0196 | 0.0314 | 0.7896 |
+| 4.9805 | 220 | 0.0217 | - | - |
+| 5.2228 | 230 | 0.0134 | - | - |
+| 5.4457 | 240 | 0.0157 | 0.0320 | 0.7911 |
+| 5.6685 | 250 | 0.0136 | - | - |
+| 5.8914 | 260 | 0.0143 | - | - |
+| 6.1337 | 270 | 0.0114 | 0.0322 | 0.7907 |
+| 6.3565 | 280 | 0.0077 | - | - |
+| 6.5794 | 290 | 0.0116 | - | - |
+| 6.8022 | 300 | 0.0087 | 0.0313 | 0.7868 |
+| 7.0446 | 310 | 0.0088 | - | - |
+| 7.2674 | 320 | 0.0048 | - | - |
+| 7.4903 | 330 | 0.0068 | 0.0317 | 0.7895 |
+| 7.7131 | 340 | 0.006 | - | - |
+| 7.9359 | 350 | 0.0051 | - | - |
+| 8.1783 | 360 | 0.0039 | 0.0323 | 0.7882 |
+| 8.4011 | 370 | 0.0036 | - | - |
+| 8.6240 | 380 | 0.0045 | - | - |
+| 8.8468 | 390 | 0.0032 | 0.0317 | 0.7841 |
+| 9.0891 | 400 | 0.0031 | - | - |
+| 9.3120 | 410 | 0.0021 | - | - |
+| 9.5348 | 420 | 0.0029 | 0.0323 | 0.7871 |
+| 9.7577 | 430 | 0.0023 | - | - |
+| 9.9805 | 440 | 0.0027 | - | - |
 
 
 ### Framework Versions
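The evaluation metrics and training log in the README diff above can be cross-checked with a few lines of Python. This sketch uses only values copied from the updated README. It confirms that `pearson_max` and `spearman_max` equal the maximum over the four similarity functions, and it finds which logged evaluation step scored best. The helper name `best_over_similarities` is illustrative and is not a sentence-transformers function.

```python
# sts_dev metrics from the updated model-index block (the *_max entries are
# deliberately left out so they can be recomputed).
sts_dev = {
    "pearson_cosine": 0.7846905549925053,
    "spearman_cosine": 0.7871247667333137,
    "pearson_euclidean": 0.7258848709796941,
    "spearman_euclidean": 0.7208562515791448,
    "pearson_manhattan": 0.7251869665655273,
    "spearman_manhattan": 0.7202883259106225,
    "pearson_dot": 0.62098630425604,
    "spearman_dot": 0.6254562421139086,
}


def best_over_similarities(metrics: dict, prefix: str) -> float:
    """Maximum of one correlation type across all similarity functions."""
    return max(v for k, v in metrics.items() if k.startswith(prefix + "_"))


# (step, validation loss, sts_dev_spearman_max) for the rows of the training
# log that include an evaluation; the other rows only log training loss.
eval_rows = [
    (30, 0.0376, 0.7647), (60, 0.0358, 0.7723), (90, 0.0347, 0.7805),
    (120, 0.0329, 0.7837), (150, 0.0332, 0.7884), (180, 0.0326, 0.7918),
    (210, 0.0314, 0.7896), (240, 0.0320, 0.7911), (270, 0.0322, 0.7907),
    (300, 0.0313, 0.7868), (330, 0.0317, 0.7895), (360, 0.0323, 0.7882),
    (390, 0.0317, 0.7841), (420, 0.0323, 0.7871),
]

best_spearman = max(eval_rows, key=lambda row: row[2])  # highest Spearman
lowest_loss = min(eval_rows, key=lambda row: row[1])    # lowest validation loss
last_eval = eval_rows[-1]                               # final logged evaluation

print("pearson_max :", best_over_similarities(sts_dev, "pearson"))
print("spearman_max:", best_over_similarities(sts_dev, "spearman"))
print("best Spearman at step", best_spearman[0])
print("lowest validation loss at step", lowest_loss[0])
print("last evaluation at step", last_eval[0])
```

On these numbers the best logged Spearman score (0.7918) occurs at step 180, while the reported `spearman_max` of 0.7871 matches the last evaluation at step 420. That suggests, but does not prove, that the final checkpoint rather than the best one was evaluated for the model card.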
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0869c16bd8ae16b638ef0de4e504f3e8f3a1c215f6ed1b812d8aa22835f41aff
 size 610640632
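Both weight files keep the same `size` in this commit while their `oid` changes, which is what one would expect when tensor values are retrained but tensor shapes stay fixed. The byte counts also allow a rough sanity check, sketched below under assumptions the diff does not confirm: that the weights are stored as float32, and that the `2_Dense` module is a 768-to-768 linear layer with bias (768 is the output dimensionality stated in the README).

```python
BYTES_PER_F32 = 4  # assumption: weights are stored as 32-bit floats

# Sizes copied from the two Git LFS pointers in this commit.
dense_file_size = 2362528      # 2_Dense/model.safetensors
model_file_size = 610640632    # model.safetensors

# Assumption: the Dense module is a 768 -> 768 linear layer with a bias vector.
dense_params = 768 * 768 + 768
dense_tensor_bytes = dense_params * BYTES_PER_F32

# Whatever is left over would be file-format overhead (for example a header).
dense_overhead = dense_file_size - dense_tensor_bytes

# Upper bound on the main model's parameter count, ignoring any header bytes.
approx_model_params = model_file_size // BYTES_PER_F32

print("dense tensor bytes:", dense_tensor_bytes)
print("dense overhead bytes:", dense_overhead)
print("approx. model parameters:", approx_model_params)
```

Under these assumptions the Dense file would hold 2,362,368 bytes of tensor data plus 160 bytes of overhead, and the main file would hold at most about 152.7 million float32 parameters. Both figures are consistency checks on the stated sizes, not facts read from the files themselves.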