benjamintli committed
Commit 8d7c40a · verified · 1 parent: ece9591

End of training

Files changed (1): README.md (+57 −57)

README.md CHANGED
@@ -7,7 +7,7 @@ tags:
  - generated_from_trainer
  - dataset_size:8118
  - loss:CachedMultipleNegativesRankingLoss
- base_model: answerdotai/ModernBERT-base
+ base_model: benjamintli/modernbert-cosqa
  widget:
  - source_sentence: python create path if doesnt exist
  sentences:
@@ -101,7 +101,7 @@ metrics:
  - cosine_mrr@10
  - cosine_map@100
  model-index:
- - name: SentenceTransformer based on answerdotai/ModernBERT-base
+ - name: SentenceTransformer based on benjamintli/modernbert-cosqa
  results:
  - task:
  type: information-retrieval
@@ -111,61 +111,61 @@ model-index:
  type: eval
  metrics:
  - type: cosine_accuracy@1
- value: 0.61529933481153
+ value: 0.6197339246119734
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value: 0.8791574279379157
+ value: 0.88470066518847
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value: 0.9356984478935698
+ value: 0.9390243902439024
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
- value: 0.9733924611973392
+ value: 0.9778270509977827
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.61529933481153
+ value: 0.6197339246119734
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.2930524759793052
+ value: 0.29490022172949004
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.187139689578714
+ value: 0.18780487804878046
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.09733924611973392
+ value: 0.0977827050997783
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.61529933481153
+ value: 0.6197339246119734
  name: Cosine Recall@1
  - type: cosine_recall@3
- value: 0.8791574279379157
+ value: 0.88470066518847
  name: Cosine Recall@3
  - type: cosine_recall@5
- value: 0.9356984478935698
+ value: 0.9390243902439024
  name: Cosine Recall@5
  - type: cosine_recall@10
- value: 0.9733924611973392
+ value: 0.9778270509977827
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.8075594888103552
+ value: 0.8124675617500997
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.7526867103086619
+ value: 0.7577473339668463
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.7539697212778319
+ value: 0.7588050805217604
  name: Cosine Map@100
  ---
 
- # SentenceTransformer based on answerdotai/ModernBERT-base
+ # SentenceTransformer based on benjamintli/modernbert-cosqa
 
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [benjamintli/modernbert-cosqa](https://huggingface.co/benjamintli/modernbert-cosqa). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
  ## Model Details
 
  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
+ - **Base model:** [benjamintli/modernbert-cosqa](https://huggingface.co/benjamintli/modernbert-cosqa) <!-- at revision c85b25617894d583fafad7eb7421b7dc0aab0ad9 -->
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
@@ -221,7 +221,7 @@ print(query_embeddings.shape, document_embeddings.shape)
  # Get the similarity scores for the embeddings
  similarities = model.similarity(query_embeddings, document_embeddings)
  print(similarities)
- # tensor([[0.6000, 0.0149, 0.0027]])
+ # tensor([[ 0.5986, -0.0006, -0.0122]])
  ```
 
  <!--
@@ -259,21 +259,21 @@ You can finetune this model on your own dataset.
 
  | Metric | Value |
  |:--------------------|:-----------|
- | cosine_accuracy@1 | 0.6153 |
- | cosine_accuracy@3 | 0.8792 |
- | cosine_accuracy@5 | 0.9357 |
- | cosine_accuracy@10 | 0.9734 |
- | cosine_precision@1 | 0.6153 |
- | cosine_precision@3 | 0.2931 |
- | cosine_precision@5 | 0.1871 |
- | cosine_precision@10 | 0.0973 |
- | cosine_recall@1 | 0.6153 |
- | cosine_recall@3 | 0.8792 |
- | cosine_recall@5 | 0.9357 |
- | cosine_recall@10 | 0.9734 |
- | **cosine_ndcg@10** | **0.8076** |
- | cosine_mrr@10 | 0.7527 |
- | cosine_map@100 | 0.754 |
+ | cosine_accuracy@1 | 0.6197 |
+ | cosine_accuracy@3 | 0.8847 |
+ | cosine_accuracy@5 | 0.939 |
+ | cosine_accuracy@10 | 0.9778 |
+ | cosine_precision@1 | 0.6197 |
+ | cosine_precision@3 | 0.2949 |
+ | cosine_precision@5 | 0.1878 |
+ | cosine_precision@10 | 0.0978 |
+ | cosine_recall@1 | 0.6197 |
+ | cosine_recall@3 | 0.8847 |
+ | cosine_recall@5 | 0.939 |
+ | cosine_recall@10 | 0.9778 |
+ | **cosine_ndcg@10** | **0.8125** |
+ | cosine_mrr@10 | 0.7577 |
+ | cosine_map@100 | 0.7588 |
 
  <!--
  ## Bias, Risks and Limitations
@@ -360,7 +360,7 @@ You can finetune this model on your own dataset.
 
  - `per_device_train_batch_size`: 1024
  - `num_train_epochs`: 10
- - `learning_rate`: 2e-05
+ - `learning_rate`: 2e-06
  - `warmup_steps`: 0.1
  - `bf16`: True
  - `eval_strategy`: epoch
@@ -377,7 +377,7 @@ You can finetune this model on your own dataset.
  - `per_device_train_batch_size`: 1024
  - `num_train_epochs`: 10
  - `max_steps`: -1
- - `learning_rate`: 2e-05
+ - `learning_rate`: 2e-06
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: None
  - `warmup_steps`: 0.1
@@ -475,24 +475,24 @@ You can finetune this model on your own dataset.
  </details>
 
  ### Training Logs
- | Epoch | Step | Training Loss | Validation Loss | eval_cosine_ndcg@10 |
- |:--------:|:------:|:-------------:|:---------------:|:-------------------:|
- | 1.0 | 8 | - | 2.7837 | 0.3703 |
- | 1.25 | 10 | 6.1885 | - | - |
- | 2.0 | 16 | - | 1.4004 | 0.4896 |
- | 2.5 | 20 | 3.6826 | - | - |
- | 3.0 | 24 | - | 0.8114 | 0.6814 |
- | 3.75 | 30 | 2.2134 | - | - |
- | 4.0 | 32 | - | 0.5772 | 0.7412 |
- | 5.0 | 40 | 1.5999 | 0.4729 | 0.7684 |
- | 6.0 | 48 | - | 0.4246 | 0.7873 |
- | 6.25 | 50 | 1.3357 | - | - |
- | 7.0 | 56 | - | 0.3918 | 0.7978 |
- | 7.5 | 60 | 1.1768 | - | - |
- | 8.0 | 64 | - | 0.3711 | 0.8005 |
- | 8.75 | 70 | 1.0993 | - | - |
- | 9.0 | 72 | - | 0.3602 | 0.8064 |
- | **10.0** | **80** | **1.0152** | **0.3568** | **0.8076** |
+ | Epoch | Step | Training Loss | Validation Loss | eval_cosine_ndcg@10 |
+ |:-------:|:------:|:-------------:|:---------------:|:-------------------:|
+ | 1.0 | 8 | - | 0.3550 | 0.8071 |
+ | 1.25 | 10 | 1.0218 | - | - |
+ | 2.0 | 16 | - | 0.3508 | 0.8110 |
+ | 2.5 | 20 | 0.9890 | - | - |
+ | 3.0 | 24 | - | 0.3466 | 0.8131 |
+ | 3.75 | 30 | 0.9778 | - | - |
+ | 4.0 | 32 | - | 0.3439 | 0.8136 |
+ | **5.0** | **40** | **0.9507** | **0.3417** | **0.8148** |
+ | 6.0 | 48 | - | 0.3404 | 0.8120 |
+ | 6.25 | 50 | 0.9429 | - | - |
+ | 7.0 | 56 | - | 0.3387 | 0.8131 |
+ | 7.5 | 60 | 0.9267 | - | - |
+ | 8.0 | 64 | - | 0.3378 | 0.8127 |
+ | 8.75 | 70 | 0.9396 | - | - |
+ | 9.0 | 72 | - | 0.3370 | 0.8106 |
+ | 10.0 | 80 | 0.9099 | 0.3366 | 0.8125 |
 
  * The bold row denotes the saved checkpoint.
 
@@ -502,7 +502,7 @@ You can finetune this model on your own dataset.
  - Transformers: 5.3.0
  - PyTorch: 2.10.0+cu128
  - Accelerate: 1.13.0
- - Datasets: 4.7.0
+ - Datasets: 4.8.2
  - Tokenizers: 0.22.2
 
  ## Citation
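For context on the numbers this commit updates: the card's `cosine_accuracy@k` and `cosine_mrr@10` metrics are standard retrieval measures computed over the cosine-similarity matrix that `model.similarity` returns. A minimal, self-contained sketch of those computations on toy 3-dimensional embeddings (the vectors and document ids are illustrative stand-ins, not outputs of this model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def accuracy_at_k(ranked_ids, relevant_id, k):
    """1.0 if the relevant document appears in the top-k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0

def mrr_at_k(ranked_ids, relevant_id, k):
    """Reciprocal rank of the relevant document within the top-k (0.0 if absent)."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0

# Toy query/document embeddings (stand-ins for the model's 768-d outputs).
query = [0.9, 0.1, 0.0]
docs = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.5, 0.5, 0.0]}

# Rank documents by cosine similarity, highest first -- the ranking that
# the evaluator derives from the model.similarity score matrix.
ranked = sorted(docs, key=lambda i: cosine(query, docs[i]), reverse=True)

print(ranked)                       # [0, 2, 1]: doc 0 is the best match
print(accuracy_at_k(ranked, 0, 1))  # 1.0
print(mrr_at_k(ranked, 2, 10))      # 0.5: doc 2 sits at rank 2
```

In the card's evaluation these per-query values are averaged over the whole eval split, which is how a single query set yields fractional scores such as `cosine_accuracy@1 = 0.6197`.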