radoslavralev committed
Commit 9a504f3 · verified · 1 Parent(s): 46438de

Add new SentenceTransformer model

Files changed (1)
  1. README.md +90 -92
README.md CHANGED
@@ -7,7 +7,7 @@ tags:
  - generated_from_trainer
  - dataset_size:90000
  - loss:MultipleNegativesRankingLoss
- base_model: sentence-transformers/all-MiniLM-L6-v2
  widget:
  - source_sentence: who is the publisher of the norton anthology american literature
  sentences:
@@ -154,7 +154,7 @@ metrics:
  - cosine_mrr@10
  - cosine_map@100
  model-index:
- - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
  results:
  - task:
  type: information-retrieval
@@ -164,49 +164,49 @@ model-index:
  type: NanoMSMARCO
  metrics:
  - type: cosine_accuracy@1
- value: 0.24
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value: 0.52
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
  value: 0.64
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
- value: 0.76
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.24
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.1733333333333333
  name: Cosine Precision@3
  - type: cosine_precision@5
  value: 0.128
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.07600000000000001
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.24
  name: Cosine Recall@1
  - type: cosine_recall@3
- value: 0.52
  name: Cosine Recall@3
  - type: cosine_recall@5
  value: 0.64
  name: Cosine Recall@5
  - type: cosine_recall@10
- value: 0.76
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.49143458252672184
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.4057380952380952
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.4167225925544634
  name: Cosine Map@100
  - task:
  type: information-retrieval
@@ -216,49 +216,49 @@ model-index:
  type: NanoNQ
  metrics:
  - type: cosine_accuracy@1
- value: 0.46
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
  value: 0.62
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value: 0.66
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
- value: 0.7
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.46
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.20666666666666664
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.136
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.07600000000000001
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.45
  name: Cosine Recall@1
  - type: cosine_recall@3
- value: 0.58
  name: Cosine Recall@3
  - type: cosine_recall@5
- value: 0.63
  name: Cosine Recall@5
  - type: cosine_recall@10
- value: 0.69
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.5748041984771892
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.5456666666666667
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.5421576333440519
  name: Cosine Map@100
  - task:
  type: nano-beir
@@ -268,61 +268,61 @@ model-index:
  type: NanoBEIR_mean
  metrics:
  - type: cosine_accuracy@1
- value: 0.35
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
- value: 0.5700000000000001
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
- value: 0.65
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
- value: 0.73
  name: Cosine Accuracy@10
  - type: cosine_precision@1
- value: 0.35
  name: Cosine Precision@1
  - type: cosine_precision@3
- value: 0.18999999999999997
  name: Cosine Precision@3
  - type: cosine_precision@5
- value: 0.132
  name: Cosine Precision@5
  - type: cosine_precision@10
- value: 0.07600000000000001
  name: Cosine Precision@10
  - type: cosine_recall@1
- value: 0.345
  name: Cosine Recall@1
  - type: cosine_recall@3
- value: 0.55
  name: Cosine Recall@3
  - type: cosine_recall@5
- value: 0.635
  name: Cosine Recall@5
  - type: cosine_recall@10
- value: 0.725
  name: Cosine Recall@10
  - type: cosine_ndcg@10
- value: 0.5331193905019556
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
- value: 0.47570238095238093
  name: Cosine Mrr@10
  - type: cosine_map@100
- value: 0.47944011294925765
  name: Cosine Map@100
  ---

- # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
  - **Maximum Sequence Length:** 128 tokens
  - **Output Dimensionality:** 384 dimensions
  - **Similarity Function:** Cosine Similarity
@@ -375,9 +375,9 @@ print(embeddings.shape)
  # Get the similarity scores for the embeddings
  similarities = model.similarity(embeddings, embeddings)
  print(similarities)
- # tensor([[1.0000, 0.7393, 0.1251],
- #         [0.7393, 1.0000, 0.1255],
- #         [0.1251, 0.1255, 1.0000]])
  ```

  <!--
@@ -415,21 +415,21 @@ You can finetune this model on your own dataset.

  | Metric | NanoMSMARCO | NanoNQ |
  |:--------------------|:------------|:-----------|
- | cosine_accuracy@1 | 0.24 | 0.46 |
- | cosine_accuracy@3 | 0.52 | 0.62 |
- | cosine_accuracy@5 | 0.64 | 0.66 |
- | cosine_accuracy@10 | 0.76 | 0.7 |
- | cosine_precision@1 | 0.24 | 0.46 |
- | cosine_precision@3 | 0.1733 | 0.2067 |
- | cosine_precision@5 | 0.128 | 0.136 |
- | cosine_precision@10 | 0.076 | 0.076 |
- | cosine_recall@1 | 0.24 | 0.45 |
- | cosine_recall@3 | 0.52 | 0.58 |
- | cosine_recall@5 | 0.64 | 0.63 |
- | cosine_recall@10 | 0.76 | 0.69 |
- | **cosine_ndcg@10** | **0.4914** | **0.5748** |
- | cosine_mrr@10 | 0.4057 | 0.5457 |
- | cosine_map@100 | 0.4167 | 0.5422 |

  #### Nano BEIR

@@ -447,21 +447,21 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:--------------------|:-----------|
- | cosine_accuracy@1 | 0.35 |
- | cosine_accuracy@3 | 0.57 |
- | cosine_accuracy@5 | 0.65 |
- | cosine_accuracy@10 | 0.73 |
- | cosine_precision@1 | 0.35 |
- | cosine_precision@3 | 0.19 |
- | cosine_precision@5 | 0.132 |
- | cosine_precision@10 | 0.076 |
- | cosine_recall@1 | 0.345 |
- | cosine_recall@3 | 0.55 |
- | cosine_recall@5 | 0.635 |
- | cosine_recall@10 | 0.725 |
- | **cosine_ndcg@10** | **0.5331** |
- | cosine_mrr@10 | 0.4757 |
- | cosine_map@100 | 0.4794 |

  <!--
  ## Bias, Risks and Limitations
@@ -535,9 +535,9 @@ You can finetune this model on your own dataset.
  - `eval_strategy`: steps
  - `per_device_train_batch_size`: 128
  - `per_device_eval_batch_size`: 128
- - `learning_rate`: 0.0001
- - `weight_decay`: 0.001
- - `max_steps`: 1687
  - `warmup_ratio`: 0.1
  - `fp16`: True
  - `dataloader_drop_last`: True
@@ -564,14 +564,14 @@ You can finetune this model on your own dataset.
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
- - `learning_rate`: 0.0001
- - `weight_decay`: 0.001
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
  - `num_train_epochs`: 3.0
- - `max_steps`: 1687
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
  - `warmup_ratio`: 0.1
@@ -678,13 +678,11 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
  |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
- | 0 | 0 | - | 0.0803 | 0.5540 | 0.5931 | 0.5735 |
- | 0.3556 | 250 | 0.0897 | 0.0764 | 0.5354 | 0.5945 | 0.5650 |
- | 0.7112 | 500 | 0.0917 | 0.0749 | 0.5252 | 0.5638 | 0.5445 |
- | 1.0669 | 750 | 0.0823 | 0.0653 | 0.5267 | 0.5904 | 0.5586 |
- | 1.4225 | 1000 | 0.0459 | 0.0630 | 0.5236 | 0.5689 | 0.5462 |
- | 1.7781 | 1250 | 0.0429 | 0.0613 | 0.5312 | 0.5676 | 0.5494 |
- | 2.1337 | 1500 | 0.0393 | 0.0600 | 0.4914 | 0.5748 | 0.5331 |


  ### Framework Versions
 
  - generated_from_trainer
  - dataset_size:90000
  - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-MiniLM-L12-v2
  widget:
  - source_sentence: who is the publisher of the norton anthology american literature
  sentences:
 
  - cosine_mrr@10
  - cosine_map@100
  model-index:
+ - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L12-v2
  results:
  - task:
  type: information-retrieval
 
  type: NanoMSMARCO
  metrics:
  - type: cosine_accuracy@1
+ value: 0.34
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.54
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
  value: 0.64
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
+ value: 0.78
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.34
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.18
  name: Cosine Precision@3
  - type: cosine_precision@5
  value: 0.128
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.07800000000000001
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.34
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.54
  name: Cosine Recall@3
  - type: cosine_recall@5
  value: 0.64
  name: Cosine Recall@5
  - type: cosine_recall@10
+ value: 0.78
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.5447080049645561
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.47073809523809523
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.4806962957327628
  name: Cosine Map@100
  - task:
  type: information-retrieval
 
  type: NanoNQ
  metrics:
  - type: cosine_accuracy@1
+ value: 0.44
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
  value: 0.62
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.7
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
+ value: 0.78
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.44
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.21333333333333332
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.14800000000000002
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.08199999999999999
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.43
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.61
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.67
  name: Cosine Recall@5
  - type: cosine_recall@10
+ value: 0.74
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.5924173512360595
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.5506349206349206
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.5491036387356644
  name: Cosine Map@100
  - task:
  type: nano-beir
 
  type: NanoBEIR_mean
  metrics:
  - type: cosine_accuracy@1
+ value: 0.39
  name: Cosine Accuracy@1
  - type: cosine_accuracy@3
+ value: 0.5800000000000001
  name: Cosine Accuracy@3
  - type: cosine_accuracy@5
+ value: 0.6699999999999999
  name: Cosine Accuracy@5
  - type: cosine_accuracy@10
+ value: 0.78
  name: Cosine Accuracy@10
  - type: cosine_precision@1
+ value: 0.39
  name: Cosine Precision@1
  - type: cosine_precision@3
+ value: 0.19666666666666666
  name: Cosine Precision@3
  - type: cosine_precision@5
+ value: 0.138
  name: Cosine Precision@5
  - type: cosine_precision@10
+ value: 0.08
  name: Cosine Precision@10
  - type: cosine_recall@1
+ value: 0.385
  name: Cosine Recall@1
  - type: cosine_recall@3
+ value: 0.575
  name: Cosine Recall@3
  - type: cosine_recall@5
+ value: 0.655
  name: Cosine Recall@5
  - type: cosine_recall@10
+ value: 0.76
  name: Cosine Recall@10
  - type: cosine_ndcg@10
+ value: 0.5685626781003078
  name: Cosine Ndcg@10
  - type: cosine_mrr@10
+ value: 0.5106865079365079
  name: Cosine Mrr@10
  - type: cosine_map@100
+ value: 0.5148999672342136
  name: Cosine Map@100
  ---

+ # SentenceTransformer based on sentence-transformers/all-MiniLM-L12-v2

+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2) <!-- at revision 936af83a2ecce5fe87a09109ff5cbcefe073173a -->
  - **Maximum Sequence Length:** 128 tokens
  - **Output Dimensionality:** 384 dimensions
  - **Similarity Function:** Cosine Similarity
 
  # Get the similarity scores for the embeddings
  similarities = model.similarity(embeddings, embeddings)
  print(similarities)
+ # tensor([[ 1.0000,  0.7187, -0.0053],
+ #         [ 0.7187,  1.0000,  0.0412],
+ #         [-0.0053,  0.0412,  1.0000]])
  ```

  <!--
 
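Note on the similarity output in the hunk above: the model card lists cosine similarity as the similarity function, and the printed matrix has the shape one would expect from pairwise cosine similarity (symmetric, with 1.0000 on the diagonal). The following is a minimal plain-Python sketch of that computation on hypothetical 3-dimensional vectors. It is not the library's implementation; `cosine_similarity_matrix` and the `toy` vectors are made up for illustration.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return the matrix of pairwise cosine similarities between vectors."""
    # Euclidean norm of each vector, computed once up front.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
            for v, norm_v in zip(vectors, norms)
        ]
        for u, norm_u in zip(vectors, norms)
    ]


# Hypothetical stand-ins for embeddings (real ones have 384 dimensions).
toy = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]
sims = cosine_similarity_matrix(toy)
for row in sims:
    print([round(value, 4) for value in row])
```

Orthogonal vectors score 0, and vector length does not matter, which is why the third toy vector scores 0 against the first despite its larger magnitude.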

  | Metric | NanoMSMARCO | NanoNQ |
  |:--------------------|:------------|:-----------|
+ | cosine_accuracy@1 | 0.34 | 0.44 |
+ | cosine_accuracy@3 | 0.54 | 0.62 |
+ | cosine_accuracy@5 | 0.64 | 0.7 |
+ | cosine_accuracy@10 | 0.78 | 0.78 |
+ | cosine_precision@1 | 0.34 | 0.44 |
+ | cosine_precision@3 | 0.18 | 0.2133 |
+ | cosine_precision@5 | 0.128 | 0.148 |
+ | cosine_precision@10 | 0.078 | 0.082 |
+ | cosine_recall@1 | 0.34 | 0.43 |
+ | cosine_recall@3 | 0.54 | 0.61 |
+ | cosine_recall@5 | 0.64 | 0.67 |
+ | cosine_recall@10 | 0.78 | 0.74 |
+ | **cosine_ndcg@10** | **0.5447** | **0.5924** |
+ | cosine_mrr@10 | 0.4707 | 0.5506 |
+ | cosine_map@100 | 0.4807 | 0.5491 |

  #### Nano BEIR

 
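Note on reading the per-dataset table: the sketch below computes accuracy@k, precision@k, recall@k and the reciprocal rank for a single query, using the standard textbook definitions. It is an illustration only and is not taken from the evaluator that produced these numbers; the function name `retrieval_metrics` and the document ids are hypothetical.

```python
def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id in relevant_ids for doc_id in top_k]
    num_hits = sum(hits)

    # Reciprocal rank of the first relevant document, 0.0 if none is in the top k.
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / rank
            break

    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids),
        "mrr": reciprocal_rank,
    }


# Hypothetical query with one relevant document, retrieved at rank 3.
scores = retrieval_metrics(["d7", "d2", "d5", "d9"], {"d5"}, k=3)
print(scores)
```

The reported values are averages of such per-query scores. When every query has exactly one relevant document, accuracy@k equals recall@k and precision@k equals accuracy@k divided by k. The NanoMSMARCO column is consistent with that pattern (for example 0.78 and 0.078 at k = 10), though the diff itself does not state how many relevant passages each query has.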

  | Metric | Value |
  |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.39 |
+ | cosine_accuracy@3 | 0.58 |
+ | cosine_accuracy@5 | 0.67 |
+ | cosine_accuracy@10 | 0.78 |
+ | cosine_precision@1 | 0.39 |
+ | cosine_precision@3 | 0.1967 |
+ | cosine_precision@5 | 0.138 |
+ | cosine_precision@10 | 0.08 |
+ | cosine_recall@1 | 0.385 |
+ | cosine_recall@3 | 0.575 |
+ | cosine_recall@5 | 0.655 |
+ | cosine_recall@10 | 0.76 |
+ | **cosine_ndcg@10** | **0.5686** |
+ | cosine_mrr@10 | 0.5107 |
+ | cosine_map@100 | 0.5149 |

  <!--
  ## Bias, Risks and Limitations
 
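Note on the `NanoBEIR_mean` values: they appear to be the unweighted mean of the NanoMSMARCO and NanoNQ columns. The sketch below checks this for a few rows, with values copied from the updated tables in this diff. The tolerance is an assumption that allows for the tables being rounded to four decimals.

```python
# (NanoMSMARCO, NanoNQ) values copied from the updated per-dataset table.
per_dataset = {
    "cosine_accuracy@1": (0.34, 0.44),
    "cosine_recall@10": (0.78, 0.74),
    "cosine_ndcg@10": (0.5447, 0.5924),
    "cosine_mrr@10": (0.4707, 0.5506),
    "cosine_map@100": (0.4807, 0.5491),
}

# Values copied from the updated NanoBEIR mean table.
reported_mean = {
    "cosine_accuracy@1": 0.39,
    "cosine_recall@10": 0.76,
    "cosine_ndcg@10": 0.5686,
    "cosine_mrr@10": 0.5107,
    "cosine_map@100": 0.5149,
}

# Assumed tolerance: the tables are rounded to four decimal places.
TOLERANCE = 1e-4


def matches_reported_mean(metric):
    """Check that the mean of the two datasets agrees with the reported value."""
    values = per_dataset[metric]
    mean = sum(values) / len(values)
    return abs(mean - reported_mean[metric]) < TOLERANCE


all_match = all(matches_reported_mean(metric) for metric in per_dataset)
print(all_match)
```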
  - `eval_strategy`: steps
  - `per_device_train_batch_size`: 128
  - `per_device_eval_batch_size`: 128
+ - `learning_rate`: 8e-05
+ - `weight_decay`: 0.005
+ - `max_steps`: 1125
  - `warmup_ratio`: 0.1
  - `fp16`: True
  - `dataloader_drop_last`: True
 
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
+ - `learning_rate`: 8e-05
+ - `weight_decay`: 0.005
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
  - `num_train_epochs`: 3.0
+ - `max_steps`: 1125
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
  - `warmup_ratio`: 0.1
 
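Note on `max_steps` versus `num_train_epochs`: in the Transformers training arguments, a positive `max_steps` overrides `num_train_epochs`, so the run stops at 1125 steps even though 3.0 epochs are listed. The sketch below relates steps to epochs using values from this diff (`dataset_size:90000`, batch size 128, `dataloader_drop_last`: True, `gradient_accumulation_steps`: 1). It assumes training on a single device, which the diff does not state.

```python
dataset_size = 90_000   # from the `dataset_size:90000` tag
batch_size = 128        # per_device_train_batch_size (single device assumed)

# With dataloader_drop_last=True the final partial batch is discarded,
# so the number of optimizer steps per epoch is the floor of the division.
steps_per_epoch = dataset_size // batch_size


def epoch_at(step):
    """Fractional epoch reached after the given number of steps."""
    return step / steps_per_epoch


print(steps_per_epoch)
for step in (250, 500, 750, 1000):
    print(step, round(epoch_at(step), 4))

# Old and new step budgets expressed in epochs.
print(round(epoch_at(1687), 2), round(epoch_at(1125), 2))
```

Under these assumptions the computed epochs for steps 250 to 1000 agree with the Epoch column of the training logs, and the change from 1687 to 1125 steps shortens training from roughly 2.4 to roughly 1.6 epochs.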
  ### Training Logs
  | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
  |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
+ | 0 | 0 | - | 0.0731 | 0.5887 | 0.5786 | 0.5836 |
+ | 0.3556 | 250 | 0.0821 | 0.0701 | 0.5325 | 0.5977 | 0.5651 |
+ | 0.7112 | 500 | 0.0805 | 0.0640 | 0.5523 | 0.5631 | 0.5577 |
+ | 1.0669 | 750 | 0.0712 | 0.0572 | 0.5369 | 0.5819 | 0.5594 |
+ | 1.4225 | 1000 | 0.0371 | 0.0551 | 0.5447 | 0.5924 | 0.5686 |


  ### Framework Versions