radoslavralev committed on
Commit
99e7ada
·
verified ·
1 Parent(s): 0b0ba53

Add new SentenceTransformer model

  1. README.md +135 -146
README.md CHANGED
@@ -5,42 +5,39 @@ tags:
5
  - feature-extraction
6
  - dense
7
  - generated_from_trainer
8
- - dataset_size:359997
9
  - loss:MultipleNegativesRankingLoss
10
  base_model: sentence-transformers/all-MiniLM-L6-v2
11
  widget:
12
- - source_sentence: When do you use Ms. or Mrs.? Is one for a married woman and one
13
- for one that's not married? Which one is for what?
14
  sentences:
15
- - When do you use Ms. or Mrs.? Is one for a married woman and one for one that's
16
- not married? Which one is for what?
17
- - Nations that do/does otherwise? Which one do I use?
18
- - Why don't bikes have a gear indicator?
19
- - source_sentence: Which ointment is applied to the face of UFC fighters at the commencement
20
- of a bout? What does it do?
21
  sentences:
22
- - How can I save a Snapchat video that others posted?
23
- - Which ointment is applied to the face of UFC fighters at the commencement of a
24
- bout? What does it do?
25
- - How do I get the body of a UFC Fighter?
26
- - source_sentence: Do you love the life you live?
 
27
  sentences:
28
- - Can I do shoulder and triceps workout on same day? What other combinations like
29
- this can I do?
30
- - Do you love the life you're living?
31
- - Where can you find an online TI-84 calculator?
32
- - source_sentence: Ordered food on Swiggy 3 days ago.After accepting my money, said
33
- no more on Menu! When if ever will I atleast get refund in cr card a/c?
34
  sentences:
35
- - Is getting to the Tel Aviv airport to catch a 5:30 AM flight very expensive?
36
- - How do I die and make it look like an accident?
37
- - Ordered food on Swiggy 3 days ago.After accepting my money, said no more on Menu!
38
- When if ever will I atleast get refund in cr card a/c?
39
- - source_sentence: How do you earn money on Quora?
40
  sentences:
41
- - What is a cheap healthy diet I can keep the same and eat every day?
42
- - What are some things new employees should know going into their first day at Maximus?
43
- - What is the best way to make money on Quora?
44
  pipeline_tag: sentence-similarity
45
  library_name: sentence-transformers
46
  metrics:
@@ -70,49 +67,49 @@ model-index:
70
  type: NanoMSMARCO
71
  metrics:
72
  - type: cosine_accuracy@1
73
- value: 0.22
74
  name: Cosine Accuracy@1
75
  - type: cosine_accuracy@3
76
  value: 0.5
77
  name: Cosine Accuracy@3
78
  - type: cosine_accuracy@5
79
- value: 0.62
80
  name: Cosine Accuracy@5
81
  - type: cosine_accuracy@10
82
  value: 0.74
83
  name: Cosine Accuracy@10
84
  - type: cosine_precision@1
85
- value: 0.22
86
  name: Cosine Precision@1
87
  - type: cosine_precision@3
88
- value: 0.16666666666666663
89
  name: Cosine Precision@3
90
  - type: cosine_precision@5
91
- value: 0.124
92
  name: Cosine Precision@5
93
  - type: cosine_precision@10
94
  value: 0.07400000000000001
95
  name: Cosine Precision@10
96
  - type: cosine_recall@1
97
- value: 0.22
98
  name: Cosine Recall@1
99
  - type: cosine_recall@3
100
  value: 0.5
101
  name: Cosine Recall@3
102
  - type: cosine_recall@5
103
- value: 0.62
104
  name: Cosine Recall@5
105
  - type: cosine_recall@10
106
  value: 0.74
107
  name: Cosine Recall@10
108
  - type: cosine_ndcg@10
109
- value: 0.47667177266958005
110
  name: Cosine Ndcg@10
111
  - type: cosine_mrr@10
112
- value: 0.39240476190476187
113
  name: Cosine Mrr@10
114
  - type: cosine_map@100
115
- value: 0.406991563991564
116
  name: Cosine Map@100
117
  - task:
118
  type: information-retrieval
@@ -122,49 +119,49 @@ model-index:
122
  type: NanoNQ
123
  metrics:
124
  - type: cosine_accuracy@1
125
- value: 0.28
126
  name: Cosine Accuracy@1
127
  - type: cosine_accuracy@3
128
- value: 0.46
129
  name: Cosine Accuracy@3
130
  - type: cosine_accuracy@5
131
- value: 0.56
132
  name: Cosine Accuracy@5
133
  - type: cosine_accuracy@10
134
- value: 0.64
135
  name: Cosine Accuracy@10
136
  - type: cosine_precision@1
137
- value: 0.28
138
  name: Cosine Precision@1
139
  - type: cosine_precision@3
140
- value: 0.15999999999999998
141
  name: Cosine Precision@3
142
  - type: cosine_precision@5
143
- value: 0.11600000000000002
144
  name: Cosine Precision@5
145
  - type: cosine_precision@10
146
- value: 0.066
147
  name: Cosine Precision@10
148
  - type: cosine_recall@1
149
- value: 0.27
150
  name: Cosine Recall@1
151
  - type: cosine_recall@3
152
- value: 0.45
153
  name: Cosine Recall@3
154
  - type: cosine_recall@5
155
- value: 0.54
156
  name: Cosine Recall@5
157
  - type: cosine_recall@10
158
- value: 0.61
159
  name: Cosine Recall@10
160
  - type: cosine_ndcg@10
161
- value: 0.4442430372694745
162
  name: Cosine Ndcg@10
163
  - type: cosine_mrr@10
164
- value: 0.39785714285714285
165
  name: Cosine Mrr@10
166
  - type: cosine_map@100
167
- value: 0.39869586832265574
168
  name: Cosine Map@100
169
  - task:
170
  type: nano-beir
@@ -174,49 +171,49 @@ model-index:
174
  type: NanoBEIR_mean
175
  metrics:
176
  - type: cosine_accuracy@1
177
- value: 0.25
178
  name: Cosine Accuracy@1
179
  - type: cosine_accuracy@3
180
- value: 0.48
181
  name: Cosine Accuracy@3
182
  - type: cosine_accuracy@5
183
- value: 0.5900000000000001
184
  name: Cosine Accuracy@5
185
  - type: cosine_accuracy@10
186
- value: 0.69
187
  name: Cosine Accuracy@10
188
  - type: cosine_precision@1
189
- value: 0.25
190
  name: Cosine Precision@1
191
  - type: cosine_precision@3
192
- value: 0.1633333333333333
193
  name: Cosine Precision@3
194
  - type: cosine_precision@5
195
- value: 0.12000000000000001
196
  name: Cosine Precision@5
197
  - type: cosine_precision@10
198
- value: 0.07
199
  name: Cosine Precision@10
200
  - type: cosine_recall@1
201
- value: 0.245
202
  name: Cosine Recall@1
203
  - type: cosine_recall@3
204
- value: 0.475
205
  name: Cosine Recall@3
206
  - type: cosine_recall@5
207
- value: 0.5800000000000001
208
  name: Cosine Recall@5
209
  - type: cosine_recall@10
210
- value: 0.675
211
  name: Cosine Recall@10
212
  - type: cosine_ndcg@10
213
- value: 0.46045740496952725
214
  name: Cosine Ndcg@10
215
  - type: cosine_mrr@10
216
- value: 0.39513095238095236
217
  name: Cosine Mrr@10
218
  - type: cosine_map@100
219
- value: 0.4028437161571099
220
  name: Cosine Map@100
221
  ---
222
 
@@ -270,9 +267,9 @@ from sentence_transformers import SentenceTransformer
270
  model = SentenceTransformer("redis/model-a-baseline")
271
  # Run inference
272
  sentences = [
273
- 'How do you earn money on Quora?',
274
- 'What is the best way to make money on Quora?',
275
- 'What are some things new employees should know going into their first day at Maximus?',
276
  ]
277
  embeddings = model.encode(sentences)
278
  print(embeddings.shape)
@@ -281,9 +278,9 @@ print(embeddings.shape)
281
  # Get the similarity scores for the embeddings
282
  similarities = model.similarity(embeddings, embeddings)
283
  print(similarities)
284
- # tensor([[1.0000, 0.9894, 0.0074],
285
- # [0.9894, 1.0000, 0.0136],
286
- # [0.0074, 0.0136, 1.0000]])
287
  ```
288
 
289
  <!--
@@ -319,23 +316,23 @@ You can finetune this model on your own dataset.
319
  * Datasets: `NanoMSMARCO` and `NanoNQ`
320
  * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
321
 
322
- | Metric | NanoMSMARCO | NanoNQ |
323
- |:--------------------|:------------|:-----------|
324
- | cosine_accuracy@1 | 0.22 | 0.28 |
325
- | cosine_accuracy@3 | 0.5 | 0.46 |
326
- | cosine_accuracy@5 | 0.62 | 0.56 |
327
- | cosine_accuracy@10 | 0.74 | 0.64 |
328
- | cosine_precision@1 | 0.22 | 0.28 |
329
- | cosine_precision@3 | 0.1667 | 0.16 |
330
- | cosine_precision@5 | 0.124 | 0.116 |
331
- | cosine_precision@10 | 0.074 | 0.066 |
332
- | cosine_recall@1 | 0.22 | 0.27 |
333
- | cosine_recall@3 | 0.5 | 0.45 |
334
- | cosine_recall@5 | 0.62 | 0.54 |
335
- | cosine_recall@10 | 0.74 | 0.61 |
336
- | **cosine_ndcg@10** | **0.4767** | **0.4442** |
337
- | cosine_mrr@10 | 0.3924 | 0.3979 |
338
- | cosine_map@100 | 0.407 | 0.3987 |
339
 
340
  #### Nano BEIR
341
 
@@ -353,21 +350,21 @@ You can finetune this model on your own dataset.
353
 
354
  | Metric | Value |
355
  |:--------------------|:-----------|
356
- | cosine_accuracy@1 | 0.25 |
357
- | cosine_accuracy@3 | 0.48 |
358
- | cosine_accuracy@5 | 0.59 |
359
- | cosine_accuracy@10 | 0.69 |
360
- | cosine_precision@1 | 0.25 |
361
- | cosine_precision@3 | 0.1633 |
362
- | cosine_precision@5 | 0.12 |
363
- | cosine_precision@10 | 0.07 |
364
- | cosine_recall@1 | 0.245 |
365
- | cosine_recall@3 | 0.475 |
366
- | cosine_recall@5 | 0.58 |
367
- | cosine_recall@10 | 0.675 |
368
- | **cosine_ndcg@10** | **0.4605** |
369
- | cosine_mrr@10 | 0.3951 |
370
- | cosine_map@100 | 0.4028 |
371
 
372
  <!--
373
  ## Bias, Risks and Limitations
@@ -387,19 +384,19 @@ You can finetune this model on your own dataset.
387
 
388
  #### Unnamed Dataset
389
 
390
- * Size: 359,997 training samples
391
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
392
  * Approximate statistics based on the first 1000 samples:
393
- | | anchor | positive | negative |
394
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
395
- | type | string | string | string |
396
- | details | <ul><li>min: 4 tokens</li><li>mean: 15.46 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 15.52 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.99 tokens</li><li>max: 128 tokens</li></ul> |
397
  * Samples:
398
- | anchor | positive | negative |
399
- |:--------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------|
400
- | <code>Shall I upgrade my iPhone 5s to iOS 10 final version?</code> | <code>Should I upgrade an iPhone 5s to iOS 10?</code> | <code>Whether extension of CA-articleship is to be served at same firm/company?</code> |
401
- | <code>Is Donald Trump really going to be the president of United States?</code> | <code>Do you think Donald Trump could conceivably be the next President of the United States?</code> | <code>Since solid carbon dioxide is dry ice and incredibly cold, why doesn't it have an effect on global warming?</code> |
402
- | <code>What are real tips to improve work life balance?</code> | <code>What are the best ways to create a work life balance?</code> | <code>How do you open a briefcase combination lock without the combination?</code> |
403
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
404
  ```json
405
  {
@@ -413,19 +410,19 @@ You can finetune this model on your own dataset.
413
 
414
  #### Unnamed Dataset
415
 
416
- * Size: 40,000 evaluation samples
417
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
418
  * Approximate statistics based on the first 1000 samples:
419
  | | anchor | positive | negative |
420
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
421
  | type | string | string | string |
422
- | details | <ul><li>min: 6 tokens</li><li>mean: 15.71 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.79 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 16.97 tokens</li><li>max: 78 tokens</li></ul> |
423
  * Samples:
424
- | anchor | positive | negative |
425
- |:------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------|
426
- | <code>Why were feathered dinosaur fossils only found in the last 20 years?</code> | <code>Why were feathered dinosaur fossils only found in the last 20 years?</code> | <code>Why are only few people aware that many dinosaurs had feathers?</code> |
427
- | <code>If FOX News is the conservative news station, which cable news network is for liberals/progressives?</code> | <code>If FOX News is the conservative news station, which cable news network is for liberals/progressives?</code> | <code>How much did Fox News and conservative leaning media networks stoke the anger that contributed to Donald Trump's popularity?</code> |
428
- | <code>How can guys last longer during sex?</code> | <code>How do I last longer in sex?</code> | <code>What is a permanent solution for rough and puffy hair?</code> |
429
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
430
  ```json
431
  {
@@ -443,7 +440,7 @@ You can finetune this model on your own dataset.
443
  - `per_device_eval_batch_size`: 128
444
  - `learning_rate`: 2e-05
445
  - `weight_decay`: 0.0001
446
- - `max_steps`: 5000
447
  - `warmup_ratio`: 0.1
448
  - `fp16`: True
449
  - `dataloader_drop_last`: True
@@ -477,7 +474,7 @@ You can finetune this model on your own dataset.
477
  - `adam_epsilon`: 1e-08
478
  - `max_grad_norm`: 1.0
479
  - `num_train_epochs`: 3.0
480
- - `max_steps`: 5000
481
  - `lr_scheduler_type`: linear
482
  - `lr_scheduler_kwargs`: {}
483
  - `warmup_ratio`: 0.1
@@ -584,27 +581,19 @@ You can finetune this model on your own dataset.
584
  ### Training Logs
585
  | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
586
  |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
587
- | 0 | 0 | - | 0.5501 | 0.5540 | 0.5931 | 0.5735 |
588
- | 0.0889 | 250 | 0.6218 | 0.4360 | 0.5499 | 0.5725 | 0.5612 |
589
- | 0.1778 | 500 | 0.557 | 0.4231 | 0.5414 | 0.5239 | 0.5326 |
590
- | 0.2667 | 750 | 0.5359 | 0.4146 | 0.5188 | 0.5189 | 0.5188 |
591
- | 0.3556 | 1000 | 0.5213 | 0.4095 | 0.4998 | 0.5138 | 0.5068 |
592
- | 0.4445 | 1250 | 0.51 | 0.4058 | 0.5021 | 0.4988 | 0.5005 |
593
- | 0.5334 | 1500 | 0.5086 | 0.4030 | 0.5040 | 0.4970 | 0.5005 |
594
- | 0.6223 | 1750 | 0.5031 | 0.4002 | 0.4963 | 0.4997 | 0.4980 |
595
- | 0.7112 | 2000 | 0.4964 | 0.3979 | 0.5033 | 0.4880 | 0.4956 |
596
- | 0.8001 | 2250 | 0.4927 | 0.3960 | 0.5077 | 0.4881 | 0.4979 |
597
- | 0.8890 | 2500 | 0.4925 | 0.3946 | 0.4939 | 0.4826 | 0.4882 |
598
- | 0.9780 | 2750 | 0.4889 | 0.3936 | 0.4953 | 0.4778 | 0.4865 |
599
- | 1.0669 | 3000 | 0.4819 | 0.3917 | 0.4838 | 0.4723 | 0.4781 |
600
- | 1.1558 | 3250 | 0.4798 | 0.3910 | 0.4900 | 0.4587 | 0.4743 |
601
- | 1.2447 | 3500 | 0.4773 | 0.3905 | 0.4888 | 0.4557 | 0.4723 |
602
- | 1.3336 | 3750 | 0.476 | 0.3899 | 0.4782 | 0.4512 | 0.4647 |
603
- | 1.4225 | 4000 | 0.4738 | 0.3891 | 0.4873 | 0.4508 | 0.4691 |
604
- | 1.5114 | 4250 | 0.4727 | 0.3887 | 0.4849 | 0.4464 | 0.4657 |
605
- | 1.6003 | 4500 | 0.4737 | 0.3887 | 0.4772 | 0.4482 | 0.4627 |
606
- | 1.6892 | 4750 | 0.4722 | 0.3884 | 0.4810 | 0.4432 | 0.4621 |
607
- | 1.7781 | 5000 | 0.4739 | 0.3883 | 0.4767 | 0.4442 | 0.4605 |
608
 
609
 
610
  ### Framework Versions
 
5
  - feature-extraction
6
  - dense
7
  - generated_from_trainer
8
+ - dataset_size:89998
9
  - loss:MultipleNegativesRankingLoss
10
  base_model: sentence-transformers/all-MiniLM-L6-v2
11
  widget:
12
+ - source_sentence: Indian university which follow" international education "type system?
 
13
  sentences:
14
+ - Indian university which follow" international education "type system?
15
+ - Why should we learn to play the violin?
16
+ - How can you best describe the Boston tea party?
17
+ - source_sentence: Why is it that when I write I sound like a genius, but when I have
18
+ to speak I sound stupid?
 
19
  sentences:
20
+ - Why is it that when I write I sound like a genius, but when I have to speak I
21
+ sound stupid?
22
+ - I want to send a happy birthday message to the man I love, but I don't want to
23
+ sound obsessed (we are not together). What should I write?
24
+ - Is "I really appreciate your time" correct or not?
25
+ - source_sentence: Looking dropshipper for Matcha tea?
26
  sentences:
27
+ - Why have European microstates managed to be independent (without being annexed)
28
+ in a long European history which saw lots of changing territories?
29
+ - What is the best way to decide what career to follow?
30
+ - Looking dropshipper for Matcha tea?
31
+ - source_sentence: What is the difference between Nordic and cross country skiing?
 
32
  sentences:
33
+ - What's the difference between Nordic and Classic cross-country skiing?
34
+ - 'Golf: How do I avoid topping the ball while driving?'
35
+ - What is the best TV series for learning English?
36
+ - source_sentence: Why do onions make people cry?
 
37
  sentences:
38
+ - Why do onions sting?
39
+ - What is manual transmission slipping?
40
+ - Can people with bipolar have healthy relationships?
41
  pipeline_tag: sentence-similarity
42
  library_name: sentence-transformers
43
  metrics:
 
67
  type: NanoMSMARCO
68
  metrics:
69
  - type: cosine_accuracy@1
70
+ value: 0.26
71
  name: Cosine Accuracy@1
72
  - type: cosine_accuracy@3
73
  value: 0.5
74
  name: Cosine Accuracy@3
75
  - type: cosine_accuracy@5
76
+ value: 0.6
77
  name: Cosine Accuracy@5
78
  - type: cosine_accuracy@10
79
  value: 0.74
80
  name: Cosine Accuracy@10
81
  - type: cosine_precision@1
82
+ value: 0.26
83
  name: Cosine Precision@1
84
  - type: cosine_precision@3
85
+ value: 0.16666666666666669
86
  name: Cosine Precision@3
87
  - type: cosine_precision@5
88
+ value: 0.12
89
  name: Cosine Precision@5
90
  - type: cosine_precision@10
91
  value: 0.07400000000000001
92
  name: Cosine Precision@10
93
  - type: cosine_recall@1
94
+ value: 0.26
95
  name: Cosine Recall@1
96
  - type: cosine_recall@3
97
  value: 0.5
98
  name: Cosine Recall@3
99
  - type: cosine_recall@5
100
+ value: 0.6
101
  name: Cosine Recall@5
102
  - type: cosine_recall@10
103
  value: 0.74
104
  name: Cosine Recall@10
105
  - type: cosine_ndcg@10
106
+ value: 0.48774998633566824
107
  name: Cosine Ndcg@10
108
  - type: cosine_mrr@10
109
+ value: 0.4093333333333333
110
  name: Cosine Mrr@10
111
  - type: cosine_map@100
112
+ value: 0.4245357678657921
113
  name: Cosine Map@100
114
  - task:
115
  type: information-retrieval
 
119
  type: NanoNQ
120
  metrics:
121
  - type: cosine_accuracy@1
122
+ value: 0.34
123
  name: Cosine Accuracy@1
124
  - type: cosine_accuracy@3
125
+ value: 0.48
126
  name: Cosine Accuracy@3
127
  - type: cosine_accuracy@5
128
+ value: 0.6
129
  name: Cosine Accuracy@5
130
  - type: cosine_accuracy@10
131
+ value: 0.68
132
  name: Cosine Accuracy@10
133
  - type: cosine_precision@1
134
+ value: 0.34
135
  name: Cosine Precision@1
136
  - type: cosine_precision@3
137
+ value: 0.16666666666666663
138
  name: Cosine Precision@3
139
  - type: cosine_precision@5
140
+ value: 0.124
141
  name: Cosine Precision@5
142
  - type: cosine_precision@10
143
+ value: 0.07400000000000001
144
  name: Cosine Precision@10
145
  - type: cosine_recall@1
146
+ value: 0.32
147
  name: Cosine Recall@1
148
  - type: cosine_recall@3
149
+ value: 0.46
150
  name: Cosine Recall@3
151
  - type: cosine_recall@5
152
+ value: 0.57
153
  name: Cosine Recall@5
154
  - type: cosine_recall@10
155
+ value: 0.67
156
  name: Cosine Recall@10
157
  - type: cosine_ndcg@10
158
+ value: 0.4959822522649102
159
  name: Cosine Ndcg@10
160
  - type: cosine_mrr@10
161
+ value: 0.447095238095238
162
  name: Cosine Mrr@10
163
  - type: cosine_map@100
164
+ value: 0.4450391558194697
165
  name: Cosine Map@100
166
  - task:
167
  type: nano-beir
 
171
  type: NanoBEIR_mean
172
  metrics:
173
  - type: cosine_accuracy@1
174
+ value: 0.30000000000000004
175
  name: Cosine Accuracy@1
176
  - type: cosine_accuracy@3
177
+ value: 0.49
178
  name: Cosine Accuracy@3
179
  - type: cosine_accuracy@5
180
+ value: 0.6
181
  name: Cosine Accuracy@5
182
  - type: cosine_accuracy@10
183
+ value: 0.71
184
  name: Cosine Accuracy@10
185
  - type: cosine_precision@1
186
+ value: 0.30000000000000004
187
  name: Cosine Precision@1
188
  - type: cosine_precision@3
189
+ value: 0.16666666666666666
190
  name: Cosine Precision@3
191
  - type: cosine_precision@5
192
+ value: 0.122
193
  name: Cosine Precision@5
194
  - type: cosine_precision@10
195
+ value: 0.07400000000000001
196
  name: Cosine Precision@10
197
  - type: cosine_recall@1
198
+ value: 0.29000000000000004
199
  name: Cosine Recall@1
200
  - type: cosine_recall@3
201
+ value: 0.48
202
  name: Cosine Recall@3
203
  - type: cosine_recall@5
204
+ value: 0.585
205
  name: Cosine Recall@5
206
  - type: cosine_recall@10
207
+ value: 0.7050000000000001
208
  name: Cosine Recall@10
209
  - type: cosine_ndcg@10
210
+ value: 0.49186611930028923
211
  name: Cosine Ndcg@10
212
  - type: cosine_mrr@10
213
+ value: 0.42821428571428566
214
  name: Cosine Mrr@10
215
  - type: cosine_map@100
216
+ value: 0.4347874618426309
217
  name: Cosine Map@100
218
  ---
219
 
 
267
  model = SentenceTransformer("redis/model-a-baseline")
268
  # Run inference
269
  sentences = [
270
+ 'Why do onions make people cry?',
271
+ 'Why do onions sting?',
272
+ 'Can people with bipolar have healthy relationships?',
273
  ]
274
  embeddings = model.encode(sentences)
275
  print(embeddings.shape)
 
278
  # Get the similarity scores for the embeddings
279
  similarities = model.similarity(embeddings, embeddings)
280
  print(similarities)
281
+ # tensor([[ 1.0000, 0.7596, 0.0004],
282
+ # [ 0.7596, 1.0000, -0.0846],
283
+ # [ 0.0004, -0.0846, 1.0000]])
284
  ```
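
The similarity matrix printed by the usage snippet is pairwise cosine similarity, assuming the model's similarity function is left at the Sentence Transformers default. A minimal NumPy sketch of that computation follows. The helper `cosine_similarity_matrix` and the toy vectors are hypothetical illustrations, not part of the model card or real model output.

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms  # scale every row to unit length
    return unit @ unit.T       # dot products of unit vectors are cosines

# Hypothetical toy vectors standing in for sentence embeddings.
toy = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
sims = cosine_similarity_matrix(toy)
print(np.round(sims, 4))
```

The diagonal is always 1 because each vector is compared with itself, and the matrix is symmetric, which matches the shape of the tensor printed in the snippet.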
285
 
286
  <!--
 
316
  * Datasets: `NanoMSMARCO` and `NanoNQ`
317
  * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
318
 
319
+ | Metric | NanoMSMARCO | NanoNQ |
320
+ |:--------------------|:------------|:----------|
321
+ | cosine_accuracy@1 | 0.26 | 0.34 |
322
+ | cosine_accuracy@3 | 0.5 | 0.48 |
323
+ | cosine_accuracy@5 | 0.6 | 0.6 |
324
+ | cosine_accuracy@10 | 0.74 | 0.68 |
325
+ | cosine_precision@1 | 0.26 | 0.34 |
326
+ | cosine_precision@3 | 0.1667 | 0.1667 |
327
+ | cosine_precision@5 | 0.12 | 0.124 |
328
+ | cosine_precision@10 | 0.074 | 0.074 |
329
+ | cosine_recall@1 | 0.26 | 0.32 |
330
+ | cosine_recall@3 | 0.5 | 0.46 |
331
+ | cosine_recall@5 | 0.6 | 0.57 |
332
+ | cosine_recall@10 | 0.74 | 0.67 |
333
+ | **cosine_ndcg@10** | **0.4877** | **0.496** |
334
+ | cosine_mrr@10 | 0.4093 | 0.4471 |
335
+ | cosine_map@100 | 0.4245 | 0.445 |
336
 
337
  #### Nano BEIR
338
 
 
350
 
351
  | Metric | Value |
352
  |:--------------------|:-----------|
353
+ | cosine_accuracy@1 | 0.3 |
354
+ | cosine_accuracy@3 | 0.49 |
355
+ | cosine_accuracy@5 | 0.6 |
356
+ | cosine_accuracy@10 | 0.71 |
357
+ | cosine_precision@1 | 0.3 |
358
+ | cosine_precision@3 | 0.1667 |
359
+ | cosine_precision@5 | 0.122 |
360
+ | cosine_precision@10 | 0.074 |
361
+ | cosine_recall@1 | 0.29 |
362
+ | cosine_recall@3 | 0.48 |
363
+ | cosine_recall@5 | 0.585 |
364
+ | cosine_recall@10 | 0.705 |
365
+ | **cosine_ndcg@10** | **0.4919** |
366
+ | cosine_mrr@10 | 0.4282 |
367
+ | cosine_map@100 | 0.4348 |
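
The `NanoBEIR_mean` values appear to be the unweighted mean of the `NanoMSMARCO` and `NanoNQ` scores. That reading is inferred from the numbers in this diff rather than stated in it, and the check below uses the `cosine_ndcg@10` values copied from the updated metadata.

```python
from statistics import mean

# cosine_ndcg@10 values copied from the updated model-index metadata.
ndcg_at_10 = {
    "NanoMSMARCO": 0.48774998633566824,
    "NanoNQ": 0.4959822522649102,
}

# Assumption: the aggregate is a plain, unweighted mean over the datasets.
nano_beir_mean = mean(ndcg_at_10.values())
print(round(nano_beir_mean, 4))
```

The rounded result agrees with the `0.4919` reported in the table, so the aggregate row can be cross-checked against the per-dataset rows whenever the card is regenerated.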
368
 
369
  <!--
370
  ## Bias, Risks and Limitations
 
384
 
385
  #### Unnamed Dataset
386
 
387
+ * Size: 89,998 training samples
388
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
389
  * Approximate statistics based on the first 1000 samples:
390
+ | | anchor | positive | negative |
391
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
392
+ | type | string | string | string |
393
+ | details | <ul><li>min: 5 tokens</li><li>mean: 15.61 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 15.72 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 16.55 tokens</li><li>max: 67 tokens</li></ul> |
394
  * Samples:
395
+ | anchor | positive | negative |
396
+ |:-----------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------|
397
+ | <code>How long did it take to develop Pokémon GO?</code> | <code>How long did it take to develop Pokémon GO?</code> | <code>Can I take more than one gym in Pokémon GO?</code> |
398
+ | <code>What is the best gift you've received?</code> | <code>What is the best tangible gift you've ever received?</code> | <code>Where can I download Chaayam Poosiya Veedu (The Painted House) malayalam movie for free?</code> |
399
+ | <code>Why should I bother writing/editing a Wikipedia article when it can be overwritten by anyone?</code> | <code>Why should I bother writing/editing a Wikipedia article when it can be overwritten by anyone?</code> | <code>When I write a chapter, after I finish editing it, it is way too short. How can I lengthen a chapter?</code> |
400
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
401
  ```json
402
  {
 
410
 
411
  #### Unnamed Dataset
412
 
413
+ * Size: 10,000 evaluation samples
414
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
415
  * Approximate statistics based on the first 1000 samples:
416
  | | anchor | positive | negative |
417
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
418
  | type | string | string | string |
419
+ | details | <ul><li>min: 3 tokens</li><li>mean: 15.75 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 15.86 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.66 tokens</li><li>max: 74 tokens</li></ul> |
420
  * Samples:
421
+ | anchor | positive | negative |
422
+ |:--------------------------------------------------------------------------|:--------------------------------------------------------------------------|:--------------------------------------------------------------------|
423
+ | <code>What's it like working in IT for Goldman Sachs?</code> | <code>What's it like working in IT for Goldman Sachs?</code> | <code>What is the work done at Goldman Sachs?</code> |
424
+ | <code>How did Revan build his foundation of his army in Star Wars?</code> | <code>How did Revan build his foundation of his army in Star Wars?</code> | <code>What Star Wars character deserves his/her own movie?</code> |
425
+ | <code>Is C++ the best programming language to learn first?</code> | <code>Is C++ the best programming language to learn first?</code> | <code>Which programming language is the best to learn first?</code> |
426
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
427
  ```json
428
  {
 
440
  - `per_device_eval_batch_size`: 128
441
  - `learning_rate`: 2e-05
442
  - `weight_decay`: 0.0001
443
+ - `max_steps`: 3000
444
  - `warmup_ratio`: 0.1
445
  - `fp16`: True
446
  - `dataloader_drop_last`: True
 
474
  - `adam_epsilon`: 1e-08
475
  - `max_grad_norm`: 1.0
476
  - `num_train_epochs`: 3.0
477
+ - `max_steps`: 3000
478
  - `lr_scheduler_type`: linear
479
  - `lr_scheduler_kwargs`: {}
480
  - `warmup_ratio`: 0.1
 
581
  ### Training Logs
582
  | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
583
  |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
584
+ | 0 | 0 | - | 0.5439 | 0.5540 | 0.5931 | 0.5735 |
585
+ | 0.3556 | 250 | 0.61 | 0.4258 | 0.5310 | 0.5623 | 0.5466 |
586
+ | 0.7112 | 500 | 0.5484 | 0.4127 | 0.5289 | 0.5387 | 0.5338 |
587
+ | 1.0669 | 750 | 0.5286 | 0.4054 | 0.5110 | 0.5322 | 0.5216 |
588
+ | 1.4225 | 1000 | 0.5138 | 0.4005 | 0.5065 | 0.5266 | 0.5165 |
589
+ | 1.7781 | 1250 | 0.508 | 0.3972 | 0.4863 | 0.5172 | 0.5018 |
590
+ | 2.1337 | 1500 | 0.4986 | 0.3955 | 0.4837 | 0.5191 | 0.5014 |
591
+ | 2.4893 | 1750 | 0.4936 | 0.3933 | 0.4908 | 0.5175 | 0.5041 |
592
+ | 2.8450 | 2000 | 0.4896 | 0.3920 | 0.4867 | 0.4974 | 0.4920 |
593
+ | 3.2006 | 2250 | 0.486 | 0.3910 | 0.4820 | 0.4963 | 0.4891 |
594
+ | 3.5562 | 2500 | 0.482 | 0.3903 | 0.4814 | 0.4961 | 0.4887 |
595
+ | 3.9118 | 2750 | 0.481 | 0.3897 | 0.4877 | 0.4956 | 0.4917 |
596
+ | 4.2674 | 3000 | 0.4798 | 0.3894 | 0.4877 | 0.4960 | 0.4919 |
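
The Epoch column can be reproduced from the step count and the dataset size. The sketch below assumes an effective train batch size of 128 on a single device with no gradient accumulation. That value is not shown in this diff and is inferred from the logged epoch fractions. `dataloader_drop_last: True` does appear in the hyperparameters. The helper `epoch_at_step` is hypothetical.

```python
def epoch_at_step(step: int, dataset_size: int, batch_size: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    # With drop_last enabled, the final partial batch of each epoch is discarded.
    steps_per_epoch = dataset_size // batch_size
    return step / steps_per_epoch

# Assumed batch size of 128; dataset sizes and max_steps are taken from the diff.
new_run = epoch_at_step(3000, 89_998, 128)   # updated run
old_run = epoch_at_step(5000, 359_997, 128)  # previous run
print(round(new_run, 4), round(old_run, 4))
```

Under these assumptions the new run passes over its 89,998 samples a little more than four times, even though `num_train_epochs` is listed as 3.0, because `max_steps` determines when training stops.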
 
 
 
 
 
 
 
 
597
 
598
 
599
  ### Framework Versions