radoslavralev committed on
Commit 4fed24b · verified · 1 Parent(s): feea729

Add new SentenceTransformer model

Files changed (1)
  1. README.md +202 -138
README.md CHANGED
@@ -5,39 +5,103 @@ tags:
5
  - feature-extraction
6
  - dense
7
  - generated_from_trainer
8
- - dataset_size:89998
9
  - loss:MultipleNegativesRankingLoss
10
  base_model: thenlper/gte-small
11
  widget:
12
- - source_sentence: Indian university which follow" international education "type system?
13
  sentences:
14
- - Indian university which follow" international education "type system?
15
- - Why should we learn to play the violin?
16
- - How can you best describe the Boston tea party?
17
- - source_sentence: Why is it that when I write I sound like a genius, but when I have
18
- to speak I sound stupid?
19
  sentences:
20
- - Why is it that when I write I sound like a genius, but when I have to speak I
21
- sound stupid?
22
- - I want to send a happy birthday message to the man I love, but I don't want to
23
- sound obsessed (we are not together). What should I write?
24
- - Is "I really appreciate your time" correct or not?
25
- - source_sentence: Looking dropshipper for Matcha tea?
26
  sentences:
27
- - Why have European microstates managed to be independent (without being annexed)
28
- in a long European history which saw lots of changing territories?
29
- - What is the best way to decide what career to follow?
30
- - Looking dropshipper for Matcha tea?
31
- - source_sentence: What is the difference between Nordic and cross country skiing?
32
  sentences:
33
- - What's the difference between Nordic and Classic cross-country skiing?
34
- - 'Golf: How do I avoid topping the ball while driving?'
35
- - What is the best TV series for learning English?
36
- - source_sentence: Why do onions make people cry?
37
  sentences:
38
- - Why do onions sting?
39
- - What is manual transmission slipping?
40
- - Can people with bipolar have healthy relationships?
41
  pipeline_tag: sentence-similarity
42
  library_name: sentence-transformers
43
  metrics:
@@ -67,49 +131,49 @@ model-index:
67
  type: NanoMSMARCO
68
  metrics:
69
  - type: cosine_accuracy@1
70
- value: 0.36
71
  name: Cosine Accuracy@1
72
  - type: cosine_accuracy@3
73
- value: 0.6
74
  name: Cosine Accuracy@3
75
  - type: cosine_accuracy@5
76
- value: 0.68
77
  name: Cosine Accuracy@5
78
  - type: cosine_accuracy@10
79
- value: 0.78
80
  name: Cosine Accuracy@10
81
  - type: cosine_precision@1
82
- value: 0.36
83
  name: Cosine Precision@1
84
  - type: cosine_precision@3
85
- value: 0.2
86
  name: Cosine Precision@3
87
  - type: cosine_precision@5
88
- value: 0.136
89
  name: Cosine Precision@5
90
  - type: cosine_precision@10
91
- value: 0.07800000000000001
92
  name: Cosine Precision@10
93
  - type: cosine_recall@1
94
- value: 0.36
95
  name: Cosine Recall@1
96
  - type: cosine_recall@3
97
- value: 0.6
98
  name: Cosine Recall@3
99
  - type: cosine_recall@5
100
- value: 0.68
101
  name: Cosine Recall@5
102
  - type: cosine_recall@10
103
- value: 0.78
104
  name: Cosine Recall@10
105
  - type: cosine_ndcg@10
106
- value: 0.5686788105462819
107
  name: Cosine Ndcg@10
108
  - type: cosine_mrr@10
109
- value: 0.5018888888888889
110
  name: Cosine Mrr@10
111
  - type: cosine_map@100
112
- value: 0.5110826036192063
113
  name: Cosine Map@100
114
  - task:
115
  type: information-retrieval
@@ -119,49 +183,49 @@ model-index:
119
  type: NanoNQ
120
  metrics:
121
  - type: cosine_accuracy@1
122
- value: 0.36
123
  name: Cosine Accuracy@1
124
  - type: cosine_accuracy@3
125
- value: 0.66
126
  name: Cosine Accuracy@3
127
  - type: cosine_accuracy@5
128
- value: 0.68
129
  name: Cosine Accuracy@5
130
  - type: cosine_accuracy@10
131
- value: 0.76
132
  name: Cosine Accuracy@10
133
  - type: cosine_precision@1
134
- value: 0.36
135
  name: Cosine Precision@1
136
  - type: cosine_precision@3
137
- value: 0.22
138
  name: Cosine Precision@3
139
  - type: cosine_precision@5
140
- value: 0.14
141
  name: Cosine Precision@5
142
  - type: cosine_precision@10
143
- value: 0.08
144
  name: Cosine Precision@10
145
  - type: cosine_recall@1
146
- value: 0.33
147
  name: Cosine Recall@1
148
  - type: cosine_recall@3
149
- value: 0.62
150
  name: Cosine Recall@3
151
  - type: cosine_recall@5
152
- value: 0.65
153
  name: Cosine Recall@5
154
  - type: cosine_recall@10
155
- value: 0.72
156
  name: Cosine Recall@10
157
  - type: cosine_ndcg@10
158
- value: 0.547217901995397
159
  name: Cosine Ndcg@10
160
  - type: cosine_mrr@10
161
- value: 0.5098571428571428
162
  name: Cosine Mrr@10
163
  - type: cosine_map@100
164
- value: 0.4872849044614519
165
  name: Cosine Map@100
166
  - task:
167
  type: nano-beir
@@ -171,49 +235,49 @@ model-index:
171
  type: NanoBEIR_mean
172
  metrics:
173
  - type: cosine_accuracy@1
174
- value: 0.36
175
  name: Cosine Accuracy@1
176
  - type: cosine_accuracy@3
177
- value: 0.63
178
  name: Cosine Accuracy@3
179
  - type: cosine_accuracy@5
180
- value: 0.68
181
  name: Cosine Accuracy@5
182
  - type: cosine_accuracy@10
183
- value: 0.77
184
  name: Cosine Accuracy@10
185
  - type: cosine_precision@1
186
- value: 0.36
187
  name: Cosine Precision@1
188
  - type: cosine_precision@3
189
- value: 0.21000000000000002
190
  name: Cosine Precision@3
191
  - type: cosine_precision@5
192
- value: 0.138
193
  name: Cosine Precision@5
194
  - type: cosine_precision@10
195
- value: 0.07900000000000001
196
  name: Cosine Precision@10
197
  - type: cosine_recall@1
198
- value: 0.345
199
  name: Cosine Recall@1
200
  - type: cosine_recall@3
201
- value: 0.61
202
  name: Cosine Recall@3
203
  - type: cosine_recall@5
204
- value: 0.665
205
  name: Cosine Recall@5
206
  - type: cosine_recall@10
207
- value: 0.75
208
  name: Cosine Recall@10
209
  - type: cosine_ndcg@10
210
- value: 0.5579483562708394
211
  name: Cosine Ndcg@10
212
  - type: cosine_mrr@10
213
- value: 0.5058730158730158
214
  name: Cosine Mrr@10
215
  - type: cosine_map@100
216
- value: 0.4991837540403291
217
  name: Cosine Map@100
218
  ---
219
 
@@ -267,9 +331,9 @@ from sentence_transformers import SentenceTransformer
267
  model = SentenceTransformer("redis/model-a-baseline")
268
  # Run inference
269
  sentences = [
270
- 'Why do onions make people cry?',
271
- 'Why do onions sting?',
272
- 'Can people with bipolar have healthy relationships?',
273
  ]
274
  embeddings = model.encode(sentences)
275
  print(embeddings.shape)
@@ -278,9 +342,9 @@ print(embeddings.shape)
278
  # Get the similarity scores for the embeddings
279
  similarities = model.similarity(embeddings, embeddings)
280
  print(similarities)
281
- # tensor([[1.0000, 0.7041, 0.1992],
282
- # [0.7041, 1.0000, 0.0598],
283
- # [0.1992, 0.0598, 1.0000]])
284
  ```
285
 
286
  <!--
@@ -318,21 +382,21 @@ You can finetune this model on your own dataset.
318
 
319
  | Metric | NanoMSMARCO | NanoNQ |
320
  |:--------------------|:------------|:-----------|
321
- | cosine_accuracy@1 | 0.36 | 0.36 |
322
- | cosine_accuracy@3 | 0.6 | 0.66 |
323
- | cosine_accuracy@5 | 0.68 | 0.68 |
324
- | cosine_accuracy@10 | 0.78 | 0.76 |
325
- | cosine_precision@1 | 0.36 | 0.36 |
326
- | cosine_precision@3 | 0.2 | 0.22 |
327
- | cosine_precision@5 | 0.136 | 0.14 |
328
- | cosine_precision@10 | 0.078 | 0.08 |
329
- | cosine_recall@1 | 0.36 | 0.33 |
330
- | cosine_recall@3 | 0.6 | 0.62 |
331
- | cosine_recall@5 | 0.68 | 0.65 |
332
- | cosine_recall@10 | 0.78 | 0.72 |
333
- | **cosine_ndcg@10** | **0.5687** | **0.5472** |
334
- | cosine_mrr@10 | 0.5019 | 0.5099 |
335
- | cosine_map@100 | 0.5111 | 0.4873 |
336
 
337
  #### Nano BEIR
338
 
@@ -350,21 +414,21 @@ You can finetune this model on your own dataset.
350
 
351
  | Metric | Value |
352
  |:--------------------|:-----------|
353
- | cosine_accuracy@1 | 0.36 |
354
- | cosine_accuracy@3 | 0.63 |
355
- | cosine_accuracy@5 | 0.68 |
356
- | cosine_accuracy@10 | 0.77 |
357
- | cosine_precision@1 | 0.36 |
358
- | cosine_precision@3 | 0.21 |
359
- | cosine_precision@5 | 0.138 |
360
- | cosine_precision@10 | 0.079 |
361
- | cosine_recall@1 | 0.345 |
362
- | cosine_recall@3 | 0.61 |
363
- | cosine_recall@5 | 0.665 |
364
- | cosine_recall@10 | 0.75 |
365
- | **cosine_ndcg@10** | **0.5579** |
366
- | cosine_mrr@10 | 0.5059 |
367
- | cosine_map@100 | 0.4992 |
368
 
369
  <!--
370
  ## Bias, Risks and Limitations
@@ -384,19 +448,19 @@ You can finetune this model on your own dataset.
384
 
385
  #### Unnamed Dataset
386
 
387
- * Size: 89,998 training samples
388
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
389
  * Approximate statistics based on the first 1000 samples:
390
- | | anchor | positive | negative |
391
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
392
- | type | string | string | string |
393
- | details | <ul><li>min: 5 tokens</li><li>mean: 15.61 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 15.72 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 16.55 tokens</li><li>max: 67 tokens</li></ul> |
394
  * Samples:
395
- | anchor | positive | negative |
396
- |:-----------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------|
397
- | <code>How long did it take to develop Pokémon GO?</code> | <code>How long did it take to develop Pokémon GO?</code> | <code>Can I take more than one gym in Pokémon GO?</code> |
398
- | <code>What is the best gift you've received?</code> | <code>What is the best tangible gift you've ever received?</code> | <code>Where can I download Chaayam Poosiya Veedu (The Painted House) malayalam movie for free?</code> |
399
- | <code>Why should I bother writing/editing a Wikipedia article when it can be overwritten by anyone?</code> | <code>Why should I bother writing/editing a Wikipedia article when it can be overwritten by anyone?</code> | <code>When I write a chapter, after I finish editing it, it is way too short. How can I lengthen a chapter?</code> |
400
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
401
  ```json
402
  {
@@ -413,16 +477,16 @@ You can finetune this model on your own dataset.
413
  * Size: 10,000 evaluation samples
414
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
415
  * Approximate statistics based on the first 1000 samples:
416
- | | anchor | positive | negative |
417
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
418
- | type | string | string | string |
419
- | details | <ul><li>min: 3 tokens</li><li>mean: 15.75 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 15.86 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.66 tokens</li><li>max: 74 tokens</li></ul> |
420
  * Samples:
421
- | anchor | positive | negative |
422
- |:--------------------------------------------------------------------------|:--------------------------------------------------------------------------|:--------------------------------------------------------------------|
423
- | <code>What's it like working in IT for Goldman Sachs?</code> | <code>What's it like working in IT for Goldman Sachs?</code> | <code>What is the work done at Goldman Sachs?</code> |
424
- | <code>How did Revan build his foundation of his army in Star Wars?</code> | <code>How did Revan build his foundation of his army in Star Wars?</code> | <code>What Star Wars character deserves his/her own movie?</code> |
425
- | <code>Is C++ the best programming language to learn first?</code> | <code>Is C++ the best programming language to learn first?</code> | <code>Which programming language is the best to learn first?</code> |
426
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
427
  ```json
428
  {
@@ -581,19 +645,19 @@ You can finetune this model on your own dataset.
581
  ### Training Logs
582
  | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
583
  |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
584
- | 0 | 0 | - | 3.6614 | 0.6259 | 0.6583 | 0.6421 |
585
- | 0.3556 | 250 | 3.8825 | 3.4013 | 0.6200 | 0.6575 | 0.6388 |
586
- | 0.7112 | 500 | 3.3083 | 2.1977 | 0.6287 | 0.6387 | 0.6337 |
587
- | 1.0669 | 750 | 1.7439 | 0.6392 | 0.5543 | 0.5530 | 0.5537 |
588
- | 1.4225 | 1000 | 0.8977 | 0.5267 | 0.5526 | 0.5274 | 0.5400 |
589
- | 1.7781 | 1250 | 0.7869 | 0.5083 | 0.5426 | 0.5390 | 0.5408 |
590
- | 2.1337 | 1500 | 0.7442 | 0.4991 | 0.5412 | 0.5482 | 0.5447 |
591
- | 2.4893 | 1750 | 0.7213 | 0.4941 | 0.5553 | 0.5484 | 0.5518 |
592
- | 2.8450 | 2000 | 0.7054 | 0.4872 | 0.5635 | 0.5506 | 0.5571 |
593
- | 3.2006 | 2250 | 0.6943 | 0.4863 | 0.5696 | 0.5503 | 0.5599 |
594
- | 3.5562 | 2500 | 0.6864 | 0.4839 | 0.5681 | 0.5472 | 0.5576 |
595
- | 3.9118 | 2750 | 0.6851 | 0.4832 | 0.5687 | 0.5472 | 0.5579 |
596
- | 4.2674 | 3000 | 0.6825 | 0.4825 | 0.5687 | 0.5472 | 0.5579 |
597
 
598
 
599
  ### Framework Versions
 
5
  - feature-extraction
6
  - dense
7
  - generated_from_trainer
8
+ - dataset_size:90000
9
  - loss:MultipleNegativesRankingLoss
10
  base_model: thenlper/gte-small
11
  widget:
12
+ - source_sentence: what is the maximum i can contribute to a traditional ira
13
  sentences:
14
+ - With Roth IRAs, there are no age restrictions for contributions. Investors age
15
+ 50 and older can contribute $5,500 for 2015, plus a catch-up contribution of $1,000
16
+ for a total maximum possible IRA contribution of $6,500.
17
+ - Classically, squamous epithelia are found lining surfaces utilizing simple passive
18
+ diffusion such as the alveolar epithelium in the lungs. Specialized squamous epithelia
19
+ also form the lining of cavities such as the blood vessels (endothelium) and pericardium
20
+ (mesothelium) and the major cavities found within the body.lassically, squamous
21
+ epithelia are found lining surfaces utilizing simple passive diffusion such as
22
+ the alveolar epithelium in the lungs. Specialized squamous epithelia also form
23
+ the lining of cavities such as the blood vessels (endothelium) and pericardium
24
+ (mesothelium) and the major cavities found within the body.
25
+ - What is a Roth IRA? How is a Roth IRA different from a regular IRA? What are the
26
+ advantages of the Roth version? Who can contribute to a Roth IRA? When can I take
27
+ money out of a Roth? When do I have to withdraw money from a Roth? Which is better
28
+ for me, a Roth or traditional IRA?
29
+ - source_sentence: what is diameter
30
  sentences:
31
+ - The diameter of a circle is the length of the line through the center and touching
32
+ two points on its edge. In the figure above, drag the orange dots around and see
33
+ that the diameter never changes. Sometimes the word 'diameter' is used to refer
34
+ to the line itself.
35
+ - "If you know the radius of the circle, double it to get the diameter. The radius\
36
+ \ is the distance from the center of the circle to its edge. For example, if the\
37
+ \ radius of the circle is 4 cm, then the diameter of the circle is 4 cm x 2, or\
38
+ \ 8 cm. 2. If you know the circumference of the circle, divide it by π to\
39
+ \ get the diameter. π is equal to approximately 3.14 but you should use your\
40
+ \ calculator to get the most accurate results."
41
+ - By Tony Griffitts. A denitrator is a biological filter that removes nitrate (NO
42
+ 3) from the aquarium. A denitrator filter uses anaerobic bacteria to brake down
43
+ nitrate into nitrogen gas (N 2), which escapes into the atmosphere, the result
44
+ is nitrate free effluent.hen you first set up the filter run the aquarium water
45
+ through it at a swift rate for about 2 week. This will give bacteria time to colonize
46
+ the filter. After 2 weeks cut down the filter flow rate to about a drop or two
47
+ a second. At this time, start to add about 5ml of bacteria food a day to the filter.
48
+ - source_sentence: how do ovaries produce hormones
49
  sentences:
50
+ - 'The hormones from the brain control the levels of estrogen and progesterone released
51
+ by the female reproductive system, leading to the events of the ovarian cycle:
52
+ 1 A follicle starts to grow and begins producing the hormone estrogen.'
53
+ - "Hey, it’s like riding a bike! You have the GPX file on your computer,\
54
+ \ and you just need to move or transfer it over to the Garmin device. To do this,\
55
+ \ follow these steps: Connect the Garmin to the computer with a USB cable. Check\
56
+ \ that you can “see” the device, plus its memory card (if there’s\
57
+ \ one installed).In Windows, the best way to do this is to double-click the\
58
+ \ My Computer icon on your desktop.opy your GPX file into the NewFiles folder.\
59
+ \ To do this, just drag and drop the file from your computer to the NewFiles window.\
60
+ \ Or copy and paste it, whichever is easier and quicker for you. I know some people\
61
+ \ can get a bit stuck on this part, and it’s often overlooked."
62
+ - But women also have testosterone. The ovaries produce both testosterone and estrogen.
63
+ Relatively small quantities of testosterone are released into your bloodstream
64
+ by the ovaries and adrenal glands. In addition to being produced by the ovaries,
65
+ estrogen is also produced by the body's fat tissue. These sex hormones are involved
66
+ in the growth, maintenance, and repair of reproductive tissues. But that's not
67
+ all. They also influence other body tissues and bone mass.
68
+ - source_sentence: weather in floyds knobs indiana
69
  sentences:
70
+ - "Weekly Weather Report for 47119, Floyds Knobs, Indiana. Looking at the weather\
71
+ \ in 47119, Floyds Knobs, Indiana over the next 7 days, the maximum temperature\
72
+ \ will be 11℃ (or 52℉) on Friday 9 th February at around 2 pm.\
73
+ \ In the same week the minimum temperature will be -8℃ (or 18℉\
74
+ ) on Thursday 8 th February at around 8 am."
75
+ - Central States Weather News - from the National Weather Service. Indiana StateInformation
76
+ - compiled by NOAA. Indiana State Police - Weather Related Road Conditions (seasonal)
77
+ National Weather Service - Local | Indiana | Hoosier Weather | IN Data. NOAA Weather
78
+ Radio - NOAA station in Richmond, Indiana is 162.500, (KHB52, formerly WXJ46).
79
+ - Lisinopril is a drug of the angiotensin-converting enzyme (ACE) inhibitor class
80
+ used primarily in treatment of high blood pressure, heart failure, and after heart
81
+ attacks. It is also used for preventing kidney and eye complications in people
82
+ with diabetes. Its indications, contraindications, and side effects are as those
83
+ for all ACE inhibitors. Lisinopril was the third ACE inhibitor (after captopril
84
+ and enalapril) and was introduced into therapy in the early 1990s.
85
+ - source_sentence: what congressional district is cambridge, ohio in
86
  sentences:
87
+ - Ohio (OH) - 43725. 1 As of the 2010 census, zip code 43725 is located in Congressional
88
+ District 6, OH. 2 Approximately 33.9% of 43725's population lives in a low income
89
+ household, or a household with an annual income of less than $25,000.This is a
90
+ low percentage of low income households for Cambridge, but a high percentage for
91
+ Guernsey County.
92
+ - The Freedom Riders, who were recruited by the Congress of Racial Equality (CORE),
93
+ a U.S. civil rights group, departed from Washington, D.C., and attempted to integrate
94
+ facilities at bus terminals along the way into the Deep South.he 1961 Freedom
95
+ Rides sought to test a 1960 decision by the Supreme Court in Boynton v. Virginia
96
+ that segregation of interstate transportation facilities, including bus terminals,
97
+ was unconstitutional as well.
98
+ - Pennsylvania's 3rd congressional district has been represented by Republican Mike
99
+ Kelly since January 2011. He ran unopposed in the Republican primary. Missa Eaton,
100
+ Sharon resident and president of Democrat Women of Mercer County, ran unopposed
101
+ in the Democratic primary.emocrats Mark Critz, who has represented Pennsylvania's
102
+ 12th congressional district since 2010; and Jason Altmire, who has represented
103
+ Pennsylvania's 4th congressional district since 2007, both sought re-election
104
+ in the new 12th district.
105
  pipeline_tag: sentence-similarity
106
  library_name: sentence-transformers
107
  metrics:
 
131
  type: NanoMSMARCO
132
  metrics:
133
  - type: cosine_accuracy@1
134
+ value: 0.32
135
  name: Cosine Accuracy@1
136
  - type: cosine_accuracy@3
137
+ value: 0.56
138
  name: Cosine Accuracy@3
139
  - type: cosine_accuracy@5
140
+ value: 0.62
141
  name: Cosine Accuracy@5
142
  - type: cosine_accuracy@10
143
+ value: 0.74
144
  name: Cosine Accuracy@10
145
  - type: cosine_precision@1
146
+ value: 0.32
147
  name: Cosine Precision@1
148
  - type: cosine_precision@3
149
+ value: 0.18666666666666668
150
  name: Cosine Precision@3
151
  - type: cosine_precision@5
152
+ value: 0.124
153
  name: Cosine Precision@5
154
  - type: cosine_precision@10
155
+ value: 0.07400000000000001
156
  name: Cosine Precision@10
157
  - type: cosine_recall@1
158
+ value: 0.32
159
  name: Cosine Recall@1
160
  - type: cosine_recall@3
161
+ value: 0.56
162
  name: Cosine Recall@3
163
  - type: cosine_recall@5
164
+ value: 0.62
165
  name: Cosine Recall@5
166
  - type: cosine_recall@10
167
+ value: 0.74
168
  name: Cosine Recall@10
169
  - type: cosine_ndcg@10
170
+ value: 0.522258803283639
171
  name: Cosine Ndcg@10
172
  - type: cosine_mrr@10
173
+ value: 0.4534126984126985
174
  name: Cosine Mrr@10
175
  - type: cosine_map@100
176
+ value: 0.4638252994171221
177
  name: Cosine Map@100
178
  - task:
179
  type: information-retrieval
 
183
  type: NanoNQ
184
  metrics:
185
  - type: cosine_accuracy@1
186
+ value: 0.32
187
  name: Cosine Accuracy@1
188
  - type: cosine_accuracy@3
189
+ value: 0.54
190
  name: Cosine Accuracy@3
191
  - type: cosine_accuracy@5
192
+ value: 0.54
193
  name: Cosine Accuracy@5
194
  - type: cosine_accuracy@10
195
+ value: 0.74
196
  name: Cosine Accuracy@10
197
  - type: cosine_precision@1
198
+ value: 0.32
199
  name: Cosine Precision@1
200
  - type: cosine_precision@3
201
+ value: 0.18666666666666665
202
  name: Cosine Precision@3
203
  - type: cosine_precision@5
204
+ value: 0.11600000000000002
205
  name: Cosine Precision@5
206
  - type: cosine_precision@10
207
+ value: 0.07800000000000001
208
  name: Cosine Precision@10
209
  - type: cosine_recall@1
210
+ value: 0.29
211
  name: Cosine Recall@1
212
  - type: cosine_recall@3
213
+ value: 0.52
214
  name: Cosine Recall@3
215
  - type: cosine_recall@5
216
+ value: 0.53
217
  name: Cosine Recall@5
218
  - type: cosine_recall@10
219
+ value: 0.71
220
  name: Cosine Recall@10
221
  - type: cosine_ndcg@10
222
+ value: 0.5000975716574514
223
  name: Cosine Ndcg@10
224
  - type: cosine_mrr@10
225
+ value: 0.44801587301587303
226
  name: Cosine Mrr@10
227
  - type: cosine_map@100
228
+ value: 0.435720694785091
229
  name: Cosine Map@100
230
  - task:
231
  type: nano-beir
 
235
  type: NanoBEIR_mean
236
  metrics:
237
  - type: cosine_accuracy@1
238
+ value: 0.32
239
  name: Cosine Accuracy@1
240
  - type: cosine_accuracy@3
241
+ value: 0.55
242
  name: Cosine Accuracy@3
243
  - type: cosine_accuracy@5
244
+ value: 0.5800000000000001
245
  name: Cosine Accuracy@5
246
  - type: cosine_accuracy@10
247
+ value: 0.74
248
  name: Cosine Accuracy@10
249
  - type: cosine_precision@1
250
+ value: 0.32
251
  name: Cosine Precision@1
252
  - type: cosine_precision@3
253
+ value: 0.18666666666666665
254
  name: Cosine Precision@3
255
  - type: cosine_precision@5
256
+ value: 0.12000000000000001
257
  name: Cosine Precision@5
258
  - type: cosine_precision@10
259
+ value: 0.07600000000000001
260
  name: Cosine Precision@10
261
  - type: cosine_recall@1
262
+ value: 0.305
263
  name: Cosine Recall@1
264
  - type: cosine_recall@3
265
+ value: 0.54
266
  name: Cosine Recall@3
267
  - type: cosine_recall@5
268
+ value: 0.575
269
  name: Cosine Recall@5
270
  - type: cosine_recall@10
271
+ value: 0.725
272
  name: Cosine Recall@10
273
  - type: cosine_ndcg@10
274
+ value: 0.5111781874705452
275
  name: Cosine Ndcg@10
276
  - type: cosine_mrr@10
277
+ value: 0.45071428571428573
278
  name: Cosine Mrr@10
279
  - type: cosine_map@100
280
+ value: 0.4497729971011065
281
  name: Cosine Map@100
282
  ---
283
 
 
331
  model = SentenceTransformer("redis/model-a-baseline")
332
  # Run inference
333
  sentences = [
334
+ 'what congressional district is cambridge, ohio in',
335
+ "Ohio (OH) - 43725. 1 As of the 2010 census, zip code 43725 is located in Congressional District 6, OH. 2 Approximately 33.9% of 43725's population lives in a low income household, or a household with an annual income of less than $25,000.This is a low percentage of low income households for Cambridge, but a high percentage for Guernsey County.",
336
+ "Pennsylvania's 3rd congressional district has been represented by Republican Mike Kelly since January 2011. He ran unopposed in the Republican primary. Missa Eaton, Sharon resident and president of Democrat Women of Mercer County, ran unopposed in the Democratic primary.emocrats Mark Critz, who has represented Pennsylvania's 12th congressional district since 2010; and Jason Altmire, who has represented Pennsylvania's 4th congressional district since 2007, both sought re-election in the new 12th district.",
337
  ]
338
  embeddings = model.encode(sentences)
339
  print(embeddings.shape)
 
342
  # Get the similarity scores for the embeddings
343
  similarities = model.similarity(embeddings, embeddings)
344
  print(similarities)
345
+ # tensor([[1.0000, 0.5348, 0.4287],
346
+ # [0.5348, 1.0000, 0.3212],
347
+ # [0.4287, 0.3212, 1.0000]])
348
  ```
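The printed similarity matrix can be read as a ranking of the two passages for the query in row 0. Below is a minimal sketch that reuses the printed scores as literals instead of calling the model; `rank_passages` is a hypothetical helper for illustration, not part of sentence-transformers.

```python
# Similarity scores copied from the printed tensor in the updated README
# (row 0 is the query, rows 1 and 2 are the two passages).
similarities = [
    [1.0000, 0.5348, 0.4287],
    [0.5348, 1.0000, 0.3212],
    [0.4287, 0.3212, 1.0000],
]


def rank_passages(sim_matrix, query_index=0):
    """Hypothetical helper: indices of the other rows, most similar first."""
    scores = sim_matrix[query_index]
    candidates = [i for i in range(len(scores)) if i != query_index]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)


ranking = rank_passages(similarities)
print(ranking)
```

With these scores the Ohio zip-code passage (index 1) ranks ahead of the Pennsylvania passage (index 2), although the margin (0.5348 vs. 0.4287) is fairly small.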
349
 
350
  <!--
 
382
 
383
  | Metric | NanoMSMARCO | NanoNQ |
384
  |:--------------------|:------------|:-----------|
385
+ | cosine_accuracy@1 | 0.32 | 0.32 |
386
+ | cosine_accuracy@3 | 0.56 | 0.54 |
387
+ | cosine_accuracy@5 | 0.62 | 0.54 |
388
+ | cosine_accuracy@10 | 0.74 | 0.74 |
389
+ | cosine_precision@1 | 0.32 | 0.32 |
390
+ | cosine_precision@3 | 0.1867 | 0.1867 |
391
+ | cosine_precision@5 | 0.124 | 0.116 |
392
+ | cosine_precision@10 | 0.074 | 0.078 |
393
+ | cosine_recall@1 | 0.32 | 0.29 |
394
+ | cosine_recall@3 | 0.56 | 0.52 |
395
+ | cosine_recall@5 | 0.62 | 0.53 |
396
+ | cosine_recall@10 | 0.74 | 0.71 |
397
+ | **cosine_ndcg@10** | **0.5223** | **0.5001** |
398
+ | cosine_mrr@10 | 0.4534 | 0.448 |
399
+ | cosine_map@100 | 0.4638 | 0.4357 |
400
 
401
  #### Nano BEIR
402
 
 
414
 
415
  | Metric | Value |
416
  |:--------------------|:-----------|
417
+ | cosine_accuracy@1 | 0.32 |
418
+ | cosine_accuracy@3 | 0.55 |
419
+ | cosine_accuracy@5 | 0.58 |
420
+ | cosine_accuracy@10 | 0.74 |
421
+ | cosine_precision@1 | 0.32 |
422
+ | cosine_precision@3 | 0.1867 |
423
+ | cosine_precision@5 | 0.12 |
424
+ | cosine_precision@10 | 0.076 |
425
+ | cosine_recall@1 | 0.305 |
426
+ | cosine_recall@3 | 0.54 |
427
+ | cosine_recall@5 | 0.575 |
428
+ | cosine_recall@10 | 0.725 |
429
+ | **cosine_ndcg@10** | **0.5112** |
430
+ | cosine_mrr@10 | 0.4507 |
431
+ | cosine_map@100 | 0.4498 |
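The NanoBEIR values match the arithmetic mean of the NanoMSMARCO and NanoNQ values. The check below uses the unrounded numbers from the updated `model-index` metadata in this diff; the dictionary layout is only illustrative.

```python
# (NanoMSMARCO, NanoNQ) pairs copied from the updated model-index metadata.
per_dataset = {
    "cosine_ndcg@10": (0.522258803283639, 0.5000975716574514),
    "cosine_mrr@10": (0.4534126984126985, 0.44801587301587303),
    "cosine_map@100": (0.4638252994171221, 0.435720694785091),
}

# NanoBEIR_mean values reported in the same metadata.
reported_mean = {
    "cosine_ndcg@10": 0.5111781874705452,
    "cosine_mrr@10": 0.45071428571428573,
    "cosine_map@100": 0.4497729971011065,
}

computed_mean = {
    name: sum(values) / len(values) for name, values in per_dataset.items()
}

for name in per_dataset:
    print(name, round(computed_mean[name], 4), round(reported_mean[name], 4))
```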
432
 
433
  <!--
434
  ## Bias, Risks and Limitations
 
448
 
449
  #### Unnamed Dataset
450
 
451
+ * Size: 90,000 training samples
452
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
453
  * Approximate statistics based on the first 1000 samples:
454
+ | | anchor | positive | negative |
455
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
456
+ | type | string | string | string |
457
+ | details | <ul><li>min: 4 tokens</li><li>mean: 9.18 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 78.75 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 77.97 tokens</li><li>max: 128 tokens</li></ul> |
458
  * Samples:
459
+ | anchor | positive | negative |
460
+ |:--------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
461
+ | <code>pomp definition</code> | <code>Pomp is a ceremonial display, such as you'd find at the Independence Day parade in your town, where brass bands and men and women in full military dress march to patriotic songs, while citizens wave flags and cheer.</code> | <code>Pom-poms are shaken by cheerleaders, Pom or Dance teams and sports fans during spectator sports. Small decorative pom-poms may be attached to clothing; these are sometimes called toories or bobbles. Pom-pom is derived from the French word pompon, which refers to a small decorative ball made of fabric or feathers. pair of cheerleading pom-poms. A pom-pom – also spelled pom-pon, pompom or pompon – is a loose, fluffy, decorative ball or tuft of fibrous material. Pom-poms may come in many colors, sizes, and varieties and are made from a wide array of materials, including wool, cotton, paper, plastic, and occasionally feathers.</code> |
462
+ | <code>what is the definition of recompense</code> | <code>verb. To recompense is to pay someone back or make amends to someone for some loss. An example of recompense is when a shoplifter gives money to the person from whom he stole.</code> | <code>Split and merge into it. Answered by The Community. Making the world better, one answer at a time. Recuse is a legal term used when a person disqualifies oneself (as a judge) in a legal case due to a potential prejudice or partiality. Example: The judge recused himself from that case, citing a possible conflict of interest. Excuse is to release a person from an obligation or duty. Example: The gentleman is excused from jury duty as his serving would cause a hardship for his family.</code> |
+ | <code>kashubian language pronunciation</code> | <code>Kashubian is a member of the West Slavic group of Slavic languages with about 200,000 speakers and used as an everyday language by about 53,000 people.ashubian (kaszebsczi kaszëbsczi). Jazek jãzëk kashubian is a member Of The west slavic Group of slavic languages 200,000 about 200000 speakers and used as an everyday language 53,000 about. 53000 people</code> | <code>[ syll. ko-to-ko, kot-oko ] The baby girl name Kotoko is pronounced as KAHT OW Kow †. Kotoko's origin and use are both in the Japanese language.Kotoko is a form of the Japanese name Koto. Kotoko is irregularly used as a baby girl name.aby names that sound like Kotoko include Kadea, Kadeejah, Kadeesha, Kadeija, Kadeja, Kadesha, Kadeshia, Kadesia, Kadessa, Kadiesha, Kadija (African, Arabic, English, and Swahili), Kadijah (African, Arabic, and English), Kadisha, Kadya, Kadyja, Kadysha, Kaitaka, Kathia, Kathya, and Katica (Czech, Hungarian, and Slavic).</code> |
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
 
  * Size: 10,000 evaluation samples
  * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
  * Approximate statistics based on the first 1000 samples:
+ | | anchor | positive | negative |
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 9.18 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 78.83 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 76.86 tokens</li><li>max: 128 tokens</li></ul> |
  * Samples:
+ | anchor | positive | negative |
+ |:----------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>what is the pantanal</code> | <code>The Pantanal is a region in South America lying mostly in Western Brazil but extending into Bolivia as well. It is considered one of the world's largest and most diverse freshwater wetland ecosystems. The Pantanal is also one of Brazil's major tourist draws, for its wildlife.he Pantanal is a region in South America lying mostly in Western Brazil but extending into Bolivia as well. It is considered one of the world's largest and most diverse freshwater wetland ecosystems. The Pantanal is also one of Brazil's major tourist draws, for its wildlife.</code> | <code>Lantana is a large plant genus with about 150 species of flowering plants, which are perennial and native to the West Indies, according to the University of Florida. Lantanas are grown for their attractive clusters of small, multicolored or single-colored flowers and for their medicinal uses.</code> |
+ | <code>radon mitigation cost</code> | <code>1 Ground water wells can also be tested for radon, then, if needed, can get a water radon mitigation system installed. 2 Installing a water radon mitigation system runs from $1,000-$4,500, and maintenance runs $0 to $150 annually. 3 To find out about radon content in a city water system, call the local water provider. It's wise to retest a home's radon level every year or two after a mitigation system is installed. 2 Most radon mitigation systems include a fan, which will need to be replaced about every 5 years. 3 Expect to pay $250-$300 to have this necessary maintenance done.</code> | <code>You have tested your home for radon and confirmed that you have elevated radon levels — 4 picocuries per liter (pCi/L) or higher. The EPA recommends that you take action to reduce your home's radon levels if your radon test result is 4 pCi/L or higher.High radon levels can be reduced through mitigation. CLICK HERE to order a test kit. 1 Select a licensed or certified radon mitigation contractor to reduce the radon levels.2 Have mitigation contractor determine appropriate radon reduction method.igh radon levels can be reduced through mitigation. CLICK HERE to order a test kit. 1 Select a licensed or certified radon mitigation contractor to reduce the radon levels. 2 Have mitigation contractor determine appropriate radon reduction method.</code> |
+ | <code>how many calories is an einstein bagel</code> | <code>1 Calories In Einstein Bagel Strawberry Balsamic Vinaigrette. 144 calories, 14g fat, 5g carbs, 0g protein, 1g fiber. Calories In Einstein Bagel Strawberry Chicken Salad. 397 calories, 10g fat, 20g carbs, 56g protein, 4g fiber.</code> | <code>There are 110 calories in a 1 bagel serving of Thomas' Bagel Thins - 100% Whole Wheat. Calorie breakdown: 7% fat, 74% carbs, 19% protein.</code> |
  * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
 
  ### Training Logs
  | Epoch  | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
  |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
+ | 0      | 0    | -             | 4.1735          | 0.6259                     | 0.6583                | 0.6421                       |
+ | 0.3556 | 250  | 4.3112        | 3.9428          | 0.6062                     | 0.6473                | 0.6268                       |
+ | 0.7112 | 500  | 3.8846        | 3.1810          | 0.5977                     | 0.6251                | 0.6114                       |
+ | 1.0669 | 750  | 2.9479        | 1.8567          | 0.5628                     | 0.5082                | 0.5355                       |
+ | 1.4225 | 1000 | 1.9026        | 1.2819          | 0.5229                     | 0.4809                | 0.5019                       |
+ | 1.7781 | 1250 | 1.5448        | 1.1522          | 0.5112                     | 0.4845                | 0.4978                       |
+ | 2.1337 | 1500 | 1.4335        | 1.1065          | 0.5183                     | 0.4899                | 0.5041                       |
+ | 2.4893 | 1750 | 1.3748        | 1.0820          | 0.5231                     | 0.4896                | 0.5064                       |
+ | 2.8450 | 2000 | 1.3445        | 1.0667          | 0.5184                     | 0.4962                | 0.5073                       |
+ | 3.2006 | 2250 | 1.3173        | 1.0577          | 0.5231                     | 0.4971                | 0.5101                       |
+ | 3.5562 | 2500 | 1.3099        | 1.0517          | 0.5165                     | 0.4999                | 0.5082                       |
+ | 3.9118 | 2750 | 1.2945        | 1.0483          | 0.5218                     | 0.4999                | 0.5108                       |
+ | 4.2674 | 3000 | 1.2880        | 1.0476          | 0.5223                     | 0.5001                | 0.5112                       |
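
The sketch below checks how the `NanoBEIR_mean_cosine_ndcg@10` column relates to the two per-dataset columns. It assumes, as the column name suggests, that the mean is the plain arithmetic average of the NanoMSMARCO and NanoNQ scores; the card does not state this, so the code tests the assumption against the logged values. The names `rows`, `mean_ndcg`, `max_gap`, and `best_step` are illustrative only and are not part of any library.

```python
# Rows copied from the Training Logs table:
# (step, NanoMSMARCO ndcg@10, NanoNQ ndcg@10, reported NanoBEIR mean ndcg@10)
rows = [
    (0, 0.6259, 0.6583, 0.6421),
    (250, 0.6062, 0.6473, 0.6268),
    (500, 0.5977, 0.6251, 0.6114),
    (750, 0.5628, 0.5082, 0.5355),
    (1000, 0.5229, 0.4809, 0.5019),
    (1250, 0.5112, 0.4845, 0.4978),
    (1500, 0.5183, 0.4899, 0.5041),
    (1750, 0.5231, 0.4896, 0.5064),
    (2000, 0.5184, 0.4962, 0.5073),
    (2250, 0.5231, 0.4971, 0.5101),
    (2500, 0.5165, 0.4999, 0.5082),
    (2750, 0.5218, 0.4999, 0.5108),
    (3000, 0.5223, 0.5001, 0.5112),
]


def mean_ndcg(msmarco: float, nq: float) -> float:
    """Arithmetic mean of the two per-dataset scores (assumed aggregation)."""
    return (msmarco + nq) / 2


# Largest difference between the recomputed and the reported mean.
# The logged values are rounded to 4 decimals, so small gaps are expected.
max_gap = max(abs(mean_ndcg(m, n) - reported) for _, m, n, reported in rows)

# Step with the highest reported mean NDCG@10 in the table.
best_step = max(rows, key=lambda row: row[3])[0]

print(f"largest gap to reported mean: {max_gap:.5f}")
print(f"step with highest reported mean: {best_step}")
```

A gap below the rounding precision of the table is consistent with the averaging assumption. The same loop also reports which logged step has the highest mean, which is a quick way to compare checkpoints against the step-0 baseline row.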
 
 
  ### Framework Versions