Add new SentenceTransformer model
README.md
CHANGED
|
@@ -5,39 +5,103 @@ tags:
|
|
| 5 |
- feature-extraction
|
| 6 |
- dense
|
| 7 |
- generated_from_trainer
|
| 8 |
-
- dataset_size:
|
| 9 |
- loss:MultipleNegativesRankingLoss
|
| 10 |
base_model: thenlper/gte-small
|
| 11 |
widget:
|
| 12 |
-
- source_sentence:
|
| 13 |
sentences:
|
| 14 |
-
-
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
-
|
| 18 |
-
|
| 19 |
sentences:
|
| 20 |
-
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
|
| 24 |
-
-
|
| 25 |
-
|
| 26 |
sentences:
|
| 27 |
-
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
-
|
| 31 |
-
|
| 32 |
sentences:
|
| 33 |
-
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
sentences:
|
| 38 |
-
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
pipeline_tag: sentence-similarity
|
| 42 |
library_name: sentence-transformers
|
| 43 |
metrics:
|
|
@@ -67,49 +131,49 @@ model-index:
|
|
| 67 |
type: NanoMSMARCO
|
| 68 |
metrics:
|
| 69 |
- type: cosine_accuracy@1
|
| 70 |
-
value: 0.
|
| 71 |
name: Cosine Accuracy@1
|
| 72 |
- type: cosine_accuracy@3
|
| 73 |
-
value: 0.
|
| 74 |
name: Cosine Accuracy@3
|
| 75 |
- type: cosine_accuracy@5
|
| 76 |
-
value: 0.
|
| 77 |
name: Cosine Accuracy@5
|
| 78 |
- type: cosine_accuracy@10
|
| 79 |
-
value: 0.
|
| 80 |
name: Cosine Accuracy@10
|
| 81 |
- type: cosine_precision@1
|
| 82 |
-
value: 0.
|
| 83 |
name: Cosine Precision@1
|
| 84 |
- type: cosine_precision@3
|
| 85 |
-
value: 0.
|
| 86 |
name: Cosine Precision@3
|
| 87 |
- type: cosine_precision@5
|
| 88 |
-
value: 0.
|
| 89 |
name: Cosine Precision@5
|
| 90 |
- type: cosine_precision@10
|
| 91 |
-
value: 0.
|
| 92 |
name: Cosine Precision@10
|
| 93 |
- type: cosine_recall@1
|
| 94 |
-
value: 0.
|
| 95 |
name: Cosine Recall@1
|
| 96 |
- type: cosine_recall@3
|
| 97 |
-
value: 0.
|
| 98 |
name: Cosine Recall@3
|
| 99 |
- type: cosine_recall@5
|
| 100 |
-
value: 0.
|
| 101 |
name: Cosine Recall@5
|
| 102 |
- type: cosine_recall@10
|
| 103 |
-
value: 0.
|
| 104 |
name: Cosine Recall@10
|
| 105 |
- type: cosine_ndcg@10
|
| 106 |
-
value: 0.
|
| 107 |
name: Cosine Ndcg@10
|
| 108 |
- type: cosine_mrr@10
|
| 109 |
-
value: 0.
|
| 110 |
name: Cosine Mrr@10
|
| 111 |
- type: cosine_map@100
|
| 112 |
-
value: 0.
|
| 113 |
name: Cosine Map@100
|
| 114 |
- task:
|
| 115 |
type: information-retrieval
|
|
@@ -119,49 +183,49 @@ model-index:
|
|
| 119 |
type: NanoNQ
|
| 120 |
metrics:
|
| 121 |
- type: cosine_accuracy@1
|
| 122 |
-
value: 0.
|
| 123 |
name: Cosine Accuracy@1
|
| 124 |
- type: cosine_accuracy@3
|
| 125 |
-
value: 0.
|
| 126 |
name: Cosine Accuracy@3
|
| 127 |
- type: cosine_accuracy@5
|
| 128 |
-
value: 0.
|
| 129 |
name: Cosine Accuracy@5
|
| 130 |
- type: cosine_accuracy@10
|
| 131 |
-
value: 0.
|
| 132 |
name: Cosine Accuracy@10
|
| 133 |
- type: cosine_precision@1
|
| 134 |
-
value: 0.
|
| 135 |
name: Cosine Precision@1
|
| 136 |
- type: cosine_precision@3
|
| 137 |
-
value: 0.
|
| 138 |
name: Cosine Precision@3
|
| 139 |
- type: cosine_precision@5
|
| 140 |
-
value: 0.
|
| 141 |
name: Cosine Precision@5
|
| 142 |
- type: cosine_precision@10
|
| 143 |
-
value: 0.
|
| 144 |
name: Cosine Precision@10
|
| 145 |
- type: cosine_recall@1
|
| 146 |
-
value: 0.
|
| 147 |
name: Cosine Recall@1
|
| 148 |
- type: cosine_recall@3
|
| 149 |
-
value: 0.
|
| 150 |
name: Cosine Recall@3
|
| 151 |
- type: cosine_recall@5
|
| 152 |
-
value: 0.
|
| 153 |
name: Cosine Recall@5
|
| 154 |
- type: cosine_recall@10
|
| 155 |
-
value: 0.
|
| 156 |
name: Cosine Recall@10
|
| 157 |
- type: cosine_ndcg@10
|
| 158 |
-
value: 0.
|
| 159 |
name: Cosine Ndcg@10
|
| 160 |
- type: cosine_mrr@10
|
| 161 |
-
value: 0.
|
| 162 |
name: Cosine Mrr@10
|
| 163 |
- type: cosine_map@100
|
| 164 |
-
value: 0.
|
| 165 |
name: Cosine Map@100
|
| 166 |
- task:
|
| 167 |
type: nano-beir
|
|
@@ -171,49 +235,49 @@ model-index:
|
|
| 171 |
type: NanoBEIR_mean
|
| 172 |
metrics:
|
| 173 |
- type: cosine_accuracy@1
|
| 174 |
-
value: 0.
|
| 175 |
name: Cosine Accuracy@1
|
| 176 |
- type: cosine_accuracy@3
|
| 177 |
-
value: 0.
|
| 178 |
name: Cosine Accuracy@3
|
| 179 |
- type: cosine_accuracy@5
|
| 180 |
-
value: 0.
|
| 181 |
name: Cosine Accuracy@5
|
| 182 |
- type: cosine_accuracy@10
|
| 183 |
-
value: 0.
|
| 184 |
name: Cosine Accuracy@10
|
| 185 |
- type: cosine_precision@1
|
| 186 |
-
value: 0.
|
| 187 |
name: Cosine Precision@1
|
| 188 |
- type: cosine_precision@3
|
| 189 |
-
value: 0.
|
| 190 |
name: Cosine Precision@3
|
| 191 |
- type: cosine_precision@5
|
| 192 |
-
value: 0.
|
| 193 |
name: Cosine Precision@5
|
| 194 |
- type: cosine_precision@10
|
| 195 |
-
value: 0.
|
| 196 |
name: Cosine Precision@10
|
| 197 |
- type: cosine_recall@1
|
| 198 |
-
value: 0.
|
| 199 |
name: Cosine Recall@1
|
| 200 |
- type: cosine_recall@3
|
| 201 |
-
value: 0.
|
| 202 |
name: Cosine Recall@3
|
| 203 |
- type: cosine_recall@5
|
| 204 |
-
value: 0.
|
| 205 |
name: Cosine Recall@5
|
| 206 |
- type: cosine_recall@10
|
| 207 |
-
value: 0.
|
| 208 |
name: Cosine Recall@10
|
| 209 |
- type: cosine_ndcg@10
|
| 210 |
-
value: 0.
|
| 211 |
name: Cosine Ndcg@10
|
| 212 |
- type: cosine_mrr@10
|
| 213 |
-
value: 0.
|
| 214 |
name: Cosine Mrr@10
|
| 215 |
- type: cosine_map@100
|
| 216 |
-
value: 0.
|
| 217 |
name: Cosine Map@100
|
| 218 |
---
|
| 219 |
|
|
@@ -267,9 +331,9 @@ from sentence_transformers import SentenceTransformer
|
|
| 267 |
model = SentenceTransformer("redis/model-a-baseline")
|
| 268 |
# Run inference
|
| 269 |
sentences = [
|
| 270 |
-
'
|
| 271 |
-
'
|
| 272 |
-
'
|
| 273 |
]
|
| 274 |
embeddings = model.encode(sentences)
|
| 275 |
print(embeddings.shape)
|
|
@@ -278,9 +342,9 @@ print(embeddings.shape)
|
|
| 278 |
# Get the similarity scores for the embeddings
|
| 279 |
similarities = model.similarity(embeddings, embeddings)
|
| 280 |
print(similarities)
|
| 281 |
-
# tensor([[1.0000, 0.
|
| 282 |
-
# [0.
|
| 283 |
-
# [0.
|
| 284 |
```
|
| 285 |
|
| 286 |
<!--
|
|
@@ -318,21 +382,21 @@ You can finetune this model on your own dataset.
|
|
| 318 |
|
| 319 |
| Metric | NanoMSMARCO | NanoNQ |
|
| 320 |
|:--------------------|:------------|:-----------|
|
| 321 |
-
| cosine_accuracy@1 | 0.
|
| 322 |
-
| cosine_accuracy@3 | 0.
|
| 323 |
-
| cosine_accuracy@5 | 0.
|
| 324 |
-
| cosine_accuracy@10 | 0.
|
| 325 |
-
| cosine_precision@1 | 0.
|
| 326 |
-
| cosine_precision@3 | 0.
|
| 327 |
-
| cosine_precision@5 | 0.
|
| 328 |
-
| cosine_precision@10 | 0.
|
| 329 |
-
| cosine_recall@1 | 0.
|
| 330 |
-
| cosine_recall@3 | 0.
|
| 331 |
-
| cosine_recall@5 | 0.
|
| 332 |
-
| cosine_recall@10 | 0.
|
| 333 |
-
| **cosine_ndcg@10** | **0.
|
| 334 |
-
| cosine_mrr@10 | 0.
|
| 335 |
-
| cosine_map@100 | 0.
|
| 336 |
|
| 337 |
#### Nano BEIR
|
| 338 |
|
|
@@ -350,21 +414,21 @@ You can finetune this model on your own dataset.
|
|
| 350 |
|
| 351 |
| Metric | Value |
|
| 352 |
|:--------------------|:-----------|
|
| 353 |
-
| cosine_accuracy@1 | 0.
|
| 354 |
-
| cosine_accuracy@3 | 0.
|
| 355 |
-
| cosine_accuracy@5 | 0.
|
| 356 |
-
| cosine_accuracy@10 | 0.
|
| 357 |
-
| cosine_precision@1 | 0.
|
| 358 |
-
| cosine_precision@3 | 0.
|
| 359 |
-
| cosine_precision@5 | 0.
|
| 360 |
-
| cosine_precision@10 | 0.
|
| 361 |
-
| cosine_recall@1 | 0.
|
| 362 |
-
| cosine_recall@3 | 0.
|
| 363 |
-
| cosine_recall@5 | 0.
|
| 364 |
-
| cosine_recall@10 | 0.
|
| 365 |
-
| **cosine_ndcg@10** | **0.
|
| 366 |
-
| cosine_mrr@10 | 0.
|
| 367 |
-
| cosine_map@100 | 0.
|
| 368 |
|
| 369 |
<!--
|
| 370 |
## Bias, Risks and Limitations
|
|
@@ -384,19 +448,19 @@ You can finetune this model on your own dataset.
|
|
| 384 |
|
| 385 |
#### Unnamed Dataset
|
| 386 |
|
| 387 |
-
* Size:
|
| 388 |
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
|
| 389 |
* Approximate statistics based on the first 1000 samples:
|
| 390 |
-
| | anchor
|
| 391 |
-
|
| 392 |
-
| type | string
|
| 393 |
-
| details | <ul><li>min:
|
| 394 |
* Samples:
|
| 395 |
-
| anchor
|
| 396 |
-
|
| 397 |
-
| <code>
|
| 398 |
-
| <code>
|
| 399 |
-
| <code>
|
| 400 |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
| 401 |
```json
|
| 402 |
{
|
|
@@ -413,16 +477,16 @@ You can finetune this model on your own dataset.
|
|
| 413 |
* Size: 10,000 evaluation samples
|
| 414 |
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
|
| 415 |
* Approximate statistics based on the first 1000 samples:
|
| 416 |
-
| | anchor
|
| 417 |
-
|
| 418 |
-
| type | string
|
| 419 |
-
| details | <ul><li>min:
|
| 420 |
* Samples:
|
| 421 |
-
| anchor
|
| 422 |
-
|
| 423 |
-
| <code>
|
| 424 |
-
| <code>
|
| 425 |
-
| <code>
|
| 426 |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
| 427 |
```json
|
| 428 |
{
|
|
@@ -581,19 +645,19 @@ You can finetune this model on your own dataset.
|
|
| 581 |
### Training Logs
|
| 582 |
| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
|
| 583 |
|:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
|
| 584 |
-
| 0 | 0 | - |
|
| 585 |
-
| 0.3556 | 250 |
|
| 586 |
-
| 0.7112 | 500 | 3.
|
| 587 |
-
| 1.0669 | 750 |
|
| 588 |
-
| 1.4225 | 1000 |
|
| 589 |
-
| 1.7781 | 1250 |
|
| 590 |
-
| 2.1337 | 1500 |
|
| 591 |
-
| 2.4893 | 1750 |
|
| 592 |
-
| 2.8450 | 2000 |
|
| 593 |
-
| 3.2006 | 2250 |
|
| 594 |
-
| 3.5562 | 2500 |
|
| 595 |
-
| 3.9118 | 2750 |
|
| 596 |
-
| 4.2674 | 3000 |
|
| 597 |
|
| 598 |
|
| 599 |
### Framework Versions
|
|
|
|
| 5 |
- feature-extraction
|
| 6 |
- dense
|
| 7 |
- generated_from_trainer
|
| 8 |
+
- dataset_size:90000
|
| 9 |
- loss:MultipleNegativesRankingLoss
|
| 10 |
base_model: thenlper/gte-small
|
| 11 |
widget:
|
| 12 |
+
- source_sentence: what is the maximum i can contribute to a traditional ira
|
| 13 |
sentences:
|
| 14 |
+
- With Roth IRAs, there are no age restrictions for contributions. Investors age
|
| 15 |
+
50 and older can contribute $5,500 for 2015, plus a catch-up contribution of $1,000
|
| 16 |
+
for a total maximum possible IRA contribution of $6,500.
|
| 17 |
+
- Classically, squamous epithelia are found lining surfaces utilizing simple passive
|
| 18 |
+
diffusion such as the alveolar epithelium in the lungs. Specialized squamous epithelia
|
| 19 |
+
also form the lining of cavities such as the blood vessels (endothelium) and pericardium
|
| 20 |
+
(mesothelium) and the major cavities found within the body.lassically, squamous
|
| 21 |
+
epithelia are found lining surfaces utilizing simple passive diffusion such as
|
| 22 |
+
the alveolar epithelium in the lungs. Specialized squamous epithelia also form
|
| 23 |
+
the lining of cavities such as the blood vessels (endothelium) and pericardium
|
| 24 |
+
(mesothelium) and the major cavities found within the body.
|
| 25 |
+
- What is a Roth IRA? How is a Roth IRA different from a regular IRA? What are the
|
| 26 |
+
advantages of the Roth version? Who can contribute to a Roth IRA? When can I take
|
| 27 |
+
money out of a Roth? When do I have to withdraw money from a Roth? Which is better
|
| 28 |
+
for me, a Roth or traditional IRA?
|
| 29 |
+
- source_sentence: what is diameter
|
| 30 |
sentences:
|
| 31 |
+
- The diameter of a circle is the length of the line through the center and touching
|
| 32 |
+
two points on its edge. In the figure above, drag the orange dots around and see
|
| 33 |
+
that the diameter never changes. Sometimes the word 'diameter' is used to refer
|
| 34 |
+
to the line itself.
|
| 35 |
+
- "If you know the radius of the circle, double it to get the diameter. The radius\
|
| 36 |
+
\ is the distance from the center of the circle to its edge. For example, if the\
|
| 37 |
+
\ radius of the circle is 4 cm, then the diameter of the circle is 4 cm x 2, or\
|
| 38 |
+
\ 8 cm. 2. If you know the circumference of the circle, divide it by π to\
|
| 39 |
+
\ get the diameter. π is equal to approximately 3.14 but you should use your\
|
| 40 |
+
\ calculator to get the most accurate results."
|
| 41 |
+
- By Tony Griffitts. A denitrator is a biological filter that removes nitrate (NO
|
| 42 |
+
3) from the aquarium. A denitrator filter uses anaerobic bacteria to brake down
|
| 43 |
+
nitrate into nitrogen gas (N 2), which escapes into the atmosphere, the result
|
| 44 |
+
is nitrate free effluent.hen you first set up the filter run the aquarium water
|
| 45 |
+
through it at a swift rate for about 2 week. This will give bacteria time to colonize
|
| 46 |
+
the filter. After 2 weeks cut down the filter flow rate to about a drop or two
|
| 47 |
+
a second. At this time, start to add about 5ml of bacteria food a day to the filter.
|
| 48 |
+
- source_sentence: how do ovaries produce hormones
|
| 49 |
sentences:
|
| 50 |
+
- 'The hormones from the brain control the levels of estrogen and progesterone released
|
| 51 |
+
by the female reproductive system, leading to the events of the ovarian cycle:
|
| 52 |
+
1 A follicle starts to grow and begins producing the hormone estrogen.'
|
| 53 |
+
- "Hey, it’s like riding a bike! You have the GPX file on your computer,\
|
| 54 |
+
\ and you just need to move or transfer it over to the Garmin device. To do this,\
|
| 55 |
+
\ follow these steps: Connect the Garmin to the computer with a USB cable. Check\
|
| 56 |
+
\ that you can “see” the device, plus its memory card (if there’s\
|
| 57 |
+
\ one installed).In Windows, the best way to do this is to double-click the\
|
| 58 |
+
\ My Computer icon on your desktop.opy your GPX file into the NewFiles folder.\
|
| 59 |
+
\ To do this, just drag and drop the file from your computer to the NewFiles window.\
|
| 60 |
+
\ Or copy and paste it, whichever is easier and quicker for you. I know some people\
|
| 61 |
+
\ can get a bit stuck on this part, and it’s often overlooked."
|
| 62 |
+
- But women also have testosterone. The ovaries produce both testosterone and estrogen.
|
| 63 |
+
Relatively small quantities of testosterone are released into your bloodstream
|
| 64 |
+
by the ovaries and adrenal glands. In addition to being produced by the ovaries,
|
| 65 |
+
estrogen is also produced by the body's fat tissue. These sex hormones are involved
|
| 66 |
+
in the growth, maintenance, and repair of reproductive tissues. But that's not
|
| 67 |
+
all. They also influence other body tissues and bone mass.
|
| 68 |
+
- source_sentence: weather in floyds knobs indiana
|
| 69 |
sentences:
|
| 70 |
+
- "Weekly Weather Report for 47119, Floyds Knobs, Indiana. Looking at the weather\
|
| 71 |
+
\ in 47119, Floyds Knobs, Indiana over the next 7 days, the maximum temperature\
|
| 72 |
+
\ will be 11℃ (or 52℉) on Friday 9 th February at around 2 pm.\
|
| 73 |
+
\ In the same week the minimum temperature will be -8℃ (or 18℉\
|
| 74 |
+
) on Thursday 8 th February at around 8 am."
|
| 75 |
+
- Central States Weather News - from the National Weather Service. Indiana StateInformation
|
| 76 |
+
- compiled by NOAA. Indiana State Police - Weather Related Road Conditions (seasonal)
|
| 77 |
+
National Weather Service - Local | Indiana | Hoosier Weather | IN Data. NOAA Weather
|
| 78 |
+
Radio - NOAA station in Richmond, Indiana is 162.500, (KHB52, formerly WXJ46).
|
| 79 |
+
- Lisinopril is a drug of the angiotensin-converting enzyme (ACE) inhibitor class
|
| 80 |
+
used primarily in treatment of high blood pressure, heart failure, and after heart
|
| 81 |
+
attacks. It is also used for preventing kidney and eye complications in people
|
| 82 |
+
with diabetes. Its indications, contraindications, and side effects are as those
|
| 83 |
+
for all ACE inhibitors. Lisinopril was the third ACE inhibitor (after captopril
|
| 84 |
+
and enalapril) and was introduced into therapy in the early 1990s.
|
| 85 |
+
- source_sentence: what congressional district is cambridge, ohio in
|
| 86 |
sentences:
|
| 87 |
+
- Ohio (OH) - 43725. 1 As of the 2010 census, zip code 43725 is located in Congressional
|
| 88 |
+
District 6, OH. 2 Approximately 33.9% of 43725's population lives in a low income
|
| 89 |
+
household, or a household with an annual income of less than $25,000.This is a
|
| 90 |
+
low percentage of low income households for Cambridge, but a high percentage for
|
| 91 |
+
Guernsey County.
|
| 92 |
+
- The Freedom Riders, who were recruited by the Congress of Racial Equality (CORE),
|
| 93 |
+
a U.S. civil rights group, departed from Washington, D.C., and attempted to integrate
|
| 94 |
+
facilities at bus terminals along the way into the Deep South.he 1961 Freedom
|
| 95 |
+
Rides sought to test a 1960 decision by the Supreme Court in Boynton v. Virginia
|
| 96 |
+
that segregation of interstate transportation facilities, including bus terminals,
|
| 97 |
+
was unconstitutional as well.
|
| 98 |
+
- Pennsylvania's 3rd congressional district has been represented by Republican Mike
|
| 99 |
+
Kelly since January 2011. He ran unopposed in the Republican primary. Missa Eaton,
|
| 100 |
+
Sharon resident and president of Democrat Women of Mercer County, ran unopposed
|
| 101 |
+
in the Democratic primary.emocrats Mark Critz, who has represented Pennsylvania's
|
| 102 |
+
12th congressional district since 2010; and Jason Altmire, who has represented
|
| 103 |
+
Pennsylvania's 4th congressional district since 2007, both sought re-election
|
| 104 |
+
in the new 12th district.
|
| 105 |
pipeline_tag: sentence-similarity
|
| 106 |
library_name: sentence-transformers
|
| 107 |
metrics:
|
|
|
|
| 131 |
type: NanoMSMARCO
|
| 132 |
metrics:
|
| 133 |
- type: cosine_accuracy@1
|
| 134 |
+
value: 0.32
|
| 135 |
name: Cosine Accuracy@1
|
| 136 |
- type: cosine_accuracy@3
|
| 137 |
+
value: 0.56
|
| 138 |
name: Cosine Accuracy@3
|
| 139 |
- type: cosine_accuracy@5
|
| 140 |
+
value: 0.62
|
| 141 |
name: Cosine Accuracy@5
|
| 142 |
- type: cosine_accuracy@10
|
| 143 |
+
value: 0.74
|
| 144 |
name: Cosine Accuracy@10
|
| 145 |
- type: cosine_precision@1
|
| 146 |
+
value: 0.32
|
| 147 |
name: Cosine Precision@1
|
| 148 |
- type: cosine_precision@3
|
| 149 |
+
value: 0.18666666666666668
|
| 150 |
name: Cosine Precision@3
|
| 151 |
- type: cosine_precision@5
|
| 152 |
+
value: 0.124
|
| 153 |
name: Cosine Precision@5
|
| 154 |
- type: cosine_precision@10
|
| 155 |
+
value: 0.07400000000000001
|
| 156 |
name: Cosine Precision@10
|
| 157 |
- type: cosine_recall@1
|
| 158 |
+
value: 0.32
|
| 159 |
name: Cosine Recall@1
|
| 160 |
- type: cosine_recall@3
|
| 161 |
+
value: 0.56
|
| 162 |
name: Cosine Recall@3
|
| 163 |
- type: cosine_recall@5
|
| 164 |
+
value: 0.62
|
| 165 |
name: Cosine Recall@5
|
| 166 |
- type: cosine_recall@10
|
| 167 |
+
value: 0.74
|
| 168 |
name: Cosine Recall@10
|
| 169 |
- type: cosine_ndcg@10
|
| 170 |
+
value: 0.522258803283639
|
| 171 |
name: Cosine Ndcg@10
|
| 172 |
- type: cosine_mrr@10
|
| 173 |
+
value: 0.4534126984126985
|
| 174 |
name: Cosine Mrr@10
|
| 175 |
- type: cosine_map@100
|
| 176 |
+
value: 0.4638252994171221
|
| 177 |
name: Cosine Map@100
|
| 178 |
- task:
|
| 179 |
type: information-retrieval
|
|
|
|
| 183 |
type: NanoNQ
|
| 184 |
metrics:
|
| 185 |
- type: cosine_accuracy@1
|
| 186 |
+
value: 0.32
|
| 187 |
name: Cosine Accuracy@1
|
| 188 |
- type: cosine_accuracy@3
|
| 189 |
+
value: 0.54
|
| 190 |
name: Cosine Accuracy@3
|
| 191 |
- type: cosine_accuracy@5
|
| 192 |
+
value: 0.54
|
| 193 |
name: Cosine Accuracy@5
|
| 194 |
- type: cosine_accuracy@10
|
| 195 |
+
value: 0.74
|
| 196 |
name: Cosine Accuracy@10
|
| 197 |
- type: cosine_precision@1
|
| 198 |
+
value: 0.32
|
| 199 |
name: Cosine Precision@1
|
| 200 |
- type: cosine_precision@3
|
| 201 |
+
value: 0.18666666666666665
|
| 202 |
name: Cosine Precision@3
|
| 203 |
- type: cosine_precision@5
|
| 204 |
+
value: 0.11600000000000002
|
| 205 |
name: Cosine Precision@5
|
| 206 |
- type: cosine_precision@10
|
| 207 |
+
value: 0.07800000000000001
|
| 208 |
name: Cosine Precision@10
|
| 209 |
- type: cosine_recall@1
|
| 210 |
+
value: 0.29
|
| 211 |
name: Cosine Recall@1
|
| 212 |
- type: cosine_recall@3
|
| 213 |
+
value: 0.52
|
| 214 |
name: Cosine Recall@3
|
| 215 |
- type: cosine_recall@5
|
| 216 |
+
value: 0.53
|
| 217 |
name: Cosine Recall@5
|
| 218 |
- type: cosine_recall@10
|
| 219 |
+
value: 0.71
|
| 220 |
name: Cosine Recall@10
|
| 221 |
- type: cosine_ndcg@10
|
| 222 |
+
value: 0.5000975716574514
|
| 223 |
name: Cosine Ndcg@10
|
| 224 |
- type: cosine_mrr@10
|
| 225 |
+
value: 0.44801587301587303
|
| 226 |
name: Cosine Mrr@10
|
| 227 |
- type: cosine_map@100
|
| 228 |
+
value: 0.435720694785091
|
| 229 |
name: Cosine Map@100
|
| 230 |
- task:
|
| 231 |
type: nano-beir
|
|
|
|
| 235 |
type: NanoBEIR_mean
|
| 236 |
metrics:
|
| 237 |
- type: cosine_accuracy@1
|
| 238 |
+
value: 0.32
|
| 239 |
name: Cosine Accuracy@1
|
| 240 |
- type: cosine_accuracy@3
|
| 241 |
+
value: 0.55
|
| 242 |
name: Cosine Accuracy@3
|
| 243 |
- type: cosine_accuracy@5
|
| 244 |
+
value: 0.5800000000000001
|
| 245 |
name: Cosine Accuracy@5
|
| 246 |
- type: cosine_accuracy@10
|
| 247 |
+
value: 0.74
|
| 248 |
name: Cosine Accuracy@10
|
| 249 |
- type: cosine_precision@1
|
| 250 |
+
value: 0.32
|
| 251 |
name: Cosine Precision@1
|
| 252 |
- type: cosine_precision@3
|
| 253 |
+
value: 0.18666666666666665
|
| 254 |
name: Cosine Precision@3
|
| 255 |
- type: cosine_precision@5
|
| 256 |
+
value: 0.12000000000000001
|
| 257 |
name: Cosine Precision@5
|
| 258 |
- type: cosine_precision@10
|
| 259 |
+
value: 0.07600000000000001
|
| 260 |
name: Cosine Precision@10
|
| 261 |
- type: cosine_recall@1
|
| 262 |
+
value: 0.305
|
| 263 |
name: Cosine Recall@1
|
| 264 |
- type: cosine_recall@3
|
| 265 |
+
value: 0.54
|
| 266 |
name: Cosine Recall@3
|
| 267 |
- type: cosine_recall@5
|
| 268 |
+
value: 0.575
|
| 269 |
name: Cosine Recall@5
|
| 270 |
- type: cosine_recall@10
|
| 271 |
+
value: 0.725
|
| 272 |
name: Cosine Recall@10
|
| 273 |
- type: cosine_ndcg@10
|
| 274 |
+
value: 0.5111781874705452
|
| 275 |
name: Cosine Ndcg@10
|
| 276 |
- type: cosine_mrr@10
|
| 277 |
+
value: 0.45071428571428573
|
| 278 |
name: Cosine Mrr@10
|
| 279 |
- type: cosine_map@100
|
| 280 |
+
value: 0.4497729971011065
|
| 281 |
name: Cosine Map@100
|
| 282 |
---
|
| 283 |
|
|
|
|
| 331 |
model = SentenceTransformer("redis/model-a-baseline")
|
| 332 |
# Run inference
|
| 333 |
sentences = [
|
| 334 |
+
'what congressional district is cambridge, ohio in',
|
| 335 |
+
"Ohio (OH) - 43725. 1 As of the 2010 census, zip code 43725 is located in Congressional District 6, OH. 2 Approximately 33.9% of 43725's population lives in a low income household, or a household with an annual income of less than $25,000.This is a low percentage of low income households for Cambridge, but a high percentage for Guernsey County.",
|
| 336 |
+
"Pennsylvania's 3rd congressional district has been represented by Republican Mike Kelly since January 2011. He ran unopposed in the Republican primary. Missa Eaton, Sharon resident and president of Democrat Women of Mercer County, ran unopposed in the Democratic primary.emocrats Mark Critz, who has represented Pennsylvania's 12th congressional district since 2010; and Jason Altmire, who has represented Pennsylvania's 4th congressional district since 2007, both sought re-election in the new 12th district.",
|
| 337 |
]
|
| 338 |
embeddings = model.encode(sentences)
|
| 339 |
print(embeddings.shape)
|
|
|
|
| 342 |
# Get the similarity scores for the embeddings
|
| 343 |
similarities = model.similarity(embeddings, embeddings)
|
| 344 |
print(similarities)
|
| 345 |
+
# tensor([[1.0000, 0.5348, 0.4287],
|
| 346 |
+
# [0.5348, 1.0000, 0.3212],
|
| 347 |
+
# [0.4287, 0.3212, 1.0000]])
|
| 348 |
```
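The similarity matrix printed above can be read as a retrieval ranking: row 0 holds the query's scores against every input, so the candidate with the highest off-diagonal score is the best match. The following is a minimal sketch that reuses the values from the printed tensor instead of loading the model; `rank_candidates` is a hypothetical helper written for this illustration, not part of sentence-transformers.

```python
# Similarity scores copied from the printed tensor above (row 0 = the query).
similarities = [
    [1.0000, 0.5348, 0.4287],
    [0.5348, 1.0000, 0.3212],
    [0.4287, 0.3212, 1.0000],
]


def rank_candidates(scores, query_index=0):
    """Return candidate indices sorted by descending similarity to the query.

    Hypothetical helper for illustration; the query itself is excluded.
    """
    row = scores[query_index]
    candidates = [i for i in range(len(row)) if i != query_index]
    return sorted(candidates, key=lambda i: row[i], reverse=True)


ranking = rank_candidates(similarities)
print(ranking)  # index 1 is the Ohio passage, index 2 the Pennsylvania passage
```

With these scores the Ohio passage (0.5348) ranks ahead of the Pennsylvania passage (0.4287), which is the ordering the anchor/positive/negative training setup aims for.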
|
| 349 |
|
| 350 |
<!--
|
|
|
|
| 382 |
|
| 383 |
| Metric | NanoMSMARCO | NanoNQ |
|
| 384 |
|:--------------------|:------------|:-----------|
|
| 385 |
+
| cosine_accuracy@1 | 0.32 | 0.32 |
|
| 386 |
+
| cosine_accuracy@3 | 0.56 | 0.54 |
|
| 387 |
+
| cosine_accuracy@5 | 0.62 | 0.54 |
|
| 388 |
+
| cosine_accuracy@10 | 0.74 | 0.74 |
|
| 389 |
+
| cosine_precision@1 | 0.32 | 0.32 |
|
| 390 |
+
| cosine_precision@3 | 0.1867 | 0.1867 |
|
| 391 |
+
| cosine_precision@5 | 0.124 | 0.116 |
|
| 392 |
+
| cosine_precision@10 | 0.074 | 0.078 |
|
| 393 |
+
| cosine_recall@1 | 0.32 | 0.29 |
|
| 394 |
+
| cosine_recall@3 | 0.56 | 0.52 |
|
| 395 |
+
| cosine_recall@5 | 0.62 | 0.53 |
|
| 396 |
+
| cosine_recall@10 | 0.74 | 0.71 |
|
| 397 |
+
| **cosine_ndcg@10** | **0.5223** | **0.5001** |
|
| 398 |
+
| cosine_mrr@10 | 0.4534 | 0.448 |
|
| 399 |
+
| cosine_map@100 | 0.4638 | 0.4357 |
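How the rank-based metrics in this table relate to each other is easier to see on a toy example. The sketch below implements accuracy@k, precision@k, recall@k and MRR@k under their usual definitions in plain Python. The function names and the toy rankings are hypothetical, and this is not the evaluator that produced the numbers in the table.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Toy data (hypothetical): each query has a ranked list and ONE relevant document.
queries = [
    (["d1", "d7", "d3"], {"d1"}),  # relevant document at rank 1
    (["d4", "d2", "d9"], {"d2"}),  # relevant document at rank 2
    (["d5", "d6", "d8"], {"d0"}),  # relevant document not retrieved
]


def mean_metric(metric, k):
    """Average a per-query metric over the toy query set."""
    return sum(metric(ranked, rel, k) for ranked, rel in queries) / len(queries)


for name, fn in [("accuracy", accuracy_at_k), ("precision", precision_at_k),
                 ("recall", recall_at_k), ("mrr", mrr_at_k)]:
    print(f"{name}@3 = {mean_metric(fn, 3):.4f}")
```

When every query has exactly one relevant document, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. The NanoMSMARCO column shows this pattern (for example 0.56 / 3 ≈ 0.1867), while the NanoNQ column does not, which would be consistent with some NanoNQ queries having more than one relevant passage.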
|
| 400 |
|
| 401 |
#### Nano BEIR
|
| 402 |
|
|
|
|
| 414 |
|
| 415 |
| Metric | Value |
|
| 416 |
|:--------------------|:-----------|
|
| 417 |
+
| cosine_accuracy@1 | 0.32 |
|
| 418 |
+
| cosine_accuracy@3 | 0.55 |
|
| 419 |
+
| cosine_accuracy@5 | 0.58 |
|
| 420 |
+
| cosine_accuracy@10 | 0.74 |
|
| 421 |
+
| cosine_precision@1 | 0.32 |
|
| 422 |
+
| cosine_precision@3 | 0.1867 |
|
| 423 |
+
| cosine_precision@5 | 0.12 |
|
| 424 |
+
| cosine_precision@10 | 0.076 |
|
| 425 |
+
| cosine_recall@1 | 0.305 |
|
| 426 |
+
| cosine_recall@3 | 0.54 |
|
| 427 |
+
| cosine_recall@5 | 0.575 |
|
| 428 |
+
| cosine_recall@10 | 0.725 |
|
| 429 |
+
| **cosine_ndcg@10** | **0.5112** |
|
| 430 |
+
| cosine_mrr@10 | 0.4507 |
|
| 431 |
+
| cosine_map@100 | 0.4498 |
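The NanoBEIR values above appear to be the arithmetic mean of the NanoMSMARCO and NanoNQ results. The sketch below checks that reading against a subset of the unrounded values from the `model-index` metadata in this diff; `dataset_mean` is a hypothetical helper, and the equal-weight average is an assumption that these numbers happen to satisfy.

```python
# Unrounded values copied from the model-index metadata (subset of metrics).
nano_msmarco = {
    "cosine_accuracy@3": 0.56,
    "cosine_accuracy@5": 0.62,
    "cosine_precision@5": 0.124,
    "cosine_precision@10": 0.07400000000000001,
    "cosine_recall@1": 0.32,
    "cosine_recall@10": 0.74,
    "cosine_ndcg@10": 0.522258803283639,
    "cosine_mrr@10": 0.4534126984126985,
    "cosine_map@100": 0.4638252994171221,
}
nano_nq = {
    "cosine_accuracy@3": 0.54,
    "cosine_accuracy@5": 0.54,
    "cosine_precision@5": 0.11600000000000002,
    "cosine_precision@10": 0.07800000000000001,
    "cosine_recall@1": 0.29,
    "cosine_recall@10": 0.71,
    "cosine_ndcg@10": 0.5000975716574514,
    "cosine_mrr@10": 0.44801587301587303,
    "cosine_map@100": 0.435720694785091,
}
reported_mean = {
    "cosine_accuracy@3": 0.55,
    "cosine_accuracy@5": 0.5800000000000001,
    "cosine_precision@5": 0.12000000000000001,
    "cosine_precision@10": 0.07600000000000001,
    "cosine_recall@1": 0.305,
    "cosine_recall@10": 0.725,
    "cosine_ndcg@10": 0.5111781874705452,
    "cosine_mrr@10": 0.45071428571428573,
    "cosine_map@100": 0.4497729971011065,
}


def dataset_mean(a, b):
    """Equal-weight mean of two metric dictionaries with identical keys."""
    return {name: (a[name] + b[name]) / 2 for name in a}


computed = dataset_mean(nano_msmarco, nano_nq)
max_abs_diff = max(abs(computed[name] - reported_mean[name]) for name in reported_mean)
print(f"largest deviation from the reported mean: {max_abs_diff:.2e}")
```

Because only two Nano datasets are evaluated here, the mean is sensitive to either one; a change of 0.02 in a single dataset's score moves the reported NanoBEIR value by 0.01.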
|
| 432 |
|
| 433 |
<!--
|
| 434 |
## Bias, Risks and Limitations
|
|
|
|
| 448 |
|
| 449 |
#### Unnamed Dataset
|
| 450 |
|
| 451 |
+
* Size: 90,000 training samples
|
| 452 |
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
|
| 453 |
* Approximate statistics based on the first 1000 samples:
|
| 454 |
+
| | anchor | positive | negative |
|
| 455 |
+
|:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
|
| 456 |
+
| type | string | string | string |
|
| 457 |
+
| details | <ul><li>min: 4 tokens</li><li>mean: 9.18 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 78.75 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 77.97 tokens</li><li>max: 128 tokens</li></ul> |
|
| 458 |
* Samples:
|
| 459 |
+
| anchor | positive | negative |
|
| 460 |
+
|:--------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
| 461 |
+
| <code>pomp definition</code> | <code>Pomp is a ceremonial display, such as you'd find at the Independence Day parade in your town, where brass bands and men and women in full military dress march to patriotic songs, while citizens wave flags and cheer.</code> | <code>Pom-poms are shaken by cheerleaders, Pom or Dance teams and sports fans during spectator sports. Small decorative pom-poms may be attached to clothing; these are sometimes called toories or bobbles. Pom-pom is derived from the French word pompon, which refers to a small decorative ball made of fabric or feathers. pair of cheerleading pom-poms. A pom-pom – also spelled pom-pon, pompom or pompon – is a loose, fluffy, decorative ball or tuft of fibrous material. Pom-poms may come in many colors, sizes, and varieties and are made from a wide array of materials, including wool, cotton, paper, plastic, and occasionally feathers.</code> |
|
| 462 |
+
| <code>what is the definition of recompense</code> | <code>verb. To recompense is to pay someone back or make amends to someone for some loss. An example of recompense is when a shoplifter gives money to the person from whom he stole.</code> | <code>Split and merge into it. Answered by The Community. Making the world better, one answer at a time. Recuse is a legal term used when a person disqualifies oneself (as a judge) in a legal case due to a potential prejudice or partiality. Example: The judge recused himself from that case, citing a possible conflict of interest. Excuse is to release a person from an obligation or duty. Example: The gentleman is excused from jury duty as his serving would cause a hardship for his family.</code> |
|
| 463 |
+
| <code>kashubian language pronunciation</code> | <code>Kashubian is a member of the West Slavic group of Slavic languages with about 200,000 speakers and used as an everyday language by about 53,000 people.ashubian (kaszebsczi kaszëbsczi). Jazek jãzëk kashubian is a member Of The west slavic Group of slavic languages 200,000 about 200000 speakers and used as an everyday language 53,000 about. 53000 people</code> | <code>[ syll. ko-to-ko, kot-oko ] The baby girl name Kotoko is pronounced as KAHT OW Kow. Kotoko's origin and use are both in the Japanese language.Kotoko is a form of the Japanese name Koto. Kotoko is irregularly used as a baby girl name.aby names that sound like Kotoko include Kadea, Kadeejah, Kadeesha, Kadeija, Kadeja, Kadesha, Kadeshia, Kadesia, Kadessa, Kadiesha, Kadija (African, Arabic, English, and Swahili), Kadijah (African, Arabic, and English), Kadisha, Kadya, Kadyja, Kadysha, Kaitaka, Kathia, Kathya, and Katica (Czech, Hungarian, and Slavic).</code> |
|
| 464 |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
| 465 |
```json
|
| 466 |
{
|
|
|
|
| 477 |
* Size: 10,000 evaluation samples
|
| 478 |
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
|
| 479 |
* Approximate statistics based on the first 1000 samples:
|
| 480 |
+
| | anchor | positive | negative |
|
| 481 |
+
|:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
|
| 482 |
+
| type | string | string | string |
|
| 483 |
+
| details | <ul><li>min: 4 tokens</li><li>mean: 9.18 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 78.83 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 76.86 tokens</li><li>max: 128 tokens</li></ul> |
|
| 484 |
* Samples:
|
| 485 |
+
| anchor | positive | negative |
|
| 486 |
+
|:----------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
| 487 |
+
| <code>what is the pantanal</code> | <code>The Pantanal is a region in South America lying mostly in Western Brazil but extending into Bolivia as well. It is considered one of the world's largest and most diverse freshwater wetland ecosystems. The Pantanal is also one of Brazil's major tourist draws, for its wildlife.he Pantanal is a region in South America lying mostly in Western Brazil but extending into Bolivia as well. It is considered one of the world's largest and most diverse freshwater wetland ecosystems. The Pantanal is also one of Brazil's major tourist draws, for its wildlife.</code> | <code>Lantana is a large plant genus with about 150 species of flowering plants, which are perennial and native to the West Indies, according to the University of Florida. Lantanas are grown for their attractive clusters of small, multicolored or single-colored flowers and for their medicinal uses.</code> |
|
| 488 |
+
| <code>radon mitigation cost</code>                  | <code>1 Ground water wells can also be tested for radon, then, if needed, can get a water radon mitigation system installed. 2 Installing a water radon mitigation system runs from $1,000-$4,500, and maintenance runs $0 to $150 annually. 3 To find out about radon content in a city water system, call the local water provider. It's wise to retest a home's radon level every year or two after a mitigation system is installed. 2 Most radon mitigation systems include a fan, which will need to be replaced about every 5 years. 3 Expect to pay $250-$300 to have this necessary maintenance done.</code> | <code>You have tested your home for radon and confirmed that you have elevated radon levels – 4 picocuries per liter (pCi/L) or higher. The EPA recommends that you take action to reduce your home's radon levels if your radon test result is 4 pCi/L or higher.High radon levels can be reduced through mitigation. CLICK HERE to order a test kit. 1 Select a licensed or certified radon mitigation contractor to reduce the radon levels.2 Have mitigation contractor determine appropriate radon reduction method.igh radon levels can be reduced through mitigation. CLICK HERE to order a test kit. 1 Select a licensed or certified radon mitigation contractor to reduce the radon levels. 2 Have mitigation contractor determine appropriate radon reduction method.</code> |
|
| 489 |
+
| <code>how many calories is an einstein bagel</code> | <code>1 Calories In Einstein Bagel Strawberry Balsamic Vinaigrette. 144 calories, 14g fat, 5g carbs, 0g protein, 1g fiber. Calories In Einstein Bagel Strawberry Chicken Salad. 397 calories, 10g fat, 20g carbs, 56g protein, 4g fiber.</code> | <code>There are 110 calories in a 1 bagel serving of Thomas' Bagel Thins - 100% Whole Wheat. Calorie breakdown: 7% fat, 74% carbs, 19% protein.</code> |
|
| 490 |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
| 491 |
```json
|
| 492 |
{
|
|
|
|
| 645 |
### Training Logs
|
| 646 |
| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
|
| 647 |
|:------:|:----:|:-------------:|:---------------:|:--------------------------:|:---------------------:|:----------------------------:|
|
| 648 |
+
| 0 | 0 | - | 4.1735 | 0.6259 | 0.6583 | 0.6421 |
|
| 649 |
+
| 0.3556 | 250 | 4.3112 | 3.9428 | 0.6062 | 0.6473 | 0.6268 |
|
| 650 |
+
| 0.7112 | 500 | 3.8846 | 3.1810 | 0.5977 | 0.6251 | 0.6114 |
|
| 651 |
+
| 1.0669 | 750 | 2.9479 | 1.8567 | 0.5628 | 0.5082 | 0.5355 |
|
| 652 |
+
| 1.4225 | 1000 | 1.9026 | 1.2819 | 0.5229 | 0.4809 | 0.5019 |
|
| 653 |
+
| 1.7781 | 1250 | 1.5448 | 1.1522 | 0.5112 | 0.4845 | 0.4978 |
|
| 654 |
+
| 2.1337 | 1500 | 1.4335 | 1.1065 | 0.5183 | 0.4899 | 0.5041 |
|
| 655 |
+
| 2.4893 | 1750 | 1.3748 | 1.0820 | 0.5231 | 0.4896 | 0.5064 |
|
| 656 |
+
| 2.8450 | 2000 | 1.3445 | 1.0667 | 0.5184 | 0.4962 | 0.5073 |
|
| 657 |
+
| 3.2006 | 2250 | 1.3173 | 1.0577 | 0.5231 | 0.4971 | 0.5101 |
|
| 658 |
+
| 3.5562 | 2500 | 1.3099 | 1.0517 | 0.5165 | 0.4999 | 0.5082 |
|
| 659 |
+
| 3.9118 | 2750 | 1.2945 | 1.0483 | 0.5218 | 0.4999 | 0.5108 |
|
| 660 |
+
| 4.2674 | 3000 | 1.2880        | 1.0476          | 0.5223                     | 0.5001                | 0.5112                       |
|
| 661 |
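The `NanoBEIR_mean_cosine_ndcg@10` column in the Training Logs table appears to be the arithmetic mean of the two per-dataset columns. The following is a minimal sketch that checks this against the logged rows; the names `rows` and `mean_ndcg` are hypothetical helpers introduced here for illustration, and the numbers are copied from the table rather than recomputed from the model.

```python
# Rows copied from the Training Logs table:
# (step, NanoMSMARCO ndcg@10, NanoNQ ndcg@10, reported NanoBEIR mean ndcg@10)
rows = [
    (0, 0.6259, 0.6583, 0.6421),
    (250, 0.6062, 0.6473, 0.6268),
    (500, 0.5977, 0.6251, 0.6114),
    (750, 0.5628, 0.5082, 0.5355),
    (1000, 0.5229, 0.4809, 0.5019),
    (1250, 0.5112, 0.4845, 0.4978),
    (1500, 0.5183, 0.4899, 0.5041),
    (1750, 0.5231, 0.4896, 0.5064),
    (2000, 0.5184, 0.4962, 0.5073),
    (2250, 0.5231, 0.4971, 0.5101),
    (2500, 0.5165, 0.4999, 0.5082),
    (2750, 0.5218, 0.4999, 0.5108),
    (3000, 0.5223, 0.5001, 0.5112),
]


def mean_ndcg(msmarco: float, nq: float) -> float:
    """Unweighted mean of the two per-dataset nDCG@10 scores (assumed aggregation)."""
    return (msmarco + nq) / 2


# Largest difference between the recomputed mean and the logged mean.
# The logged values are rounded to 4 decimals, so a small gap is expected.
max_gap = max(abs(mean_ndcg(m, n) - reported) for _, m, n, reported in rows)

# Change in the logged mean between the first and last evaluation step.
mean_change = rows[-1][3] - rows[0][3]

print(f"largest gap vs. logged mean: {max_gap:.5f}")
print(f"change in mean nDCG@10 from step 0 to step 3000: {mean_change:+.4f}")
```

Read this way, the table shows the mean nDCG@10 moving from 0.6421 at step 0 to 0.5112 at step 3000 while both training and validation loss decrease, so the loss curves alone do not track retrieval quality on these benchmarks.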
|
| 662 |
|
| 663 |
### Framework Versions
|