sileod committed
Commit a98d39f · verified · 1 Parent(s): 58de7c2

Add new SentenceTransformer model

Files changed (2):
  1. README.md +152 -143
  2. model.safetensors +1 -1
README.md CHANGED
@@ -7,7 +7,7 @@ tags:
  - feature-extraction
  - dense
  - generated_from_trainer
- - dataset_size:6331245
  - loss:AnglELoss
  - loss:CoSENTLoss
  - loss:CachedMultipleNegativesRankingLoss
@@ -37,39 +37,41 @@ widget:
  \ pediatrician, or paediatrician. The word pediatrics and its cognates mean healer\
  \ of children; they derive from two Greek words: παῖς (pais child)\
  \ and ἰατρός (iatros doctor, healer)."
- - source_sentence: These ancient rites are rarely performed in contemporary Sri Lanka
- , but the conserved songs are still performed by folk musicians .
  sentences:
- - In 1971 , a main campus was completed in 33 MacDonnell Road for the new school
  .
- - These ancient rites are still performed in contemporary Sri Lanka , but the preserved
- songs are rarely performed by folk musicians .
- - After May 4 , 2012 , Gordon M. Snow was replaced by Joseph M. Demarest and then
- Michael S. Welch with limited formal announcement .
- - source_sentence: A woman is playing the flute.
  sentences:
- - A boy is playing the trumpet.
- - A man tries to read the paper.
- - A man is playing the guitar.
- - source_sentence: Interference now on all our scans.
  sentences:
- - Would you permit me to explain this Polly?
- - All Ourscans are jammed.
- - The aircraft family was first introduced at the Paris Air Show in 1999.
- - source_sentence: why has chs invested in da?
  sentences:
- - In order to renew the strategic road map to CHS's growth, CHS partnered with DA
- to improve outcomes rather than increasing its size. Most of DA's capacity was
- used to provide tools in order to support CHS-affiliated hospitals in delivering
- best-in-class healthcare to patients.
- - You can in theory add every enchantment that is compatible with a tool/weapon/armor
- onto the same item. The bow can have these 7 enchantments, though mending and
- infinity are mutually exclusive. So you can have up to 6 different enchantments
- on a bow using an anvil.
- - 'Clean up is a phrasal verb which means: to make (a room or space) clean and orderly.
- ... Clean out is a phrasal verb which means something such as a cupboard, room,
- or container, you take everything out of it and clean the inside of it thoroughly.
- Secondly, "clean"is a simple word which is often used in our daily life.'
  datasets:
  - google-research-datasets/paws
  - nyu-mll/glue
@@ -151,12 +153,12 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("tasksource/ettin-32m-embed")
  # Run inference
  queries = [
- "why has chs invested in da?",
  ]
  documents = [
- "In order to renew the strategic road map to CHS's growth, CHS partnered with DA to improve outcomes rather than increasing its size. Most of DA's capacity was used to provide tools in order to support CHS-affiliated hospitals in delivering best-in-class healthcare to patients.",
- 'You can in theory add every enchantment that is compatible with a tool/weapon/armor onto the same item. The bow can have these 7 enchantments, though mending and infinity are mutually exclusive. So you can have up to 6 different enchantments on a bow using an anvil.',
- 'Clean up is a phrasal verb which means: to make (a room or space) clean and orderly. ... Clean out is a phrasal verb which means something such as a cupboard, room, or container, you take everything out of it and clean the inside of it thoroughly. Secondly, "clean"is a simple word which is often used in our daily life.',
  ]
  query_embeddings = model.encode_query(queries)
  document_embeddings = model.encode_document(documents)
@@ -166,7 +168,7 @@ print(query_embeddings.shape, document_embeddings.shape)
  # Get the similarity scores for the embeddings
  similarities = model.similarity(query_embeddings, document_embeddings)
  print(similarities)
- # tensor([[ 0.5738, 0.0240, -0.0787]])
  ```

  <!--
@@ -213,19 +215,19 @@ You can finetune this model on your own dataset.
  #### paws/labeled_final

  * Dataset: [paws/labeled_final](https://huggingface.co/datasets/paws) at [161ece9](https://huggingface.co/datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
- * Size: 49,401 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- |         | sentence1 | sentence2 | label |
- |:--------|:----------|:----------|:------|
- | type    | string | string | int |
- | details | <ul><li>min: 10 tokens</li><li>mean: 27.44 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 27.44 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>0: ~55.60%</li><li>1: ~44.40%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>In Paris , in October 1560 , he secretly met the English ambassador , Nicolas Throckmorton , asking him for a passport to return to England through Scotland .</code> | <code>In October 1560 , he secretly met with the English ambassador , Nicolas Throckmorton , in Paris , and asked him for a passport to return to Scotland through England .</code> | <code>0</code> |
- | <code>The NBA season of 1975 -- 76 was the 30th season of the National Basketball Association .</code> | <code>The 1975 -- 76 season of the National Basketball Association was the 30th season of the NBA .</code> | <code>1</code> |
- | <code>There are also specific discussions , public profile debates and project discussions .</code> | <code>There are also public discussions , profile specific discussions , and project discussions .</code> | <code>0</code> |
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
@@ -239,19 +241,19 @@ You can finetune this model on your own dataset.
  #### glue/mrpc

  * Dataset: [glue/mrpc](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
- * Size: 3,668 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | int |
- | details | <ul><li>min: 10 tokens</li><li>mean: 27.55 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 27.25 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>0: ~33.70%</li><li>1: ~66.30%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>Amrozi accused his brother , whom he called " the witness " , of deliberately distorting his evidence .</code> | <code>Referring to him as only " the witness " , Amrozi accused his brother of deliberately distorting his evidence .</code> | <code>1</code> |
- | <code>Yucaipa owned Dominick 's before selling the chain to Safeway in 1998 for $ 2.5 billion .</code> | <code>Yucaipa bought Dominick 's in 1995 for $ 693 million and sold it to Safeway for $ 1.8 billion in 1998 .</code> | <code>0</code> |
- | <code>They had published an advertisement on the Internet on June 10 , offering the cargo for sale , he added .</code> | <code>On June 10 , the ship 's owners had published an advertisement on the Internet , offering the explosives for sale .</code> | <code>1</code> |
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
@@ -265,19 +267,19 @@ You can finetune this model on your own dataset.
  #### fever-evidence-related

  * Dataset: [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related) at [14aba00](https://huggingface.co/datasets/mwong/fever-evidence-related/tree/14aba009b5fcd97b1a9ee6f3e3b0da0e308cf7cb)
- * Size: 403,218 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | int |
- | details | <ul><li>min: 6 tokens</li><li>mean: 13.92 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 33 tokens</li><li>mean: 316.81 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>0: ~29.20%</li><li>1: ~70.80%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>Nikolaj Coster-Waldau worked with the Fox Broadcasting Company.</code> | <code>Nikolaj Coster-Waldau -LRB- -LSB- neɡolaɪ kʰʌsdɐ ˈʋaldɑʊ -RSB- ; born 27 July 1970 -RRB- is a Danish actor , producer and screenwriter .. He graduated from Danish National School of Theatre in Copenhagen in 1993 .. Danish National School of Theatre. Danish National School of Theatre and Contemporary Dance. Copenhagen. Copenhagen. Coster-Waldau 's breakthrough performance in Denmark was his role in the film Nightwatch -LRB- 1994 -RRB- .. Nightwatch. Nightwatch ( 1994 film ). Since then he has appeared in numerous films in his native Scandinavia and Europe in general , including Headhunters -LRB- 2011 -RRB- and A Thousand Times Good Night -LRB- 2013 -RRB- .. Headhunters. Headhunters ( film ). A Thousand Times Good Night. A Thousand Times Good Night. In the United States , his debut film role was in the war film Black Hawk Down -LRB- 2001 -RRB- , playing Medal of Honor recipient Gary Gordon .. Black Hawk Down. Black Hawk Down ( film ). Gary Gordon. Gary Gordon. He then played Detective Jo...</code> | <code>0</code> |
- | <code>Nikolaj Coster-Waldau worked with the Fox Broadcasting Company.</code> | <code>Majboor -LRB- Hindi : मजबर , English : Compulsed -RRB- is a 1974 Indian Hindi crime-thriller film directed by Ravi Tandon .. Ravi Tandon. Ravi Tandon. Hindi. Hindi. crime. crime film. thriller film. thriller film. Music is by Laxmikant Pyarelal and lyrics by Anand Bakshi .. Laxmikant Pyarelal. Laxmikant Pyarelal. Anand Bakshi. Anand Bakshi. The film was written by Salim-Javed .. Salim-Javed. Salim-Javed. The movie stars Amitabh Bachchan , Parveen Babi , Pran , Madan Puri , Rehman and Farida Jalal .. Amitabh Bachchan. Amitabh Bachchan. Parveen Babi. Parveen Babi. Pran. Pran ( actor ). Farida Jalal. Farida Jalal. Madan Puri. Madan Puri. Rehman. Rehman ( actor ). It is a remake of an American film titled Zig Zag -LRB- 1970 film -RRB- starring George Kennedy The film was later remade in Telugu by director K. Raghavendra Rao as Raja -LRB- 1976 -RRB- starring Shobhan Babu and Jayasudha .. George Kennedy. George Kennedy. Telugu. Telugu language. K. Raghavendra Rao. K. Raghavendra Rao. Raja....</code> | <code>1</code> |
- | <code>Nikolaj Coster-Waldau worked with the Fox Broadcasting Company.</code> | <code>The small snakehead ' -LRB- Channa asiatica -RRB- is a species of snakehead .. Channa. Channa. snakehead. Channidae. It is one of four species of the genus Channa '' native to China .. Channa. Channa. China. China. It also can be found in Taiwan and southern Japan , to which it migrated -LRB- or was introduced -RRB- .. Taiwan. Taiwan. Japan. Japan. It is a medium-sized snakehead which is a nestbuilder -LRB- as opposed to the Indian mouthbrooder dwarf snakeheads -RRB- .. snakehead. Channidae. mouthbrooder. mouthbrooder</code> | <code>1</code> |
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
@@ -291,19 +293,19 @@ You can finetune this model on your own dataset.
  #### parade

  * Dataset: [parade](https://huggingface.co/datasets/tasksource/parade) at [466978f](https://huggingface.co/datasets/tasksource/parade/tree/466978f31aebf4d052287f32ea3ae393f178f386)
- * Size: 7,550 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | int |
- | details | <ul><li>min: 6 tokens</li><li>mean: 21.97 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 21.81 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>0: ~57.10%</li><li>1: ~42.90%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>predictive models are involved with predicting a value based on other values in the dataset. the process of training a predictive model is known as supervised learning.</code> | <code>predict a value based on other values in the dataset. process of training a pred model is supervised learning.</code> | <code>1</code> |
- | <code>predict a value based on other values in the dataset. process of training a pred model is supervised learning.</code> | <code>involved with predicting a value based on other values in the dataset; process of training this type of model is known as supervised learning</code> | <code>1</code> |
- | <code>predicting one value (the target variable) using other values</code> | <code>predictive models are involved with predicting a value based on other values in the dataset.</code> | <code>1</code> |
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
@@ -317,19 +319,19 @@ You can finetune this model on your own dataset.
  #### apt

  * Dataset: [apt](https://huggingface.co/datasets/tasksource/apt) at [f6c07f6](https://huggingface.co/datasets/tasksource/apt/tree/f6c07f66d3eccebd36418885ce10aff295d436dd)
- * Size: 3,349 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 17.28 tokens</li><li>max: 124 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 16.99 tokens</li><li>max: 121 tokens</li></ul> | <ul><li>0: ~35.90%</li><li>1: ~64.10%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>Come on.</code> | <code>Come on</code> | <code>1</code> |
- | <code>In Washington, the federal government remained closed for a second day.</code> | <code>The federal government in Washington was closed for a second day running.</code> | <code>1</code> |
- | <code>The findings appear in next Friday's Physical Review Letters.</code> | <code>Results published next Friday</code> | <code>0</code> |
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
@@ -343,19 +345,19 @@ You can finetune this model on your own dataset.
  #### glue/stsb

  * Dataset: [glue/stsb](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
- * Size: 5,749 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | float |
- | details | <ul><li>min: 6 tokens</li><li>mean: 10.16 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 10.12 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.23</li><li>max: 5.0</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>A plane is taking off.</code> | <code>An air plane is taking off.</code> | <code>5.0</code> |
- | <code>A man is playing a large flute.</code> | <code>A man is playing a flute.</code> | <code>3.799999952316284</code> |
- | <code>A man is spreading shreded cheese on a pizza.</code> | <code>A man is spreading shredded cheese on an uncooked pizza.</code> | <code>3.799999952316284</code> |
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
@@ -369,19 +371,19 @@ You can finetune this model on your own dataset.
  #### sick/relatedness

  * Dataset: sick/relatedness
- * Size: 4,439 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string | string | float |
- | details | <ul><li>min: 6 tokens</li><li>mean: 12.66 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.46 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.41</li><li>max: 5.0</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:----------|:----------|:------|
- | <code>A group of kids is playing in a yard and an old man is standing in the background</code> | <code>A group of boys in a yard is playing and a man is standing in the background</code> | <code>4.5</code> |
- | <code>A group of children is playing in the house and there is no man standing in the background</code> | <code>A group of kids is playing in a yard and an old man is standing in the background</code> | <code>3.200000047683716</code> |
- | <code>The young boys are playing outdoors and the man is smiling nearby</code> | <code>The kids are playing outdoors near a man with a smile</code> | <code>4.699999809265137</code> |
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
@@ -395,19 +397,19 @@ You can finetune this model on your own dataset.
  #### sts-companion

  * Dataset: [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) at [fd8beff](https://huggingface.co/datasets/tasksource/sts-companion/tree/fd8beffb788df5f6673bc688e6dcbe3690a3acc6)
- * Size: 4,760 training samples
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
- |         | label | sentence1 | sentence2 |
- |:--------|:------|:----------|:----------|
- | type    | float | string | string |
- | details | <ul><li>min: 0.0</li><li>mean: 3.09</li><li>max: 5.0</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.91 tokens</li><li>max: 91 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 17.28 tokens</li><li>max: 83 tokens</li></ul> |
  * Samples:
- | label | sentence1 | sentence2 |
- |:------|:----------|:----------|
- | <code>1.6</code> | <code>this lus in this frame refer to biological entities labeled by the fe organism. an organism is described as something that can be alive, or have naturally occuring biological processes and functions, however the concept of life is often used metaphorically for non-organic entities which resemble or act as if they have organic life.</code> | <code>living things collectively;</code> |
- | <code>3.8</code> | <code>Washington's Economic Boom, Financed by You Real life "Hunger Games"</code> | <code>Washington?s Economic Boom, Financed by You</code> |
- | <code>4.4</code> | <code>Knowledge of foreign languages is accepted as a necessary precursor to mobility.</code> | <code>It is accepted that knowledge of foreign languages is a necessary precondition to mobility.</code> |
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
@@ -421,19 +423,19 @@ You can finetune this model on your own dataset.
  #### zero-shot-label-nli

  * Dataset: [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli) at [ee693db](https://huggingface.co/datasets/tasksource/zero-shot-label-nli/tree/ee693dba923b5d5484aa9232b7357c5e45dd39b8)
- * Size: 800,000 training samples
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  |         | label | sentence1 | sentence2 |
  |:--------|:------|:----------|:----------|
  | type    | int | string | string |
- | details | <ul><li>0: ~51.20%</li><li>1: ~48.80%</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 62.72 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 8.01 tokens</li><li>max: 16 tokens</li></ul> |
  * Samples:
- | label | sentence1 | sentence2 |
- |:------|:----------|:----------|
- | <code>0</code> | <code>How can you build website like Facebook?<br>How do you make a site like Facebook?</code> | <code>This example is not_duplicate.</code> |
- | <code>0</code> | <code>Warren Buffet was born on August 30 , 1932 .<br>Warren Edward Buffett -LRB- -LSB- ˈbʌfᵻt -RSB- born August 30 , 1930 -RRB- is an American business magnate , investor , and philanthropist .</code> | <code>This example is SUPPORTS.</code> |
- | <code>0</code> | <code>raise : Raise a siege. :<br>raise : The President raised several million dollars for his college. :</code> | <code>This example is True.</code> |
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
  ```json
  {
@@ -726,11 +728,11 @@ You can finetune this model on your own dataset.
  ### Training Hyperparameters
  #### Non-Default Hyperparameters

- - `per_device_train_batch_size`: 384
- - `learning_rate`: 0.0001
- - `weight_decay`: 1e-06
  - `num_train_epochs`: 1
- - `warmup_ratio`: 0.1
  - `fp16`: True
  - `gradient_checkpointing`: True
  - `torch_compile`: True
@@ -743,15 +745,15 @@ You can finetune this model on your own dataset.
  - `do_predict`: False
  - `eval_strategy`: no
  - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 384
  - `per_device_eval_batch_size`: 8
  - `per_gpu_train_batch_size`: None
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
- - `learning_rate`: 0.0001
- - `weight_decay`: 1e-06
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
@@ -760,7 +762,7 @@ You can finetune this model on your own dataset.
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
- - `warmup_ratio`: 0.1
  - `warmup_steps`: 0
  - `log_level`: passive
  - `log_level_replica`: warning
@@ -864,38 +866,45 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss |
  |:------:|:-----:|:-------------:|
- | 0.0303 | 500 | 4.8473 |
- | 0.0606 | 1000 | 2.6754 |
- | 0.0909 | 1500 | 2.6358 |
- | 0.1212 | 2000 | 2.619 |
- | 0.1515 | 2500 | 2.8342 |
- | 0.1818 | 3000 | 2.2872 |
- | 0.2121 | 3500 | 2.2727 |
- | 0.2424 | 4000 | 2.3469 |
- | 0.2727 | 4500 | 2.1085 |
- | 0.3030 | 5000 | 2.2076 |
- | 0.3334 | 5500 | 2.1161 |
- | 0.3637 | 6000 | 2.2332 |
- | 0.3940 | 6500 | 2.1574 |
- | 0.4243 | 7000 | 2.1012 |
- | 0.4546 | 7500 | 1.946 |
- | 0.4849 | 8000 | 1.7233 |
- | 0.5152 | 8500 | 2.4444 |
- | 0.5455 | 9000 | 2.1055 |
- | 0.5758 | 9500 | 1.9107 |
- | 0.6061 | 10000 | 2.0212 |
- | 0.6364 | 10500 | 2.1029 |
- | 0.6667 | 11000 | 1.8484 |
- | 0.6970 | 11500 | 2.1658 |
- | 0.7273 | 12000 | 2.1007 |
- | 0.7576 | 12500 | 1.9194 |
- | 0.7879 | 13000 | 1.6709 |
- | 0.8182 | 13500 | 1.7653 |
- | 0.8485 | 14000 | 1.952 |
- | 0.8788 | 14500 | 1.8437 |
- | 0.9091 | 15000 | 1.6667 |
- | 0.9395 | 15500 | 1.7433 |
- | 0.9698 | 16000 | 1.7623 |

  ### Framework Versions
 
  - feature-extraction
  - dense
  - generated_from_trainer
+ - dataset_size:7176192
  - loss:AnglELoss
  - loss:CoSENTLoss
  - loss:CachedMultipleNegativesRankingLoss

  \ pediatrician, or paediatrician. The word pediatrics and its cognates mean healer\
  \ of children; they derive from two Greek words: παῖς (pais child)\
  \ and ἰατρός (iatros doctor, healer)."
+ - source_sentence: Creek Township borders Elsinboro Township , Pennsville Township
+ and Salem .
  sentences:
+ - Today , Galesburg-Augusta Community Schools consists of a primary school and a
+ high school in Galesburg and a middle school in Augusta .
+ - Elsinboro Township borders with the Lower Alloways Creek Township , Pennsville
+ Township and Salem .
+ - In 1953 , he married the actress Gilda Neeltje , sister of the actress Diane Holland
  .
+ - source_sentence: A man is riding on one wheel on a motorcycle.
  sentences:
+ - A person is performing tricks on a motorcycle.
+ - A boy jumping in the air on the beach.
+ - A woman is pouring ingredients into a frying pan.
+ - source_sentence: '''Why don''t you find out?'
  sentences:
+ - He is suggesting that the lack of effort focusing on the concept is making it
+ seem unrealistic.
+ - The military stated that the 244th Engineer Battalion has been handling the construction
+ of playgrounds, cleaning up the rubble and restoring irrigation services in Iraq.
+ - Why you haven't find out?.
+ - source_sentence: what are the three subatomic particles called?
  sentences:
+ - Subatomic particles include electrons, the negatively charged, almost massless
+ particles that nevertheless account for most of the size of the atom, and they
+ include the heavier building blocks of the small but very dense nucleus of the
+ atom, the positively charged protons and the electrically neutral neutrons.
+ - Your body needs cholesterol to build healthy cells, but high levels of cholesterol
+ can increase your risk of heart disease. With high cholesterol, you can develop
+ fatty deposits in your blood vessels. Eventually, these deposits grow, making
+ it difficult for enough blood to flow through your arteries.
+ - 'If you experience any of the following symptoms, stop taking ibuprofen and call
+ your doctor: stomach pain, heartburn, vomit that is bloody or looks like coffee
+ grounds, blood in the stool, or black and tarry stools. Keep all appointments
+ with your doctor and the laboratory.'
  datasets:
  - google-research-datasets/paws
  - nyu-mll/glue

  model = SentenceTransformer("tasksource/ettin-32m-embed")
  # Run inference
  queries = [
+ "what are the three subatomic particles called?",
  ]
  documents = [
+ 'Subatomic particles include electrons, the negatively charged, almost massless particles that nevertheless account for most of the size of the atom, and they include the heavier building blocks of the small but very dense nucleus of the atom, the positively charged protons and the electrically neutral neutrons.',
+ 'Your body needs cholesterol to build healthy cells, but high levels of cholesterol can increase your risk of heart disease. With high cholesterol, you can develop fatty deposits in your blood vessels. Eventually, these deposits grow, making it difficult for enough blood to flow through your arteries.',
+ 'If you experience any of the following symptoms, stop taking ibuprofen and call your doctor: stomach pain, heartburn, vomit that is bloody or looks like coffee grounds, blood in the stool, or black and tarry stools. Keep all appointments with your doctor and the laboratory.',
  ]
  query_embeddings = model.encode_query(queries)
  document_embeddings = model.encode_document(documents)

  # Get the similarity scores for the embeddings
  similarities = model.similarity(query_embeddings, document_embeddings)
170
  print(similarities)
171
+ # tensor([[ 0.6600, -0.0148, 0.0229]])
172
  ```
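By default, `model.similarity` computes cosine similarity between the two sets of embeddings. A minimal sketch of that computation on toy vectors (not the model's actual embeddings; the helper name `cosine` is ours):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of L2 norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# A toy query vector and two toy document vectors.
query = [1.0, 0.0, 1.0]
docs = [[1.0, 0.0, 1.0],   # identical direction -> similarity 1.0
        [0.0, 1.0, 0.0]]   # orthogonal         -> similarity 0.0
print([round(cosine(query, d), 4) for d in docs])  # [1.0, 0.0]
```

Ranking documents for a query then reduces to sorting by this score, which is what the similarity matrix above encodes row by row.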
173
 
174
  <!--
 
215
  #### paws/labeled_final
216
 
217
  * Dataset: [paws/labeled_final](https://huggingface.co/datasets/paws) at [161ece9](https://huggingface.co/datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
218
+ * Size: 148,203 training samples
219
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
220
  * Approximate statistics based on the first 1000 samples:
221
+ | | sentence1 | sentence2 | label |
222
+ |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
223
+ | type | string | string | int |
224
+ | details | <ul><li>min: 11 tokens</li><li>mean: 27.65 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 27.73 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>0: ~57.50%</li><li>1: ~42.50%</li></ul> |
225
  * Samples:
226
+ | sentence1 | sentence2 | label |
227
+ |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
228
+ | <code>Ceremonial music ( `` rokon fada '' ) is listed as a status symbol , and musicians are generally chosen for political reasons as opposed to musical ones .</code> | <code>Ceremonial music ( `` rokon fada '' ) is performed as a status symbol , and musicians are generally chosen for musical reasons as opposed to political ones .</code> | <code>0</code> |
229
+ | <code>In 1989 he travelled to South Africa , Johannesburg and Angola , Mozambique on a peace-seeking mission .</code> | <code>In 1989 , he traveled to Mozambique , Johannesburg , and Angola , South Africa on a peace-seeking mission .</code> | <code>1</code> |
230
+ | <code>In this way , the Nestorian faith was established in the East under tragic signs .</code> | <code>In this way , under Nestorian auspices , the tragic faith was established in the East .</code> | <code>0</code> |
231
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
232
  ```json
233
  {
 
241
  #### glue/mrpc
242
 
243
  * Dataset: [glue/mrpc](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
244
+ * Size: 11,004 training samples
245
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
246
  * Approximate statistics based on the first 1000 samples:
247
  | | sentence1 | sentence2 | label |
248
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
249
  | type | string | string | int |
250
+ | details | <ul><li>min: 11 tokens</li><li>mean: 27.23 tokens</li><li>max: 52 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 27.29 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>0: ~33.10%</li><li>1: ~66.90%</li></ul> |
251
  * Samples:
252
+ | sentence1 | sentence2 | label |
253
+ |:------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
254
+ | <code>Tony Blair has taken a hardline stance arguing nothing should be done to lessen the pressure on Mugabe at the gathering in the capital Abuja .</code> | <code>The Prime Minister has taken a hardline stance arguing nothing should be done to lessen the pressure on Mugabe .</code> | <code>0</code> |
255
+ | <code>The identical rovers will act as robotic geologists , searching for evidence of past water .</code> | <code>The rovers act as robotic geologists , moving on six wheels .</code> | <code>0</code> |
256
+ | <code>" We make no apologies for finding every legal way possible to protect the American public from further terrorist attack , " Barbara Comstock said .</code> | <code>" We make no apologies for finding every legal way possible to protect the American public from further terrorist attacks , " said Barbara Comstock , Ashcroft 's press secretary .</code> | <code>1</code> |
257
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
258
  ```json
259
  {
 
267
  #### fever-evidence-related
268
 
269
  * Dataset: [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related) at [14aba00](https://huggingface.co/datasets/mwong/fever-evidence-related/tree/14aba009b5fcd97b1a9ee6f3e3b0da0e308cf7cb)
270
+ * Size: 800,000 training samples
271
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
272
  * Approximate statistics based on the first 1000 samples:
273
  | | sentence1 | sentence2 | label |
274
  |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:------------------------------------------------|
275
  | type | string | string | int |
276
+ | details | <ul><li>min: 7 tokens</li><li>mean: 13.65 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 28 tokens</li><li>mean: 318.06 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>0: ~30.20%</li><li>1: ~69.80%</li></ul> |
277
  * Samples:
278
+ | sentence1 | sentence2 | label |
279
+ |:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
280
+ | <code>Batman: The Killing Joke features characters.</code> | <code>notice. Cantonese Pinyin -LRB- , also known as 教院式拼音方案 -RRB- is a romanization system for Cantonese developed by Rev. Yu Ping Chiu in 1971 , and subsequently modified by the Education Department -LRB- merged into the Education and Manpower Bureau since 2003 -RRB- of Hong Kong and Prof. Zhan Bohui of the Chinese Dialects Research Centre of the Jinan University , Guangdong , PRC , and honorary professor of the School of Chinese , University of Hong Kong .. romanization. romanization. Cantonese. Cantonese. Education and Manpower Bureau. Education and Manpower Bureau. Zhan Bohui. Zhan Bohui. It is the only romanization system accepted by Education and Manpower Bureau of Hong Kong and Hong Kong Examinations and Assessment Authority .. romanization. romanization. Education and Manpower Bureau. Education and Manpower Bureau. Hong Kong Examinations and Assessment Authority. Hong Kong Examinations and Assessment Authority. The formal and short forms of the system 's Chinese names mean respectiv...</code> | <code>1</code> |
281
+ | <code>Jon Snow is played by a person.</code> | <code>Cao'an is a temple in Jinjiang , Fujian .. Originally constructed by Chinese Manicheans , it was viewed by later worshipers as a Buddhist temple .. Manicheans. Manichaeism. This `` Manichean temple in Buddhist disguise ''. is seen by modern experts on Manichaeism as `` the only extant Manichean temple in China '' , or `` the only Manichean building which has survived intact '' .</code> | <code>1</code> |
282
+ | <code>Scotland includes islands.</code> | <code>Scotland -LRB- -LSB- ˈskɒt.lənd -RSB- Scots : -LSB- - scoˈskɔt.lənd -RSB- Alba -LSB- ˈalˠapə -RSB- -RRB- is a country that is part of the United Kingdom and covers the northern third of the island of Great Britain .. Scots. Scots language. Scotland. Scots Law. Alba. Alba. country. country. part. Countries of the United Kingdom. United Kingdom. United Kingdom. Great Britain. Great Britain. It shares a border with England to the south , and is otherwise surrounded by the Atlantic Ocean , with the North Sea to the east and the North Channel and Irish Sea to the south-west .. England. England. Atlantic Ocean. Atlantic Ocean. North Sea. North Sea. North Channel. North Channel ( British Isles ). Irish Sea. Irish Sea. In addition to the mainland , the country is made up of more than 790 islands , including the Northern Isles and the Hebrides .. country. country. Northern Isles. Northern Isles. Hebrides. Hebrides. The Kingdom of Scotland emerged as an independent sovereign state in the Early ...</code> | <code>0</code> |
283
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
284
  ```json
285
  {
 
293
  #### parade
294
 
295
  * Dataset: [parade](https://huggingface.co/datasets/tasksource/parade) at [466978f](https://huggingface.co/datasets/tasksource/parade/tree/466978f31aebf4d052287f32ea3ae393f178f386)
296
+ * Size: 22,650 training samples
297
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
298
  * Approximate statistics based on the first 1000 samples:
299
  | | sentence1 | sentence2 | label |
300
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
301
  | type | string | string | int |
302
+ | details | <ul><li>min: 6 tokens</li><li>mean: 22.21 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 21.48 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>0: ~54.80%</li><li>1: ~45.20%</li></ul> |
303
  * Samples:
304
+ | sentence1 | sentence2 | label |
305
+ |:---------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
306
+ | <code>access to device itself application specific data (network services, dns, html, http, etc)</code> | <code>(upper layer data)facilitates communication between such programs and lower-layer network services. high-level apis, including resource sharing, remote file access.</code> | <code>0</code> |
307
+ | <code>an important element of information management, but it is just one part of a larger whole</code> | <code>converting facts and figures into useful information</code> | <code>0</code> |
308
+ | <code>web site that has a field for you to type in a search query, as it will search the internet for you using your search criteria.</code> | <code>web-based search tool that locates a web page using a keyword</code> | <code>1</code> |
309
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
310
  ```json
311
  {
 
319
  #### apt
320
 
321
  * Dataset: [apt](https://huggingface.co/datasets/tasksource/apt) at [f6c07f6](https://huggingface.co/datasets/tasksource/apt/tree/f6c07f66d3eccebd36418885ce10aff295d436dd)
322
+ * Size: 10,047 training samples
323
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
324
  * Approximate statistics based on the first 1000 samples:
325
  | | sentence1 | sentence2 | label |
326
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
327
  | type | string | string | int |
328
+ | details | <ul><li>min: 4 tokens</li><li>mean: 17.32 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 16.46 tokens</li><li>max: 121 tokens</li></ul> | <ul><li>0: ~35.80%</li><li>1: ~64.20%</li></ul> |
329
  * Samples:
330
+ | sentence1 | sentence2 | label |
331
+ |:------------------------------------------------------------------|:-------------------------------------------------------------------------|:---------------|
332
+ | <code>Watch out.</code> | <code>U.S. Bank</code> | <code>0</code> |
333
+ | <code>Oh! we spent all night, used all the fancy machines.</code> | <code>We spent all night using the luxurious equipment.</code> | <code>1</code> |
334
+ | <code>I'm willing to give you all this information...</code> | <code>This information, all of it, I'm inclined to provide you...</code> | <code>1</code> |
335
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
336
  ```json
337
  {
 
345
  #### glue/stsb
346
 
347
  * Dataset: [glue/stsb](https://huggingface.co/datasets/glue) at [bcdcba7](https://huggingface.co/datasets/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
348
+ * Size: 17,247 training samples
349
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
350
  * Approximate statistics based on the first 1000 samples:
351
  | | sentence1 | sentence2 | label |
352
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
353
  | type | string | string | float |
354
+ | details | <ul><li>min: 6 tokens</li><li>mean: 14.68 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.84 tokens</li><li>max: 68 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.64</li><li>max: 5.0</li></ul> |
355
  * Samples:
356
+ | sentence1 | sentence2 | label |
357
+ |:----------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------|:--------------------------------|
358
+ | <code>Mandela's condition has 'improved'</code> | <code>Mandela's condition has 'worsened over past 48 hours'</code> | <code>1.0</code> |
359
+ | <code>the cfe is very important for european security.</code> | <code>the cfe is a cornerstone of european security.</code> | <code>5.0</code> |
360
+ | <code>The Nasdaq fell about 1.3% for the month, snapping a seven-month winning streak.</code> | <code>The Nasdaq is down roughly 0.4 percent for the month, on track to snap a 7-month streak of gains.</code> | <code>2.4000000953674316</code> |
361
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
362
  ```json
363
  {
 
371
  #### sick/relatedness
372
 
373
  * Dataset: sick/relatedness
374
+ * Size: 13,317 training samples
375
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
376
  * Approximate statistics based on the first 1000 samples:
377
  | | sentence1 | sentence2 | label |
378
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
379
  | type | string | string | float |
380
+ | details | <ul><li>min: 6 tokens</li><li>mean: 12.25 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.11 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.51</li><li>max: 5.0</li></ul> |
381
  * Samples:
382
+ | sentence1 | sentence2 | label |
383
+ |:------------------------------------------------------|:------------------------------------------------------------------------------------|:--------------------------------|
384
+ | <code>A cold cyclist is celebrating</code> | <code>A bike is being held over his head by a bicyclist in a group of people</code> | <code>2.299999952316284</code> |
385
+ | <code>Nobody is cutting a capsicum into pieces</code> | <code>The person is slicing a clove of garlic into pieces</code> | <code>3.0999999046325684</code> |
386
+ | <code>A woman is not cutting shrimps</code> | <code>A man is chopping butter into a container</code> | <code>1.7999999523162842</code> |
387
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
388
  ```json
389
  {
 
397
  #### sts-companion
398
 
399
  * Dataset: [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) at [fd8beff](https://huggingface.co/datasets/tasksource/sts-companion/tree/fd8beffb788df5f6673bc688e6dcbe3690a3acc6)
400
+ * Size: 14,280 training samples
401
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
402
  * Approximate statistics based on the first 1000 samples:
403
+ | | label | sentence1 | sentence2 |
404
+ |:--------|:---------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
405
+ | type | float | string | string |
406
+ | details | <ul><li>min: 0.0</li><li>mean: 3.13</li><li>max: 5.0</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 18.95 tokens</li><li>max: 91 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 17.55 tokens</li><li>max: 269 tokens</li></ul> |
407
  * Samples:
408
+ | label | sentence1 | sentence2 |
409
+ |:-----------------|:-----------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------|
410
+ | <code>4.2</code> | <code>I am calling BS!!! NYTimes: Morsi Says His Slurs of Jews Were Taken Out of Context</code> | <code>Morsi Says Slurs of Jews Were Taken Out of Context</code> |
411
+ | <code>3.0</code> | <code>The driver of the coach tried to avoid it by swerving hard, but still grazed the right side of the lorry.</code> | <code>The driver of the last to try to avoid it through a sudden move, but he fell short by his right side.</code> |
412
+ | <code>5.0</code> | <code>create a mess or disorder</code> | <code>make a mess of or create disorder in.</code> |
413
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
414
  ```json
415
  {
 
423
  #### zero-shot-label-nli
424
 
425
  * Dataset: [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli) at [ee693db](https://huggingface.co/datasets/tasksource/zero-shot-label-nli/tree/ee693dba923b5d5484aa9232b7357c5e45dd39b8)
426
+ * Size: 1,090,333 training samples
427
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
428
  * Approximate statistics based on the first 1000 samples:
429
  | | label | sentence1 | sentence2 |
430
  |:--------|:------------------------------------------------|:------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
431
  | type | int | string | string |
432
+ | details | <ul><li>0: ~50.70%</li><li>1: ~49.30%</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 68.51 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 7.95 tokens</li><li>max: 17 tokens</li></ul> |
433
  * Samples:
434
+ | label | sentence1 | sentence2 |
435
+ |:---------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------|
436
+ | <code>0</code> | <code>Amrozi accused his brother , whom he called " the witness " , of deliberately distorting his evidence .<br>Referring to him as only " the witness " , Amrozi accused his brother of deliberately distorting his evidence .</code> | <code>This example is not_equivalent.</code> |
437
+ | <code>1</code> | <code>Do science and religion conflict with each other?<br>Does science conflict with the Bible?</code> | <code>This example is not_duplicate.</code> |
438
+ | <code>0</code> | <code>do iran and afghanistan speak the same language</code> | <code>This example is False.</code> |
439
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
440
  ```json
441
  {
 
728
  ### Training Hyperparameters
729
  #### Non-Default Hyperparameters
730
 
731
+ - `per_device_train_batch_size`: 360
732
+ - `learning_rate`: 8e-05
733
+ - `weight_decay`: 5e-05
734
  - `num_train_epochs`: 1
735
+ - `warmup_ratio`: 0.03
736
  - `fp16`: True
737
  - `gradient_checkpointing`: True
738
  - `torch_compile`: True
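The non-default values listed above can be reproduced when retraining with `SentenceTransformerTrainingArguments` (a sketch only; the output directory name is a placeholder of ours, and this card does not include the original training script):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Non-default hyperparameters from this card; everything else keeps
# the library defaults. "output" is a hypothetical directory name.
args = SentenceTransformerTrainingArguments(
    output_dir="output",
    per_device_train_batch_size=360,
    learning_rate=8e-05,
    weight_decay=5e-05,
    num_train_epochs=1,
    warmup_ratio=0.03,
    fp16=True,
    gradient_checkpointing=True,
    torch_compile=True,
)
```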
 
745
  - `do_predict`: False
746
  - `eval_strategy`: no
747
  - `prediction_loss_only`: True
748
+ - `per_device_train_batch_size`: 360
749
  - `per_device_eval_batch_size`: 8
750
  - `per_gpu_train_batch_size`: None
751
  - `per_gpu_eval_batch_size`: None
752
  - `gradient_accumulation_steps`: 1
753
  - `eval_accumulation_steps`: None
754
  - `torch_empty_cache_steps`: None
755
+ - `learning_rate`: 8e-05
756
+ - `weight_decay`: 5e-05
757
  - `adam_beta1`: 0.9
758
  - `adam_beta2`: 0.999
759
  - `adam_epsilon`: 1e-08
 
762
  - `max_steps`: -1
763
  - `lr_scheduler_type`: linear
764
  - `lr_scheduler_kwargs`: {}
765
+ - `warmup_ratio`: 0.03
766
  - `warmup_steps`: 0
767
  - `log_level`: passive
768
  - `log_level_replica`: warning
 
866
  ### Training Logs
867
  | Epoch | Step | Training Loss |
868
  |:------:|:-----:|:-------------:|
869
+ | 0.0251 | 500 | 5.0537 |
870
+ | 0.0501 | 1000 | 3.6206 |
871
+ | 0.0752 | 1500 | 3.249 |
872
+ | 0.1003 | 2000 | 3.5885 |
873
+ | 0.1254 | 2500 | 3.2479 |
874
+ | 0.1504 | 3000 | 3.2033 |
875
+ | 0.1755 | 3500 | 2.7123 |
876
+ | 0.2006 | 4000 | 2.8247 |
877
+ | 0.2257 | 4500 | 2.7694 |
878
+ | 0.2507 | 5000 | 3.0215 |
879
+ | 0.2758 | 5500 | 2.6723 |
880
+ | 0.3009 | 6000 | 2.8297 |
881
+ | 0.3259 | 6500 | 2.4046 |
882
+ | 0.3510 | 7000 | 2.2289 |
883
+ | 0.3761 | 7500 | 2.4628 |
884
+ | 0.4012 | 8000 | 2.4032 |
885
+ | 0.4262 | 8500 | 2.5024 |
886
+ | 0.4513 | 9000 | 2.0948 |
887
+ | 0.4764 | 9500 | 2.4389 |
888
+ | 0.5015 | 10000 | 2.4771 |
889
+ | 0.5265 | 10500 | 2.6465 |
890
+ | 0.5516 | 11000 | 2.5892 |
891
+ | 0.5767 | 11500 | 2.3557 |
892
+ | 0.6017 | 12000 | 2.2359 |
893
+ | 0.6268 | 12500 | 2.5839 |
894
+ | 0.6519 | 13000 | 2.4216 |
895
+ | 0.6770 | 13500 | 2.3211 |
896
+ | 0.7020 | 14000 | 2.1171 |
897
+ | 0.7271 | 14500 | 2.1206 |
898
+ | 0.7522 | 15000 | 2.2557 |
899
+ | 0.7773 | 15500 | 2.2815 |
900
+ | 0.8023 | 16000 | 2.0951 |
901
+ | 0.8274 | 16500 | 2.3415 |
902
+ | 0.8525 | 17000 | 2.2792 |
903
+ | 0.8775 | 17500 | 2.3113 |
904
+ | 0.9026 | 18000 | 2.1932 |
905
+ | 0.9277 | 18500 | 2.1134 |
906
+ | 0.9528 | 19000 | 1.9995 |
907
+ | 0.9778 | 19500 | 1.8916 |
908
 
909
 
910
  ### Framework Versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:7658290cf36da3d18ee7ebfc328f9c40bd49d23c22c9bf0cd9cb101c1c526c40
+ oid sha256:3d9729ed5a375cb33fdfe9941bf4032235f8e37c6b27fa88b752ff736b85616b
3
  size 127538496