---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:5489
- loss:MultipleNegativesRankingLoss
base_model: zacbrld/MNLP_M2_document_encoder
widget:
- source_sentence: Military activity affects the physical geology. This was first
    noted through the intensive shelling on the Western Front during World War I,
    which caused the shattering of the bedrock and changed the rocks' permeability.
    New minerals, rocks, and land-forms are also a byproduct of nuclear testing.
  sentences:
  - 'Silicon can form sigma bonds to other silicon atoms (and disilane is the parent
    of this class of compounds). However, it is difficult to prepare and isolate SinH2n+2
    (analogous to the saturated alkane hydrocarbons) with n greater than about 8,
    as their thermal stability decreases with increases in the number of silicon atoms.  Silanes
    higher in molecular weight than disilane decompose to polymeric polysilicon hydride
    and hydrogen.  But with a suitable pair of organic substituents in place of hydrogen
    on each silicon it is possible to prepare polysilanes (sometimes, erroneously
    called polysilenes) that are analogues of alkanes. These long chain compounds
    have surprising electronic properties - high electrical conductivity, for example
    - arising from sigma delocalization of the electrons in the chain.

    Even silicon–silicon pi bonds are possible. However, these bonds are less stable
    than the carbon analogues. Disilane and longer silanes are quite reactive compared
    to alkanes. Disilene and disilynes are quite rare, unlike alkenes and alkynes.
    Examples of disilynes, long thought to be too unstable to be isolated were reported
    in 2004.'
  - 'The increasing sophistication of brain-reading technologies has led many to investigate
    their potential applications for lie detection. Legally required brain scans arguably
    violate “the guarantee against self-incrimination” because they differ from acceptable
    forms of bodily evidence, such as fingerprints or blood samples, in an important
    way: they are not simply physical, hard evidence, but evidence that is intimately
    linked to the defendant''s mind. Under US law, brain-scanning technologies might
    also raise implications for the Fourth Amendment, calling into question whether
    they constitute an unreasonable search and seizure.'
  - Military activity affects the physical geology. This was first noted through the
    intensive shelling on the Western Front during World War I, which caused the shattering
    of the bedrock and changed the rocks' permeability. New minerals, rocks, and land-forms
    are also a byproduct of nuclear testing.
- source_sentence: Right after a bombing in Moscow on September 6, 1999, several anti-nuclear
    activists were detained under suspicion. Vladimir Slivyak was one of the three
    arrested under suspicion. He was an activist in the anti-nuclear movement and
    a Voronezh action camp organizer. After the bombing Slivyak was pushed into a
    car by several men who claimed to be Moscow police. The police interrogated and
    threatened Slivyak for around ninety minutes before letting him go. The Moscow
    police thought environmentalists from the anti-nuclear movement were associated
    with the bombing since an earlier bombing occurred on August 31 at Manezh Palace
    in Moscow . After the incident, on August 31, several more bombings occurred which
    agitated many people, leading to the racially profiled arrest of dark-skinned
    Muscovites and visitors to the Russian capital.
  sentences:
  - The technique works backwards from the target to identify a precursor molecule
    and an enzyme that converts it into the target, and then a second precursor that
    can produce the first and so on until a simple, inexpensive molecule becomes the
    beginning of the series. For each precursor, the enzyme is evolved using induced
    mutations and natural selection to produce a more productive version. The evolutionary
    process can be repeated over multiple generations until acceptable productivity
    is achieved. The process does not require high temperature, high pressure, the
    use of exotic catalysts or other elements that can increase costs. The enzyme
    "optimizations" that increase the production of one precursor from another are
    cumulative in that the same precursor productivity improvements can potentially
    be leveraged across multiple target molecules.
  - Right after a bombing in Moscow on September 6, 1999, several anti-nuclear activists
    were detained under suspicion. Vladimir Slivyak was one of the three arrested
    under suspicion. He was an activist in the anti-nuclear movement and a Voronezh
    action camp organizer. After the bombing Slivyak was pushed into a car by several
    men who claimed to be Moscow police. The police interrogated and threatened Slivyak
    for around ninety minutes before letting him go. The Moscow police thought environmentalists
    from the anti-nuclear movement were associated with the bombing since an earlier
    bombing occurred on August 31 at Manezh Palace in Moscow . After the incident,
    on August 31, several more bombings occurred which agitated many people, leading
    to the racially profiled arrest of dark-skinned Muscovites and visitors to the
    Russian capital.
  - One of the main sources of information about the Earth's composition comes from
    understanding the relationship between peridotite and basalt melting. Peridotite
    makes up most of Earth's mantle. Basalt, which is highly concentrated in the Earth's
    oceanic crust, is formed when magma reaches the Earth's surface and cools down
    at a very fast rate. When magma cools, different minerals crystallize at different
    times depending on the cooling temperature of that respective mineral. This ultimately
    changes the chemical composition of the melt as different minerals begin to crystallize.
    Fractional crystallization of elements in basaltic liquids has also been studied
    to observe the composition of lava in the upper mantle. This concept can be applied
    by scientists to give insight on the evolution of Earth's mantle and how concentrations
    of lithophile trace elements have varied over the last 3.5 billion years.
- source_sentence: 'The group designs numerous structural concepts such as frameworks
    and floors like Dalle O''Portune and D-Dalle.

    The timber design office of excellence is an entity specializing in the design
    and optimization of wood construction projects. It stands out for its ability
    to meet the highest demands in terms of performance, durability and aesthetics,
    and is thus recognized for its contribution to the realization of ambitious projects
    in the field of timber construction.'
  sentences:
  - 'The group designs numerous structural concepts such as frameworks and floors
    like Dalle O''Portune and D-Dalle.

    The timber design office of excellence is an entity specializing in the design
    and optimization of wood construction projects. It stands out for its ability
    to meet the highest demands in terms of performance, durability and aesthetics,
    and is thus recognized for its contribution to the realization of ambitious projects
    in the field of timber construction.'
  - 'In waterways, the term bridge strike may be used when a water vessel collides
    with a bridge. This may include a collision to the bridge span or a collision
    to the bridge support structure such as a pier. Bridge protection systems are
    used to mitigate the effects of a ship strike.

    In 2014, the United States Coast Guard published statistics that it investigated
    205 bridge strikes in the eleven years prior to the publication. All of those
    collisions involved involved a fixed, swing, lift or draw bridge. That number
    was 1.2% of all vessel collision incidents investigated by the Coast Guard. The
    primary causal factor was the lack of accurate air draft data, the distance between
    water surface to the top most part of the vessel.'
  - 'Post, Stephen Garrard. Encyclopedia of bioethics. Third edition. Macmillan Reference
    USA, 2003. ISBN 0028657748. ISSN 0950-4125; DOI:10.1108/09504120510573477.  (5-Volume
    Set; 3062 pages).

    Reich, Warren Thomas Encyclopedia of Bioethics. First edition.  New York: Free
    Press, 1978.  ISBN 0029261805.  ISBN 978-0029261804.  (4-Volume Set; 1933 pages)

    Reich, Warren Thomas Encyclopedia of Bioethics. Second edition.  New York: Free
    Press, 1982.  (5-Volume Set; 2950 pages)

    Reich, Warren Thomas Encyclopedia of Bioethics. Third edition.  New York: Simon
    & Schuster Macmillan, 1995; London: Simon and Schuster and Prentice Hall International,
    c1995. Rev. ed. (5-Volume Set; 2950 pages; 464 articles) ISBN 0028973550. ISBN
    978-0028973555.'
- source_sentence: 'Regression is used to make predictions based on the retrieved
    data through statistical trends and statistical modeling. Different uses of this
    technique are used for fetching Photometric redshifts and measurements of physical
    parameters of stars. The approaches are listed below:


    Artificial neural network (ANN)

    Support vector regression (SVR)

    Decision tree

    Random forest

    k-nearest neighbors regression

    Kernel regression

    Principal component regression (PCR)

    Gaussian process

    Least squared regression (LSR)

    Partial least squares regression'
  sentences:
  - 'Regression is used to make predictions based on the retrieved data through statistical
    trends and statistical modeling. Different uses of this technique are used for
    fetching Photometric redshifts and measurements of physical parameters of stars.
    The approaches are listed below:


    Artificial neural network (ANN)

    Support vector regression (SVR)

    Decision tree

    Random forest

    k-nearest neighbors regression

    Kernel regression

    Principal component regression (PCR)

    Gaussian process

    Least squared regression (LSR)

    Partial least squares regression'
  - 'Clandestine chemistry is not limited to drugs; it is also associated with explosives,
    and other illegal chemicals. Of the explosives manufactured illegally, nitroglycerin
    and acetone peroxide are easiest to produce due to the ease with which the precursors
    can be acquired.

    Uncle Fester is a writer who commonly writes about different aspects of clandestine
    chemistry. Secrets of Methamphetamine Manufacture is among his most popular books,
    and is considered required reading for DEA agents. More of his books deal with
    other aspects of clandestine chemistry, including explosives, and poisons. Fester
    is, however, considered by many to be a faulty and unreliable source for information
    in regard to the clandestine manufacture of chemicals.'
  - A novel input representation has been developed consisting of a combination of
    sparse encoding, Blosum encoding, and input derived from hidden Markov models.
    this method predicts T-cell epitopes for the genome of hepatitis C virus and discuss
    possible applications of the prediction method to guide the process of rational
    vaccine design.
- source_sentence: 'Burray and The Barriers

    Undiscovered Scotland: The Churchill Barriers

    Our Past History: The Churchill Barriers Archived 17 December 2006 at the Wayback
    Machine

    Okneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback Machine'
  sentences:
  - "For a neuron, in the limit of \n  \n    \n      \n        b\n        =\n    \
    \    0\n      \n    \n    {\\displaystyle b=0}\n  \n, the map becomes 1D, since\
    \ \n  \n    \n      \n        y\n      \n    \n    {\\displaystyle y}\n  \n converges\
    \ to a constant. If the parameter \n  \n    \n      \n        b\n      \n    \n\
    \    {\\displaystyle b}\n  \n is scanned in a range, different orbits will be\
    \ seen, some periodic, others chaotic, that appear between two fixed points, one\
    \ at \n  \n    \n      \n        x\n        =\n        1\n      \n    \n    {\\\
    displaystyle x=1}\n  \n ; \n  \n    \n      \n        y\n        =\n        1\n\
    \      \n    \n    {\\displaystyle y=1}\n  \n and the other close to the value\
    \ of \n  \n    \n      \n        k\n      \n    \n    {\\displaystyle k}\n  \n\
    \ (which would be the regime excitable).\n\n\n== References =="
  - 'Cerebellar Purkinje neurons have been proposed to have two distinct bursting
    modes: dendritically driven, by dendritic Ca2+ spikes, and somatically driven,
    wherein the persistent Na+ current is the burst initiator and the SK K+ current
    is the burst terminator. Purkinje neurons may utilise these bursting forms in
    information coding to the deep cerebellar nuclei.'
  - 'Burray and The Barriers

    Undiscovered Scotland: The Churchill Barriers

    Our Past History: The Churchill Barriers Archived 17 December 2006 at the Wayback
    Machine

    Okneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback Machine'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on zacbrld/MNLP_M2_document_encoder

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [zacbrld/MNLP_M2_document_encoder](https://huggingface.co/zacbrld/MNLP_M2_document_encoder). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [zacbrld/MNLP_M2_document_encoder](https://huggingface.co/zacbrld/MNLP_M2_document_encoder) <!-- at revision 0256ba97b154a34e25bfdf236061c0fdb0c5d146 -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
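
The pooling and normalization stages listed above can be illustrated with a minimal, dependency-free sketch. It assumes toy 4-dimensional token vectors in place of the real 384-dimensional `BertModel` outputs, and the helper names (`mean_pool`, `l2_normalize`, `dot`) are hypothetical, not part of the library.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a normalization layer does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy token vectors (assumption: 4 dimensions instead of 384).
tokens_a = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask_a = [1, 1, 0]  # the third token is padding and must not affect the mean
tokens_b = [[0.0, 1.0, 1.0, 0.0], [2.0, 1.0, 1.0, 0.0]]
mask_b = [1, 1]

emb_a = l2_normalize(mean_pool(tokens_a, mask_a))
emb_b = l2_normalize(mean_pool(tokens_b, mask_b))

# For unit-length embeddings the dot product is the cosine similarity.
cosine = dot(emb_a, emb_b)
```

Because the last module normalizes every embedding, a plain dot product between two outputs gives the same value as cosine similarity.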

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("zacbrld/MNLP_M2_document_encoder")
# Run inference
sentences = [
    'Burray and The Barriers\nUndiscovered Scotland: The Churchill Barriers\nOur Past History: The Churchill Barriers Archived 17 December 2006 at the Wayback Machine\nOkneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback Machine',
    'Burray and The Barriers\nUndiscovered Scotland: The Churchill Barriers\nOur Past History: The Churchill Barriers Archived 17 December 2006 at the Wayback Machine\nOkneypics.com: photos of the barrier Archived 15 May 2008 at the Wayback Machine',
    'Cerebellar Purkinje neurons have been proposed to have two distinct bursting modes: dendritically driven, by dendritic Ca2+ spikes, and somatically driven, wherein the persistent Na+ current is the burst initiator and the SK K+ current is the burst terminator. Purkinje neurons may utilise these bursting forms in information coding to the deep cerebellar nuclei.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 5,489 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0                                                                           | sentence_1                                                                           |
  |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
  | type    | string                                                                               | string                                                                               |
  | details | <ul><li>min: 34 tokens</li><li>mean: 144.23 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 34 tokens</li><li>mean: 144.23 tokens</li><li>max: 256 tokens</li></ul> |
* Samples:
  | sentence_0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | sentence_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>In related work, Smoller, Temple, and Vogler propose that this shockwave may have resulted in our part of the universe having a lower density than that surrounding it, causing the accelerated expansion normally attributed to dark energy.  <br>They also propose that this related theory could be tested: a universe with dark energy should give a figure for the cubic correction to redshift versus luminosity C = −0.180 at a = a whereas for Smoller, Temple, and Vogler's alternative C should be positive rather than negative. They give a more precise calculation for their wave model alternative as: the cubic correction to redshift versus luminosity at a = a is C = 0.359.</code>                                                                                                                                                                      | <code>In related work, Smoller, Temple, and Vogler propose that this shockwave may have resulted in our part of the universe having a lower density than that surrounding it, causing the accelerated expansion normally attributed to dark energy.  <br>They also propose that this related theory could be tested: a universe with dark energy should give a figure for the cubic correction to redshift versus luminosity C = −0.180 at a = a whereas for Smoller, Temple, and Vogler's alternative C should be positive rather than negative. They give a more precise calculation for their wave model alternative as: the cubic correction to redshift versus luminosity at a = a is C = 0.359.</code>                                                                                                                                                                      |
  | <code>Evolution is a central organizing concept in biology. It is the change in heritable characteristics of populations over successive generations. In artificial selection, animals were selectively bred for specific traits.<br> Given that traits are inherited, populations contain a varied mix of traits, and reproduction is able to increase any population, Darwin argued that in the natural world, it was nature that played the role of humans in selecting for specific traits. Darwin inferred that individuals who possessed heritable traits better adapted to their environments are more likely to survive and produce more offspring than other individuals. He further inferred that this would lead to the accumulation of favorable traits over successive generations, thereby increasing the match between the organisms and their environment.</code> | <code>Evolution is a central organizing concept in biology. It is the change in heritable characteristics of populations over successive generations. In artificial selection, animals were selectively bred for specific traits.<br> Given that traits are inherited, populations contain a varied mix of traits, and reproduction is able to increase any population, Darwin argued that in the natural world, it was nature that played the role of humans in selecting for specific traits. Darwin inferred that individuals who possessed heritable traits better adapted to their environments are more likely to survive and produce more offspring than other individuals. He further inferred that this would lead to the accumulation of favorable traits over successive generations, thereby increasing the match between the organisms and their environment.</code> |
  | <code>The total number of engineers employed in the U.S. in 2015 was roughly 1.6 million. Of these, 278,340 were mechanical engineers (17.28%), the largest discipline by size. In 2012, the median annual income of mechanical engineers in the U.S. workforce was $80,580. The median income was highest when working for the government ($92,030), and lowest in education ($57,090). In 2014, the total number of mechanical engineering jobs was projected to grow 5% over the next decade. As of 2009, the average starting salary was $58,800 with a bachelor's degree.</code>                                                                                                                                                                                                                                                                                             | <code>The total number of engineers employed in the U.S. in 2015 was roughly 1.6 million. Of these, 278,340 were mechanical engineers (17.28%), the largest discipline by size. In 2012, the median annual income of mechanical engineers in the U.S. workforce was $80,580. The median income was highest when working for the government ($92,030), and lowest in education ($57,090). In 2014, the total number of mechanical engineering jobs was projected to grow 5% over the next decade. As of 2009, the average starting salary was $58,800 with a bachelor's degree.</code>                                                                                                                                                                                                                                                                                             |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
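
As a rough illustration of what this loss computes with `scale` = 20.0 and cosine similarity, the sketch below treats every other positive in the batch as a negative and applies cross-entropy over the scaled similarities. It is a simplified pure-Python sketch with hypothetical helper names and toy 2-dimensional vectors, not the library's implementation.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch (assumption: 2-dimensional embeddings, batch of 3).
anchors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

# Each anchor paired with an identical positive: the loss is close to zero.
identical_loss = mnr_loss(anchors, anchors)

# Positives rotated so every anchor is paired with the wrong text: the loss is large.
shuffled_loss = mnr_loss(anchors, [anchors[1], anchors[2], anchors[0]])
```

In the samples shown above, `sentence_0` and `sentence_1` are identical, so each pair resembles the `identical_loss` case. This may explain the near-zero training loss in the logs.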

### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 5
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>
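
With `lr_scheduler_type` set to linear, `learning_rate` at 5e-05 and no warmup, the learning rate would decay linearly from its initial value to zero over the run. The function below is a sketch of that schedule. The name `linear_lr` is hypothetical, and the total of 1720 steps is an assumption derived from 5 epochs of 344 steps each (5,489 samples, batch size 16, single device).

```python
def linear_lr(step, base_lr=5e-05, total_steps=1720, warmup_steps=0):
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    total_steps=1720 is an assumed value (5 epochs x 344 steps), not one
    reported by the training run.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rates at the start, at the three logged steps, and at the end.
schedule = {step: linear_lr(step) for step in (0, 500, 1000, 1500, 1720)}
```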

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 1.4535 | 500  | 0.0002        |
| 2.9070 | 1000 | 0.0           |
| 4.3605 | 1500 | 0.0007        |
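
The fractional epoch values in this table follow from the dataset size and batch size. The short calculation below reproduces them, assuming a single device, `gradient_accumulation_steps` of 1, and a final partial batch that is kept (`dataloader_drop_last` is False).

```python
import math

train_samples = 5489   # size of the training dataset
batch_size = 16        # per_device_train_batch_size
num_epochs = 5         # num_train_epochs

# Assumption: one device, so one optimizer step consumes one batch of 16.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)

# Epoch values at the three logged steps.
logged_epochs = [epoch_at(step) for step in (500, 1000, 1500)]
```

Under these assumptions there are 344 steps per epoch and 1,720 steps in total, so the three logged rows fall at roughly 1.45, 2.91 and 4.36 epochs.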


### Framework Versions
- Python: 3.10.11
- Sentence Transformers: 3.4.1
- Transformers: 4.51.3
- PyTorch: 2.6.0
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->