---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:34235
- loss:MultipleNegativesRankingLoss
base_model: zacbrld/MNLP_M2_document_encoder
widget:
- source_sentence: What is This?
sentences:
- Bellcranks are also seen in automotive applications, such as in the linkage connecting
the throttle pedal to the carburetor or connecting the brake pedal to the master
cylinder In vehicle suspensions, bellcranks are used in pullrod and pushrod suspensions
in cars or in the Christie suspension in tanks More vertical suspension designs
such as MacPherson struts may not be feasible in some vehicle designs due to space,
aerodynamic, or other design constraints; bellcranks translate the vertical motion
of the wheel into horizontal motion, allowing the suspension to be mounted transversely
or longitudinally within the vehicle
- DynaMo was also used as the face of the BBC's parental assistance website This
was created for parents to assist children with homework There was also a section
called "DynaMo's Den" which included educational games for children The website
was activated on 2 October 1998
- "The diode equation above is an example of an element constitutive equation of\
\ the general form,\n\n \n \n \n f\n (\n v\n \
\ ,\n i\n )\n =\n 0\n \n \n {\\displaystyle\
\ f(v,i)=0}\n \n\nThis can be thought of as a non-linear resistor The corresponding\
\ constitutive equations for non-linear inductors and capacitors are respectively;\n\
\n \n \n \n f\n (\n v\n ,\n φ\n \
\ )\n =\n 0\n \n \n {\\displaystyle f(v,\\varphi\
\ )=0}\n \n\n \n \n \n f\n (\n v\n ,\n \
\ q\n )\n =\n 0\n \n \n {\\displaystyle\
\ f(v,q)=0}\n \n\nwhere f is any arbitrary function, φ is the stored magnetic\
\ flux and q is the stored charge"
- source_sentence: algorithm explanation
sentences:
- 'Descriptive statistics
Average
Mean
Median
Mode
Measures of scale
Variance
Standard deviation
Median absolute deviation
Correlation
Polychoric correlation
Outlier
Statistical graphics
Histogram
Frequency distribution
Quantile
Survival function
Failure rate
Scatter plot
Bar chart'
- 'The various fields and topics that projects engineers are involved with include:
Work breakdown structure: a deliverable-oriented breakdown of a project into smaller
components
Gantt chart: type of bar chart that illustrates a project schedule
Critical Path Analysis: an algorithm for scheduling a set of project activities
Program evaluation and review technique: a statistical tool which was designed
to analyze and represent the tasks involved in completing a given project
Graphical Evaluation and Review Technique: network analysis technique that allows
probabilistic treatment both network logic and estimation of activity duration
Petri Nets: one of several mathematical modeling languages for the description
of distributed systems'
- Jessiko was marketed as a luxury decoration for businesses such as hotels, restaurants,
and museums Tiraby expressed hope that one day it would be common to find his
invention in household ponds and swimming pools
- source_sentence: 'The firm was founded as SECOR Ltd in 1994 by John Leeson, Alan
Sheppard, and David Richards After establishing the company in Oxford, United
Kingdom, in 1994, David oversaw the growth of the business from a small UK operator
into an environmental consultancies in the UK, with international operations across
Africa, Australasia, Canada, Europe, and the US
In 2000, the senior management team completed a management buyout and the company''s
name was changed to SLR Consulting Limited In 2004 they secured funding from Livingbridge,
who invested £4 85 million as part of a £13 million investment including other
partners, and took a significant minority stake in the company In 2008, 3i invested
£32 5 million in the firm, and replaced Livingbridge with a significant minority
stake In March 2018, Charterhouse Capital Partners (CCP) acquired a majority shareholding
in the business In June 2022 Charterhouse Capital Partners agreed to a sale of
SLR Consulting to Ares Management private equity partners David Richards was Chief
Executive Officer from 1994–2013 In line with the Group''s succession plans, Neil
Penhall, formerly Managing Director of SLR Consulting and an Executive Director
of SLR Management, assumed the role of CEO'
sentences:
- 'Institute for Transuranium Elements (ITU)
Institute for the Protection and the Security of the Citizen (IPSC)
Institute for Environment and Sustainability (IES)
Institute for Health and Consumer Protection (IHCP)
Institute for Energy (IE)
Institute for Prospective Technological Studies (IPTS)'
- Project NExT was founded by James (Jim) Leitzel (Ohio State University) and Chris
Stevens (Saint Louis University) The first fellows were selected in 1994 Jim Leitzel
died in 1998, and Aparna Higgins (University of Dayton) and Joe Gallian (University
of Minnesota Duluth) became co-directors of Project NExT Chris Stevens stepped
down as director in 2010, and was succeeded by Aparna Higgins and Joe Gallian
Judith Covington (Louisiana State University, Shreveport) and Gavin LaRose (University
of Michigan) first served as Associate Co-Directors and later became Co-Directors
In 2007, the total number of fellows surpassed 1000 By 2017 the total number of
fellows reached 1700 In 2023 Christine Kelley became director
- Quantum secure communication is a method that is expected to be 'quantum safe'
in the advent of quantum computing systems that could break current cryptography
systems using methods such as Shor's algorithm These methods include quantum key
distribution (QKD), a method of transmitting information using entangled light
in a way that makes any interception of the transmission obvious to the user Another
method is the quantum random number generator, which is capable of producing truly
random numbers unlike non-quantum algorithms that merely imitate randomness
- source_sentence: chemical reaction
sentences:
- With suitably encoded scales (multitrack, vernier, digital code, or pseudo-random
code) an encoder can determine its position without movement or needing to find
a reference position Such absolute encoders also communicate using serial communication
protocols Many of these protocols are proprietary (e g , Fanuc, Mitsubishi, FeeDat
(Fagor Automation), Heidenhain EnDat, DriveCliq, Panasonic, Yaskawa) but open
standards such as BiSS are now appearing, which avoid tying users to a particular
supplier
- Bonneau, Pierre; Allens, Gaspard d' (2020) Cent mille ans Bure ou le scandale
enfoui des déchets nucléaires [One hundred thousand years Bure, or the buried
scandal of nuclear waste] Illustrated by Cécile Guillard La Revue dessinée - Seuil
ISBN 978-2-02-145982-1
- The reason why MACE is heavily researched is that it allows completely anisotropic
etching of silicon substrates which is not possible with other wet chemical etching
methods (see figure to the right) Usually the silicon substrate is covered with
a protective layer such as photoresist before it is immersed in an etching solution
The etching solution usually has no preferred direction of attacking the substrate,
therefore isotropic etching takes place In semiconductor engineering, however
it is often required that the sidewalls of the etched trenches are steep This
is usually realized with methods that operate in the gas-phase such as reactive
ion etching These methods require expensive equipment compared to simple wet etching
MACE, in principle allows the fabrication of steep trenches but is still cheap
compared to gas-phase etching methods
- source_sentence: synthesis method
sentences:
- STEMNET used to receive funding from the Department for Education and Skills Since
June 2007, it receives funding from the Department for Children, Schools and Families
and Department for Innovation, Universities and Skills, since STEMNET sits on
the chronological dividing point (age 16) of both of the new departments
- The Arab States of the Persian Gulf plan to start their own joint civilian nuclear
program An agreement in the final days of the Bush administration provided for
cooperation between the United Arab Emirates and the United States of America
in which the United States would sell the UAE nuclear reactors and nuclear fuel
The UAE would, in return, renounce their right to enrich uranium for their civilian
nuclear program At the time of signing, this agreement was touted as a way to
reduce risks of nuclear proliferation in the Persian Gulf However, Mustafa Alani
of the Dubai-based Gulf Research Center stated that, should the Nuclear Non-Proliferation
Treaty collapse, nuclear reactors such as those slated to be sold to the UAE under
this agreement could provide the UAE with a path toward a nuclear weapon, raising
the specter of further nuclear proliferation In March 2007, foreign ministers
of the six-member Gulf Cooperation Council met in Saudi Arabia to discuss progress
in plans agreed in December 2006, for a joint civilian nuclear program
- Timber framing dates back thousands of years, and has been used in many parts
of the world during various periods such as ancient Japan, Europe and medieval
England in localities where timber was in good supply and building stone and the
skills to work it were not The use of timber framing in buildings provides their
complete skeletal framing which offers some structural benefits as the timber
frame, if properly engineered, lends itself to better seismic survivability
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# SentenceTransformer based on zacbrld/MNLP_M2_document_encoder
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [zacbrld/MNLP_M2_document_encoder](https://huggingface.co/zacbrld/MNLP_M2_document_encoder). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [zacbrld/MNLP_M2_document_encoder](https://huggingface.co/zacbrld/MNLP_M2_document_encoder) <!-- at revision 6f1d702dcb1d5e9fd30b691c84fadd9a1704a148 -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
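The `Pooling` and `Normalize` modules listed above can be illustrated outside the library. The following is a minimal NumPy sketch, not the library's implementation: the function name `mean_pool_and_normalize` is hypothetical, and random vectors stand in for the token embeddings that the `BertModel` would produce.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average token vectors over non-padding positions, then L2-normalize."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input (assumption): 2 sequences, 4 token positions, 384-dimensional token vectors.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 384))
attention_mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])  # 0 marks padding

sentence_embeddings = mean_pool_and_normalize(token_embeddings, attention_mask)
print(sentence_embeddings.shape)
```

Because the final step scales every vector to unit length, the cosine similarity of two outputs equals their dot product.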
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("zacbrld/MNLP_M3_document_encoder_V1")
# Run inference
sentences = [
'synthesis method',
'STEMNET used to receive funding from the Department for Education and Skills Since June 2007, it receives funding from the Department for Children, Schools and Families and Department for Innovation, Universities and Skills, since STEMNET sits on the chronological dividing point (age 16) of both of the new departments',
'The Arab States of the Persian Gulf plan to start their own joint civilian nuclear program An agreement in the final days of the Bush administration provided for cooperation between the United Arab Emirates and the United States of America in which the United States would sell the UAE nuclear reactors and nuclear fuel The UAE would, in return, renounce their right to enrich uranium for their civilian nuclear program At the time of signing, this agreement was touted as a way to reduce risks of nuclear proliferation in the Persian Gulf However, Mustafa Alani of the Dubai-based Gulf Research Center stated that, should the Nuclear Non-Proliferation Treaty collapse, nuclear reactors such as those slated to be sold to the UAE under this agreement could provide the UAE with a path toward a nuclear weapon, raising the specter of further nuclear proliferation In March 2007, foreign ministers of the six-member Gulf Cooperation Council met in Saudi Arabia to discuss progress in plans agreed in December 2006, for a joint civilian nuclear program',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
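For semantic search, the similarity scores are typically used to rank passages against a query. The sketch below shows that ranking step on precomputed vectors. The helper `rank_passages` is hypothetical, and the toy 3-dimensional arrays stand in for the 384-dimensional output of `model.encode(...)`.

```python
import numpy as np

def rank_passages(query_embedding, passage_embeddings, top_k=3):
    """Return (index, cosine similarity) pairs for the top_k passages, best first."""
    q = query_embedding / np.linalg.norm(query_embedding)
    p = passage_embeddings / np.linalg.norm(passage_embeddings, axis=1, keepdims=True)
    scores = p @ q                       # cosine similarity of each passage to the query
    order = np.argsort(-scores)[:top_k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]

# Toy vectors (assumption) standing in for model.encode(...) output.
query = np.array([1.0, 0.0, 0.0])
passages = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # close to the query
    [-1.0, 0.0, 0.0],  # opposite direction
])
ranking = rank_passages(query, passages)
print(ranking[0][0])  # index of the best-matching passage
```

With real embeddings, the first element of `sentences` in the example above would play the role of `query` and the remaining elements the role of `passages`.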
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 34,235 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 |
|:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 3 tokens</li><li>mean: 21.24 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 34 tokens</li><li>mean: 133.62 tokens</li><li>max: 256 tokens</li></ul> |
* Samples:
| sentence_0 | sentence_1 |
|:----------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>chemistry experiment</code> | <code>Since 1982, research has been conducted to develop technologies, commonly referred to as electronic noses, that could detect and recognize odors and flavors Application areas include food, medicine and the environment</code> |
| <code>quantum physics</code> | <code>Hydro electric - Hydro-electric turbomachinery uses potential energy stored in water to flow over an open impeller to turn a generator which creates electricity<br>Steam turbines - Steam turbines used in power generation come in many different variations The overall principle is high pressure steam is forced over blades attached to a shaft, which turns a generator As the steam travels through the turbine, it passes through smaller blades causing the shaft to spin faster, creating more electricity Gas turbines - Gas turbines work much like steam turbines Air is forced in through a series of blades that turn a shaft Then fuel is mixed with the air and causes a combustion reaction, increasing the power This then causes the shaft to spin faster, creating more electricity Windmills - Also known as a wind turbine, windmills are increasing in popularity for their ability to efficiently use the wind to generate electricity Although they come in many shapes and sizes, the most common one is the la...</code> |
| <code>physics law</code> | <code>Backlash in gear couplings allows for slight angular misalignment There can be significant backlash in unsynchronized transmissions because of the intentional gap between the dogs in dog clutches The gap is necessary to engage dogs when input shaft (engine) speed and output shaft (driveshaft) speed are imperfectly synchronized If there was a smaller clearance, it would be nearly impossible to engage the gears because the dogs would interfere with each other in most configurations In synchronized transmissions, synchromesh solves this problem However, backlash is undesirable in precision positioning applications such as machine tool tables It can be minimized by choosing ball screws or leadscrews with preloaded nuts, and mounting them in preloaded bearings A preloaded bearing uses a spring and/or a second bearing to provide a compressive axial force that maintains bearing surfaces in contact despite reversal of the load direction</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
```
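To make the two parameters concrete: within a batch, each `sentence_0` is scored against every `sentence_1`, the cosine similarities are multiplied by `scale`, and a cross-entropy loss treats the matching pair as the correct class and all other passages in the batch as negatives. The following NumPy sketch illustrates that idea under these assumptions. It is not the library's PyTorch implementation, and `mnr_loss` is a hypothetical name.

```python
import numpy as np

def mnr_loss(anchor_emb, positive_emb, scale=20.0):
    """In-batch softmax cross-entropy over scaled cosine similarities."""
    a = anchor_emb / np.linalg.norm(anchor_emb, axis=1, keepdims=True)
    p = positive_emb / np.linalg.norm(positive_emb, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                           # (batch, batch); diagonal = true pairs
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 384))          # batch of 8 random stand-in embeddings
aligned = mnr_loss(anchors, anchors)         # every anchor matches its own positive
shuffled = mnr_loss(anchors, anchors[::-1])  # positives deliberately mismatched
print(aligned < shuffled)
```

A larger `scale` sharpens the softmax, so a correct pair needs a smaller cosine margin over the in-batch negatives to drive the loss toward zero.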
### Training Hyperparameters
#### Non-Default Hyperparameters
- `num_train_epochs`: 2
- `multi_dataset_batch_sampler`: round_robin
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
</details>
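With `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05`, the learning rate would decay in a straight line from its initial value to zero over the run. The sketch below illustrates that shape. `linear_lr` is a hypothetical helper rather than the Transformers scheduler itself, and the total step count assumes a single training device, which the card does not state.

```python
import math

def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Assumption: one device, so each optimizer step consumes 8 samples.
steps_per_epoch = math.ceil(34235 / 8)
total_steps = 2 * steps_per_epoch  # num_train_epochs = 2

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```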
### Training Logs
| Epoch | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.1168 | 500 | 1.465 |
| 0.2336 | 1000 | 1.189 |
| 0.3505 | 1500 | 1.1209 |
| 0.4673 | 2000 | 1.0333 |
| 0.5841 | 2500 | 0.993 |
| 0.7009 | 3000 | 0.9573 |
| 0.8178 | 3500 | 0.9275 |
| 0.9346 | 4000 | 0.9177 |
| 1.0514 | 4500 | 0.8241 |
| 1.1682 | 5000 | 0.7726 |
| 1.2850 | 5500 | 0.7685 |
| 1.4019 | 6000 | 0.7623 |
| 1.5187 | 6500 | 0.7668 |
| 1.6355 | 7000 | 0.7556 |
| 1.7523 | 7500 | 0.7002 |
| 1.8692 | 8000 | 0.7363 |
| 1.9860 | 8500 | 0.7396 |
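The Epoch column can be cross-checked against the dataset size and batch size reported above. This short sketch assumes a single device and that the final partial batch is kept (`dataloader_drop_last: False`). Under those assumptions it reproduces the logged value of 0.1168 at step 500.

```python
import math

num_samples = 34235
batch_size = 8
steps_per_epoch = math.ceil(num_samples / batch_size)  # last partial batch is kept

def epoch_at(step):
    """Fraction of epochs completed after the given optimizer step."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch)
print(epoch_at(500), epoch_at(4500), epoch_at(8500))
```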
### Framework Versions
- Python: 3.12.8
- Sentence Transformers: 3.4.1
- Transformers: 4.52.2
- PyTorch: 2.7.0+cu126
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->