tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:34235
  - loss:MultipleNegativesRankingLoss
base_model: zacbrld/MNLP_M2_document_encoder
widget:
  - source_sentence: What is This?
    sentences:
      - >-
        Bellcranks are also seen in automotive applications, such as in the
        linkage connecting the throttle pedal to the carburetor or connecting
        the brake pedal to the master cylinder In vehicle suspensions,
        bellcranks are used in pullrod and pushrod suspensions in cars or in the
        Christie suspension in tanks More vertical suspension designs such as
        MacPherson struts may not be feasible in some vehicle designs due to
        space, aerodynamic, or other design constraints; bellcranks translate
        the vertical motion of the wheel into horizontal motion, allowing the
        suspension to be mounted transversely or longitudinally within the
        vehicle
      - >-
        DynaMo was also used as the face of the BBC's parental assistance
        website This was created for parents to assist children with homework
        There was also a section called "DynaMo's Den" which included
        educational games for children The website was activated on 2 October
        1998
      - >-
        The diode equation above is an example of an element constitutive
        equation of the general form,

        $f(v,i)=0$

        This can be thought of as a non-linear resistor The corresponding
        constitutive equations for non-linear inductors and capacitors are
        respectively;

        $f(v,\varphi)=0$

        $f(v,q)=0$

        where f is any arbitrary function, φ is the stored magnetic flux and q
        is the stored charge
  - source_sentence: algorithm explanation
    sentences:
      - |-
        Descriptive statistics
        Average
        Mean
        Median
        Mode
        Measures of scale
        Variance
        Standard deviation
        Median absolute deviation
        Correlation
        Polychoric correlation
        Outlier
        Statistical graphics
        Histogram
        Frequency distribution
        Quantile
        Survival function
        Failure rate
        Scatter plot
        Bar chart
      - >-
        The various fields and topics that projects engineers are involved with
        include:


        Work breakdown structure: a deliverable-oriented breakdown of a project
        into smaller components

        Gantt chart:  type of bar chart that illustrates a project schedule

        Critical Path Analysis: an algorithm for scheduling a set of project
        activities

        Program evaluation and review technique: a statistical tool which was
        designed to analyze and represent the tasks involved in completing a
        given project

        Graphical Evaluation and Review Technique: network analysis technique
        that allows probabilistic treatment both network logic and estimation of
        activity duration

        Petri Nets: one of several mathematical modeling languages for the
        description of distributed systems
      - >-
        Jessiko was marketed as a luxury decoration for businesses such as
        hotels, restaurants, and museums Tiraby expressed hope that one day it
        would be common to find his invention in household ponds and swimming
        pools
  - source_sentence: >-
      The firm was founded as SECOR Ltd in 1994 by John Leeson, Alan Sheppard,
      and David Richards After establishing the company in Oxford, United
      Kingdom, in 1994, David oversaw the growth of the business from a small UK
      operator into an environmental consultancies in the UK, with international
      operations across Africa, Australasia, Canada, Europe, and the US

      In 2000, the senior management team completed a management buyout and the
      company's name was changed to SLR Consulting Limited In 2004 they secured
      funding from Livingbridge, who invested £4 85 million as part of a £13
      million investment including other partners, and took a significant
      minority stake in the company In 2008, 3i invested £32 5 million in the
      firm, and replaced Livingbridge with a significant minority stake In March
      2018, Charterhouse Capital Partners (CCP) acquired a majority shareholding
      in the business In June 2022 Charterhouse Capital Partners agreed to a
      sale of SLR Consulting to Ares Management private equity partners David
      Richards was Chief Executive Officer from 1994–2013 In line with the
      Group's succession plans, Neil Penhall, formerly Managing Director of SLR
      Consulting and an Executive Director of SLR Management, assumed the role
      of CEO
    sentences:
      - |-
        Institute for Transuranium Elements (ITU)
        Institute for the Protection and the Security of the Citizen (IPSC)
        Institute for Environment and Sustainability (IES)
        Institute for Health and Consumer Protection (IHCP)
        Institute for Energy (IE)
        Institute for Prospective Technological Studies (IPTS)
      - >-
        Project NExT was founded by James (Jim) Leitzel (Ohio State University)
        and Chris Stevens (Saint Louis University) The first fellows were
        selected in 1994 Jim Leitzel died in 1998, and Aparna Higgins
        (University of Dayton) and Joe Gallian (University of Minnesota Duluth)
        became co-directors of Project NExT Chris Stevens stepped down as
        director in 2010, and was succeeded by Aparna Higgins and Joe Gallian
        Judith Covington (Louisiana State University, Shreveport) and Gavin
        LaRose (University of Michigan) first served as Associate Co-Directors
        and later became Co-Directors In 2007, the total number of fellows
        surpassed 1000 By 2017 the total number of fellows reached 1700 In 2023
        Christine Kelley became director
      - >-
        Quantum secure communication is a method that is expected to be 'quantum
        safe' in the advent of quantum computing systems that could break
        current cryptography systems using methods such as Shor's algorithm
        These methods include quantum key distribution (QKD), a method of
        transmitting information using entangled light in a way that makes any
        interception of the transmission obvious to the user Another method is
        the quantum random number generator, which is capable of producing truly
        random numbers unlike non-quantum algorithms that merely imitate
        randomness
  - source_sentence: chemical reaction
    sentences:
      - >-
        With suitably encoded scales (multitrack, vernier, digital code, or
        pseudo-random code) an encoder can determine its position without
        movement or needing to find a reference position Such absolute encoders
        also communicate using serial communication protocols Many of these
        protocols are proprietary (e g , Fanuc, Mitsubishi, FeeDat (Fagor
        Automation), Heidenhain EnDat, DriveCliq, Panasonic, Yaskawa) but open
        standards such as BiSS are now appearing, which avoid tying users to a
        particular supplier
      - >-
        Bonneau, Pierre; Allens, Gaspard d' (2020) Cent mille ans Bure ou le
        scandale enfoui des déchets nucléaires [One hundred thousand years Bure,
        or the buried scandal of nuclear waste] Illustrated by Cécile Guillard
        La Revue dessinée - Seuil ISBN 978-2-02-145982-1
      - >-
        The reason why MACE is heavily researched is that it allows completely
        anisotropic etching of silicon substrates which is not possible with
        other wet chemical etching methods (see figure to the right) Usually the
        silicon substrate is covered with a protective layer such as photoresist
        before it is immersed in an etching solution The etching solution
        usually has no preferred direction of attacking the substrate, therefore
        isotropic etching takes place In semiconductor engineering, however it
        is often required that the sidewalls of the etched trenches are steep
        This is usually realized with methods that operate in the gas-phase such
        as reactive ion etching These methods require expensive equipment
        compared to simple wet etching MACE, in principle allows the fabrication
        of steep trenches but is still cheap compared to gas-phase etching
        methods
  - source_sentence: synthesis method
    sentences:
      - >-
        STEMNET used to receive funding from the Department for Education and
        Skills Since June 2007, it receives funding from the Department for
        Children, Schools and Families and Department for Innovation,
        Universities and Skills, since STEMNET sits on the chronological
        dividing point (age 16) of both of the new departments
      - >-
        The Arab States of the Persian Gulf plan to start their own joint
        civilian nuclear program An agreement in the final days of the Bush
        administration provided for cooperation between the United Arab Emirates
        and the United States of America in which the United States would sell
        the UAE nuclear reactors and nuclear fuel The UAE would, in return,
        renounce their right to enrich uranium for their civilian nuclear
        program At the time of signing, this agreement was touted as a way to
        reduce risks of nuclear proliferation in the Persian Gulf However,
        Mustafa Alani of the Dubai-based Gulf Research Center stated that,
        should the Nuclear Non-Proliferation Treaty collapse, nuclear reactors
        such as those slated to be sold to the UAE under this agreement could
        provide the UAE with a path toward a nuclear weapon, raising the specter
        of further nuclear proliferation In March 2007, foreign ministers of the
        six-member Gulf Cooperation Council met in Saudi Arabia to discuss
        progress in plans agreed in December 2006, for a joint civilian nuclear
        program
      - >-
        Timber framing dates back thousands of years, and has been used in many
        parts of the world during various periods such as ancient Japan, Europe
        and medieval England in localities where timber was in good supply and
        building stone and the skills to work it were not The use of timber
        framing in buildings provides their complete skeletal framing which
        offers some structural benefits as the timber frame, if properly
        engineered, lends itself to better seismic survivability
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on zacbrld/MNLP_M2_document_encoder

This is a sentence-transformers model fine-tuned from zacbrld/MNLP_M2_document_encoder. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: zacbrld/MNLP_M2_document_encoder
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
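
The Pooling and Normalize stages above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors have 2 dimensions instead of the model's 384. It only shows the idea that padding tokens are excluded from the mean and that the result is scaled to unit length.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one is padding (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the final vectors have unit length, the cosine similarity between two embeddings reduces to their dot product.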

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("zacbrld/MNLP_M3_document_encoder_V1")
# Run inference
sentences = [
    'synthesis method',
    'STEMNET used to receive funding from the Department for Education and Skills Since June 2007, it receives funding from the Department for Children, Schools and Families and Department for Innovation, Universities and Skills, since STEMNET sits on the chronological dividing point (age 16) of both of the new departments',
    'The Arab States of the Persian Gulf plan to start their own joint civilian nuclear program An agreement in the final days of the Bush administration provided for cooperation between the United Arab Emirates and the United States of America in which the United States would sell the UAE nuclear reactors and nuclear fuel The UAE would, in return, renounce their right to enrich uranium for their civilian nuclear program At the time of signing, this agreement was touted as a way to reduce risks of nuclear proliferation in the Persian Gulf However, Mustafa Alani of the Dubai-based Gulf Research Center stated that, should the Nuclear Non-Proliferation Treaty collapse, nuclear reactors such as those slated to be sold to the UAE under this agreement could provide the UAE with a path toward a nuclear weapon, raising the specter of further nuclear proliferation In March 2007, foreign ministers of the six-member Gulf Cooperation Council met in Saudi Arabia to discuss progress in plans agreed in December 2006, for a joint civilian nuclear program',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
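
For semantic search, the similarity scores are typically used to rank candidate passages against a query. The following is a small sketch of that ranking step only; `rank_candidates` is a hypothetical helper and the scores are made-up placeholders, not outputs of this model.

```python
def rank_candidates(query_scores, candidates):
    """Return (score, text) pairs sorted from most to least similar."""
    pairs = zip(query_scores, candidates)
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)

# Placeholder scores of one query against two passages (illustrative values).
scores = [0.12, 0.47]
passages = ["passage A", "passage B"]

ranked = rank_candidates(scores, passages)
for score, text in ranked:
    print(f"{score:.2f}  {text}")
```

With the example above, where the first entry of `sentences` is the query, the scores could be taken from `similarities[0, 1:].tolist()`, assuming `model.similarity` returns a tensor as it does in recent Sentence Transformers releases.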

Training Details

Training Dataset

Unnamed Dataset

  • Size: 34,235 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (type: string): min 3 tokens, mean 21.24 tokens, max 256 tokens
    • sentence_1 (type: string): min 34 tokens, mean 133.62 tokens, max 256 tokens
  • Samples:
    • sentence_0: chemistry experiment
      sentence_1: Since 1982, research has been conducted to develop technologies, commonly referred to as electronic noses, that could detect and recognize odors and flavors Application areas include food, medicine and the environment
    • sentence_0: quantum physics
      sentence_1: Hydro electric - Hydro-electric turbomachinery uses potential energy stored in water to flow over an open impeller to turn a generator which creates electricity Steam turbines - Steam turbines used in power generation come in many different variations The overall principle is high pressure steam is forced over blades attached to a shaft, which turns a generator As the steam travels through the turbine, it passes through smaller blades causing the shaft to spin faster, creating more electricity Gas turbines - Gas turbines work much like steam turbines Air is forced in through a series of blades that turn a shaft Then fuel is mixed with the air and causes a combustion reaction, increasing the power This then causes the shaft to spin faster, creating more electricity Windmills - Also known as a wind turbine, windmills are increasing in popularity for their ability to efficiently use the wind to generate electricity Although they come in many shapes and sizes, the most common one is the la...
    • sentence_0: physics law
      sentence_1: Backlash in gear couplings allows for slight angular misalignment There can be significant backlash in unsynchronized transmissions because of the intentional gap between the dogs in dog clutches The gap is necessary to engage dogs when input shaft (engine) speed and output shaft (driveshaft) speed are imperfectly synchronized If there was a smaller clearance, it would be nearly impossible to engage the gears because the dogs would interfere with each other in most configurations In synchronized transmissions, synchromesh solves this problem However, backlash is undesirable in precision positioning applications such as machine tool tables It can be minimized by choosing ball screws or leadscrews with preloaded nuts, and mounting them in preloaded bearings A preloaded bearing uses a spring and/or a second bearing to provide a compressive axial force that maintains bearing surfaces in contact despite reversal of the load direction
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
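
To make the loss parameters concrete, here is a simplified pure-Python sketch of the idea behind MultipleNegativesRankingLoss: each `sentence_0` is scored against every `sentence_1` in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is applied with the matching pair as the correct class. The function names are hypothetical and the 2-dimensional vectors are toy values; the library's own implementation works on batched tensors.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and all other positives in the batch act as negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: correctly paired embeddings versus deliberately swapped ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the aligned pairs give a loss near zero while the swapped pairs give a loss near the scale value, which is why larger batches (more in-batch negatives) tend to give a stronger training signal.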
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 2
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
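
As an illustration of how `learning_rate: 5e-05`, `lr_scheduler_type: linear` and `warmup_steps: 0` interact, the sketch below computes a linearly decaying learning rate. The function `linear_lr` is a hypothetical helper, and the total of 8,560 optimizer steps is an assumption derived from 34,235 samples, batch size 8, 2 epochs, and a single device; it is not stated in the card.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))

# Assumption: ceil(34235 / 8) * 2 = 8560 optimizer steps on one device.
TOTAL_STEPS = 8560

for step in (0, 4280, 8560):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With no warmup, the rate starts at the full 5e-05 and reaches half that value at the midpoint of training.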

Training Logs

Epoch Step Training Loss
0.1168 500 1.465
0.2336 1000 1.189
0.3505 1500 1.1209
0.4673 2000 1.0333
0.5841 2500 0.993
0.7009 3000 0.9573
0.8178 3500 0.9275
0.9346 4000 0.9177
1.0514 4500 0.8241
1.1682 5000 0.7726
1.2850 5500 0.7685
1.4019 6000 0.7623
1.5187 6500 0.7668
1.6355 7000 0.7556
1.7523 7500 0.7002
1.8692 8000 0.7363
1.9860 8500 0.7396
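
The Epoch column can be reproduced from the dataset size and batch size reported above. The short calculation below assumes a single training device (the card lists `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`, but not the device count); under that assumption the logged epoch fractions match.

```python
import math

DATASET_SIZE = 34235   # training samples
BATCH_SIZE = 8         # per_device_train_batch_size
EPOCHS = 2             # num_train_epochs

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(DATASET_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def epoch_at(step):
    """Fraction of epochs completed after a given optimizer step."""
    return round(step / steps_per_epoch, 4)

for step in (500, 4500, 8500):
    print(step, epoch_at(step))
```

This gives 4,280 steps per epoch and 8,560 steps in total, so the final logged row at step 8,500 falls just before the end of training.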

Framework Versions

  • Python: 3.12.8
  • Sentence Transformers: 3.4.1
  • Transformers: 4.52.2
  • PyTorch: 2.7.0+cu126
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}