SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
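The Pooling and Normalize stages above can be illustrated with a small, self-contained sketch (plain Python with toy numbers, not the library's internals): tokens are mean-pooled over non-padding positions, then the result is scaled to unit length.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token embeddings, counting only non-padding tokens (mask == 1)."""
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    n = sum(attention_mask)
    for emb, mask in zip(token_embeddings, attention_mask):
        if mask:
            for i in range(dim):
                pooled[i] += emb[i]
    return [x / n for x in pooled]

def l2_normalize(vec):
    """Scale the vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Two real tokens and one padding token (mask = 0); real embeddings are 384-dim.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because of the final Normalize() step, every sentence embedding has unit L2 norm, which is why cosine similarity is the natural similarity function for this model.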

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Gswrtz/finetuned-triplet-rag-embedder")
# Run inference
sentences = [
    'Although they are an inexpensive supplier of vitamins,minerals,and high--quality protein,eggs also contain a high level of blood cholesterol ,one of the major causes of heart disease.One egg yolk,in fact,contains a little more than two--thirds of the suggested daily cholesterol limit.     This knowledge has caused egg sales to drop in recent years,which in turn has brought about the development of several alternatives to eating regular eggs.One alternative is to eat substitute eggs. These egg substitutes are not real eggs, but they look somewhat like eggs when they are cooked.They have the advantage of having lower cholesterol rates,and they can be scrambled or used in baking.One disadvantage, however,is that they are not good for frying,poaching,or boiling.A second alternative to regular eggs is a new type of eggs,sometimes called"designer\'\'eggs.These eggs are  produced by hens that are fed low-fat diets consisting of ingredients such as canola oil,flax,and rice bran.In spite of their diets,however,these hens produce eggs that contain  the same amount of cholesterol as regular eggs.Yet,producers of these eggs claim that eating their eggs will not raise the blood cholesterol in humans.     Egg producers claim that their product has been described unfairly.They use scientific studies to back up their claim.And  in tact  studies on the relationship between eggs and human cholesterol levels have brought mixed results.It may be that it is not the type of egg that is the main determinant of cholesterol but the person who is eating the eggs.Some people may be more sensitive to cholesterol from food than other people.In fact,there is evidence that certain dietary fats stimulate the body\'s production of blood cholesterol.Consequently,while it still makes sense to limit one\'s intake of eggs,even designer eggs,it seems that doing this without regulating dietary fat will probably not help reduce the blood cholesterol level. 
The main cause of the recent drop in egg sales is_. A. the production of substitute eggs and designer eggs. B. the changes in hen\'s diet. C. the increasing price. D. People\'s knowledge of the high level of blood cholesterol in eggs.',
    '**Boiled egg**\n\nBoiled egg:\nBoiled eggs are eggs, typically from a chicken, cooked with their shells unbroken, usually by immersion in boiling water. Hard-boiled eggs are cooked so that the egg white and egg yolk both solidify, while soft-boiled eggs may leave the yolk, and sometimes the white, at least partially liquid and raw. Boiled eggs are a popular breakfast food around the world.',
    '**Egg salad**\n\nEgg salad:\nEgg salad is a dish consisting of chopped hard-boiled or scrambled eggs, mustard, and mayonnaise, and vegetables often including other ingredients such as celery.  It is made mixed with seasonings in the form of herbs, spices and other ingredients, bound with mayonnaise. It is similar to chicken salad, ham salad, macaroni salad, tuna salad, lobster salad, and crab salad. A typical egg salad is made of chopped hard-boiled eggs, mayonnaise, mustard, minced celery and onion, salt, black pepper and paprika. A common use is as a filling for egg sandwiches. It is also often used as a topping for a green salad.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
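Since the model's outputs are L2-normalized, cosine similarity reduces to a dot product, so a minimal retrieval step over precomputed embeddings can be sketched in plain Python (toy 2-dimensional unit vectors stand in for real `model.encode(...)` output):

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit-length."""
    return sum(x * y for x, y in zip(a, b))

# Toy unit-length embeddings standing in for encoded corpus sentences.
corpus_embeddings = [
    [1.0, 0.0],
    [0.6, 0.8],
    [0.0, 1.0],
]
query_embedding = [0.8, 0.6]

# Rank corpus entries by cosine similarity and pick the best match.
scores = [dot(query_embedding, emb) for emb in corpus_embeddings]
best = max(range(len(scores)), key=scores.__getitem__)
```

In practice the same ranking is what `model.similarity(query_embedding, corpus_embeddings)` computes for you, batched over all pairs.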

Training Details

Training Dataset

Unnamed Dataset

  • Size: 117,937 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:

    |      | sentence_0   | sentence_1    | sentence_2    |
    |------|--------------|---------------|---------------|
    | type | string       | string        | string        |
    | min  | 23 tokens    | 10 tokens     | 10 tokens     |
    | mean | 122.1 tokens | 144.44 tokens | 138.84 tokens |
    | max  | 256 tokens   | 256 tokens    | 256 tokens    |
  • Samples (text truncated as in the dataset viewer):

    Sample 1
      • sentence_0: Two hunters rented a small plane to fly them to a forest. They told the pilot to come back and pick them up in about two weeks. By the end of the two weeks, they had hunted a lot of animals and they wanted to put all of them onto the plane. But the pilot said, "This plane can only take one lamb. You'll have to leave the others behind." Then one hunter said, "But last year, another pilot with the same plane let us take two lambs and some other animals onto the plane." So the new pilot thought about it. Finally he said, "OK, since you did it last year, I think this year we can do it again." Then they put all the animals they had hunted onto the plane, and the plane took off. Five minutes later, it crashed. The three men looked around, and one hunter asked the other one, "Where do you think we are now?" The other hunter looked around and said, "I think we're about one mile away from the place where the plane crashed last year." What did the two hunters do in the forest? A. They took a ho...
      • sentence_1: Three wishes joke: The three wishes joke (or genie joke) is a joke format in which a character is given three wishes by a supernatural being, and fails to make the best use of them. Common scenarios include releasing a genie from a lamp, catching and agreeing to release a mermaid or magical fish, or crossing paths with the devil. The first two wishes go as expected, with the third wish being misinterpreted, or granted in an unexpected fashion that doesn't reflect the intent of the wish. Alternatively, the wishes are split between three people, with the last person's wish inadvertently or intentionally thwarting or undoing the wishes of the other characters. An example of the three wishes joke runs as follows: Three men are stranded on a desert island, when a bottle washes up on the shore. When they uncork the bottle, a genie appears and offers three wishes. The first wishes to be taken to Paris. The genie snaps his fingers, and the man suddenly finds himself stan...
      • sentence_2: Observational learning explains how wolves know how to hunt as a group.

    Sample 2
      • sentence_0: The ratio of the amount of the oil bill for the month of February to the amount of the oil bill for the month of January was 5:4. If the oil bill for February had been $30 more, the corresponding ratio would have been 3:2. How much was the oil bill for January? A. $60. B. $80. C. $120. D. $140.
      • sentence_1: Divide. Hints: There are many ways to solve this problem. Let's see two ways we could divide. Place value strategy We can think in terms of hundredths: $\phantom{=}624 \div 0.01$ $= 624.00 \div 0.01$ $= 62{,}400$ hundredths $\div ~1$ hundredth $= 62{,}400$ Fraction multiplication strategy Decimals are a kind of fraction, so we can use fraction multiplication. $\begin{aligned} 624 \div 0.01 &= \dfrac{624.00}{0.01}\\ &= \dfrac{624.00 \times 100}{0.01 \times 100}\\ &= \dfrac{62{,}400}{1}\\ &= 62{,}400 \end{aligned}$ The answer $62{,}400 = 624 \div 0.01$
      • sentence_2: Gabriela bought a new pair of glasses at the store when they were having a $30%$ off sale. If the regular price of the pair of glasses was $$72$, how much did Gabriela pay with the discount? $$\ $ Hints: First, find the amount of the discount by multiplying the original price of the item by the discount. ${$72} \times {30%} = \text{?}$ Percent means "out of one hundred," so ${30%}$ is equivalent to ${\dfrac{30}{100}}$ which is also equal to ${30 \div 100}$. ${30 \div 100 = 0.30}$ To find the amount of money saved, multiply ${0.30}$ by the original price. ${0.30} \times {$72} = {$21.60}$ To find the final price Gabriela paid, subtract ${$21.60}$ from the original price. ${$72} - {$21.60} = $50.40$

    Sample 3
      • sentence_0: In 2013, a report from The Nero England Journal of Medicine showed that increased body weight is related to the death rate for all cancers. This is based on a study involving about 900,000 people, spanning many years. The study, started in 1992 by the American Cancer Society, included men and women from all 50 states. The youngest participants were 30 years old, and the '8verage age was 57. By December 2008, 24% of the participants had died, just a quarter of them from cancers. In analyzing the results, researchers attempted to take account of such potential factors as smoking drinking alcohol, taking aspirin and a wide variety of other factors that might otherwise affect the results. The results are clear the more you weigh, the greater your risk of dying of cancer will be (up to 52% higher for men and 62% for women). In men as well as women, the only cancers that did not have a strong connection with weight were lung cancer and-brain cancer. For women, t...
      • sentence_1: Obesity and cancer: The association between obesity, as defined by a body mass index of 30 or higher, and risk of a variety of types of cancer has received a considerable amount of attention in recent years. Obesity has been associated with an increased risk of esophageal cancer, pancreatic cancer, colorectal cancer, breast cancer (among postmenopausal women), endometrial cancer, kidney cancer, thyroid cancer, liver cancer and gallbladder cancer. Obesity may also lead to increased cancer-related mortality. Obesity has also been described as the fat tissue disease version of cancer, where common features between the two diseases were suggested for the first time.
      • sentence_2: Alcohol and cancer: Alcohol causes cancers of the oesophagus, liver, breast, colon, oral cavity, rectum, pharynx, and larynx, and probably causes cancers of the pancreas. Consumption of alcohol in any quantity can cause cancer. The more alcohol is consumed, the higher the cancer risk, and no amount can be considered safe. Alcoholic beverages were classified as a Group 1 carcinogen by the International Agency for Research on Cancer (IARC) in 1988. 3.6% of all cancer cases and 3.5% of cancer deaths worldwide are attributable to consumption of alcohol (more specifically, acetaldehyde, a metabolic derivative of ethanol). 740,000 cases of cancer in 2020 or 4.1% of new cancer cases were attributed to alcohol. Alcohol is thought to cause cancer through three main mechanisms: DNA methylation, oxidative stress, hormonal alteration, as well as secondary mechanisms of liver cirrhosis, microbiome dysbiosis, reduced immune system function, retinoid metabolism, increased levels of ...
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
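The loss above is the standard triplet objective with Euclidean distance and a margin of 5: it penalizes an anchor only when the positive is not at least `margin` closer than the negative. A plain-Python sketch (toy vectors, not the library implementation):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy embeddings: the negative is far enough away that the margin is satisfied.
anchor, positive, negative = [0.0, 0.0], [1.0, 0.0], [8.0, 0.0]
loss = triplet_loss(anchor, positive, negative)
```

Here d(anchor, positive) = 1 and d(anchor, negative) = 8, so 1 - 8 + 5 = -2 is clamped to 0: this triplet contributes no gradient, exactly the behavior that drives positives closer than negatives by at least the margin.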
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
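From the non-default hyperparameters above, the fine-tuning run can be reconstructed approximately as follows. This is a hypothetical sketch, not the author's exact training script; the one-row `train_dataset` is a stand-in for the real 117,937-sample triplet dataset, whose columns are interpreted positionally as (anchor, positive, negative).

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

# Base model named in this card.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder triplet data; columns are used in (anchor, positive, negative) order.
train_dataset = Dataset.from_dict({
    "sentence_0": ["anchor text ..."],
    "sentence_1": ["positive text ..."],
    "sentence_2": ["negative text ..."],
})

# TripletLoss with the parameters listed above.
loss = losses.TripletLoss(
    model,
    distance_metric=losses.TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-triplet-rag-embedder",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```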

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.0678 500 4.9504
0.1356 1000 4.9323
0.2035 1500 4.9084
0.2713 2000 4.8822
0.3391 2500 4.8753
0.4069 3000 4.8524
0.4748 3500 4.8574
0.5426 4000 4.852
0.6104 4500 4.8373
0.6782 5000 4.8464
0.7461 5500 4.8184
0.8139 6000 4.8328
0.8817 6500 4.8267
0.9495 7000 4.8411

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.7.1
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Model Size

  • Model: Gswrtz/finetuned-triplet-rag-embedder
  • Parameters: 22.7M (F32, Safetensors)