This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
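The architecture above ends with mean pooling over token embeddings followed by L2 normalization. The following is a minimal pure-Python sketch of those two steps on toy vectors. The helper names `mean_pool` and `l2_normalize` are hypothetical and are not part of the library, and the 2-dimensional inputs stand in for the model's 384-dimensional token vectors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an input that is all padding
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the output has unit length, the cosine similarity of two such embeddings reduces to their dot product.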
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/emb-all-MiniLM-L6-v2-squad-5-epochs")
# Run inference
sentences = [
    'Is the strength of the modulus of rupture or elasticity increased more when wood is dried?',
    'The greatest strength increase due to drying is in the ultimate crushing strength, and strength at elastic limit in endwise compression; these are followed by the modulus of rupture, and stress at elastic limit in cross-bending, while the modulus of elasticity is least affected.',
    'The greatest strength increase due to drying is in the ultimate crushing strength, and strength at elastic limit in endwise compression; these are followed by the modulus of rupture, and stress at elastic limit in cross-bending, while the modulus of elasticity is least affected.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
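A similarity matrix like the one above is typically used to rank candidate passages for a question. The sketch below illustrates that ranking step in plain Python, assuming cosine similarity as the scoring function. The vectors are made-up 3-dimensional stand-ins for real 384-dimensional embeddings, and `rank_by_similarity` is a hypothetical helper, not a library function.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidate_vecs):
    """Return (score, index) pairs sorted from most to least similar."""
    scored = [(cos_sim(query_vec, vec), idx) for idx, vec in enumerate(candidate_vecs)]
    return sorted(scored, reverse=True)


# Toy embeddings (assumed values, not produced by the model).
query = [0.9, 0.1, 0.0]
candidates = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]

ranking = rank_by_similarity(query, candidates)
best_score, best_index = ranking[0]
```

With real embeddings, `query` would be one row of `model.encode(...)` and `candidates` the rows for the passages being searched.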
Dataset gooqa-dev, evaluated with TripletEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 0.4106 |
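As a rough illustration of what a cosine accuracy over triplets measures, the sketch below counts the fraction of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative. This is an assumption about the metric's definition rather than a copy of the evaluator's code, and the vectors and the function name `triplet_cosine_accuracy` are invented for the example.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cos_sim(anchor, positive) > cos_sim(anchor, negative)
    )
    return correct / len(triplets)


# Toy triplets (assumed values): the first is ranked correctly, the second is not.
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),
]

accuracy = triplet_cosine_accuracy(triplets)
```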
Columns: question, context, and negative

| | question | context | negative |
|---|---|---|---|
| type | string | string | string |

| question | context | negative |
|---|---|---|
| When is a beer at its most flavorful? | Drinking chilled beer began with the development of artificial refrigeration and by the 1870s, was spread in those countries that concentrated on brewing pale lager. Chilling beer makes it more refreshing, though below 15.5 °C the chilling starts to reduce taste awareness and reduces it significantly below 10 °C (50 °F). Beer served unchilled—either cool or at room temperature, reveal more of their flavours. Cask Marque, a non-profit UK beer organisation, has set a temperature standard range of 12°–14 °C (53°–57 °F) for cask ales to be served. | The process of making beer is known as brewing. A dedicated building for the making of beer is called a brewery, though beer can be made in the home and has been for much of its history. A company that makes beer is called either a brewery or a brewing company. Beer made on a domestic scale for non-commercial reasons is classified as homebrewing regardless of where it is made, though most homebrewed beer is made in the home. Brewing beer is subject to legislation and taxation in developed countries, which from the late 19th century largely restricted brewing to a commercial operation only. However, the UK government relaxed legislation in 1963, followed by Australia in 1972 and the US in 1978, allowing homebrewing to become a popular hobby. |
| When was the BeiDou-1C satellite launched? | The first satellite, BeiDou-1A, was launched on 30 October 2000, followed by BeiDou-1B on 20 December 2000. The third satellite, BeiDou-1C (a backup satellite), was put into orbit on 25 May 2003. The successful launch of BeiDou-1C also meant the establishment of the BeiDou-1 navigation system. | The first satellite, BeiDou-1A, was launched on October 31, 2000. The second satellite, BeiDou-1B, was successfully launched on December 21, 2000. The last operational satellite of the constellation, BeiDou-1C, was launched on May 25, 2003. |
| The hotel that sold for the most money in 2014 was which in NYC? | Manhattan was on track to have an estimated 90,000 hotel rooms at the end of 2014, a 10% increase from 2013. In October 2014, the Anbang Insurance Group, based in China, purchased the Waldorf Astoria New York for US$1.95 billion, making it the world's most expensive hotel ever sold. | Real estate is a major force in the city's economy, as the total value of all New York City property was assessed at US$914.8 billion for the 2015 fiscal year. The Time Warner Center is the property with the highest-listed market value in the city, at US$1.1 billion in 2006. New York City is home to some of the nation's—and the world's—most valuable real estate. 450 Park Avenue was sold on July 2, 2007 for US$510 million, about $1,589 per square foot ($17,104/m²), breaking the barely month-old record for an American office building of $1,476 per square foot ($15,887/m²) set in the June 2007 sale of 660 Madison Avenue. According to Forbes, in 2014, Manhattan was home to six of the top ten zip codes in the United States by median housing price. |

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
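To make the role of `scale` and `cos_sim` concrete, here is a small pure-Python sketch of a multiple-negatives ranking objective: each anchor is scored against every candidate in the batch, the scores are multiplied by the scale, and cross-entropy is taken with the anchor's own positive as the target. It is an illustrative approximation under those assumptions, not the library's implementation, and `mnr_loss` is a hypothetical name.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, negatives=(), scale=20.0):
    """Mean cross-entropy over scaled cosine scores with in-batch negatives.

    Candidates are all positives followed by any explicit negatives; the
    correct candidate for anchor i is positives[i].
    """
    candidates = list(positives) + list(negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, cand) for cand in candidates]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumed values): anchors match their positives exactly,
# then the positives are swapped to show the loss rising.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, positives=[[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss(anchors, positives=[[0.0, 1.0], [1.0, 0.0]])
```

A larger scale sharpens the softmax, so a mis-ranked pair is penalized more strongly; with scale 20 the swapped batch above costs roughly 20 per anchor.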
Columns: question, context, and negative_1

| | question | context | negative_1 |
|---|---|---|---|
| type | string | string | string |

| question | context | negative_1 |
|---|---|---|
| What percentage of biodiversity has the planet lost since 1970 | In absolute terms, the planet has lost 52% of its biodiversity since 1970 according to a 2014 study by the World Wildlife Fund. The Living Planet Report 2014 claims that "the number of mammals, birds, reptiles, amphibians and fish across the globe is, on average, about half the size it was 40 years ago". Of that number, 39% accounts for the terrestrial wildlife gone, 39% for the marine wildlife gone, and 76% for the freshwater wildlife gone. Biodiversity took the biggest hit in Latin America, plummeting 83 percent. High-income countries showed a 10% increase in biodiversity, which was canceled out by a loss in low-income countries. This is despite the fact that high-income countries use five times the ecological resources of low-income countries, which was explained as a result of process whereby wealthy nations are outsourcing resource depletion to poorer nations, which are suffering the greatest ecosystem losses. | In absolute terms, the planet has lost 52% of its biodiversity since 1970 according to a 2014 study by the World Wildlife Fund. The Living Planet Report 2014 claims that "the number of mammals, birds, reptiles, amphibians and fish across the globe is, on average, about half the size it was 40 years ago". Of that number, 39% accounts for the terrestrial wildlife gone, 39% for the marine wildlife gone, and 76% for the freshwater wildlife gone. Biodiversity took the biggest hit in Latin America, plummeting 83 percent. High-income countries showed a 10% increase in biodiversity, which was canceled out by a loss in low-income countries. This is despite the fact that high-income countries use five times the ecological resources of low-income countries, which was explained as a result of process whereby wealthy nations are outsourcing resource depletion to poorer nations, which are suffering the greatest ecosystem losses. |
| What is the per capita income of Delhi as of 2013? | New Delhi is the largest commercial city in northern India. It has an estimated net State Domestic Product (FY 2010) of ₹1595 billion (US$23 billion) in nominal terms and ~₹6800 billion (US$100 billion) in PPP terms. As of 2013, the per capita income of Delhi was Rs. 230000, second highest in India after Goa. GSDP in Delhi at the current prices for 2012-13 is estimated at Rs 3.88 trillion (short scale) against Rs 3.11 trillion (short scale) in 2011-12. | The Government of National Capital Territory of Delhi does not release any economic figures specifically for New Delhi but publishes an official economic report on the whole of Delhi annually. According to the Economic Survey of Delhi, the metropolis has a net State Domestic Product (SDP) of Rs. 83,085 crores (for the year 2004–05) and a per capita income of Rs. 53,976($1,200). In the year 2008–09 New Delhi had a Per Capita Income of Rs.1,16,886 ($2,595).It grew by 16.2% to reach Rs.1,35,814 ($3,018) in 2009–10 fiscal. New Delhi's Per Capita GDP (at PPP) was at $6,860 during 2009–10 fiscal, making it one of the richest cities in India. The tertiary sector contributes 78.4% of Delhi's gross SDP followed by secondary and primary sectors with 20.2% and 1.4% contribution respectively. |
| Which Prussian general commanded the attack against the French at St. Privat? | By 16:50, with the Prussian southern attacks in danger of breaking up, the Prussian 3rd Guards Infantry Brigade of the Second Army opened an attack against the French positions at St. Privat which were commanded by General Canrobert. At 17:15, the Prussian 4th Guards Infantry Brigade joined the advance followed at 17:45 by the Prussian 1st Guards Infantry Brigade. All of the Prussian Guard attacks were pinned down by lethal French gunfire from the rifle pits and trenches. At 18:15 the Prussian 2nd Guards Infantry Brigade, the last of the 1st Guards Infantry Division, was committed to the attack on St. Privat while Steinmetz committed the last of the reserves of the First Army across the Mance Ravine. By 18:30, a considerable portion of the VII and VIII Corps disengaged from the fighting and withdrew towards the Prussian positions at Rezonville. | With the defeat of the First Army, Prince Frederick Charles ordered a massed artillery attack against Canrobert's position at St. Privat to prevent the Guards attack from failing too. At 19:00 the 3rd Division of Fransecky's II Corps of the Second Army advanced across Ravine while the XII Corps cleared out the nearby town of Roncourt and with the survivors of the 1st Guards Infantry Division launched a fresh attack against the ruins of St. Privat. At 20:00, the arrival of the Prussian 4th Infantry Division of the II Corps and with the Prussian right flank on Mance Ravine, the line stabilised. By then, the Prussians of the 1st Guards Infantry Division and the XII and II Corps captured St. Privat forcing the decimated French forces to withdraw. With the Prussians exhausted from the fighting, the French were now able to mount a counter-attack. General Bourbaki, however, refused to commit the reserves of the French Old Guard to the battle because, by that time, he considered the overall si... |

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- num_train_epochs: 5
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

Training logs:

| Epoch | Step | Training Loss | Validation Loss | gooqa-dev_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.3282 |
| 0.5780 | 100 | 0.548 | 0.8841 | 0.3910 |
| 1.1561 | 200 | 0.4777 | 0.8530 | 0.4024 |
| 1.7341 | 300 | 0.4008 | 0.8470 | 0.4040 |
| 2.3121 | 400 | 0.3563 | 0.8390 | 0.4090 |
| 2.8902 | 500 | 0.3234 | 0.8276 | 0.4064 |
| 3.4682 | 600 | 0.2866 | 0.8325 | 0.4068 |
| 4.0462 | 700 | 0.274 | 0.8348 | 0.4028 |
| 4.6243 | 800 | 0.2512 | 0.8284 | 0.4062 |
| -1 | -1 | - | - | 0.4106 |
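The hyperparameters listed earlier combine learning_rate 5e-05, lr_scheduler_type linear, and warmup_ratio 0.1. The sketch below shows one common way such a schedule behaves: a linear ramp from zero to the peak rate, then a linear decay back to zero. The total of 865 steps is an inference from the log above (step 100 corresponds to epoch 0.5780, so about 173 steps per epoch over 5 epochs) and should be treated as an assumption, as should the rounding of the warmup length. `linear_schedule_lr` is a hypothetical helper.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Learning rate at a given step for linear warmup then linear decay."""
    # Assumption: the warmup length is the ratio of total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: about 173 steps per epoch * 5 epochs, inferred from the training log.
TOTAL_STEPS = 865

start = linear_schedule_lr(0, TOTAL_STEPS)
peak = linear_schedule_lr(87, TOTAL_STEPS)
end = linear_schedule_lr(TOTAL_STEPS, TOTAL_STEPS)
```

Under these assumptions the peak rate is reached around step 87, which is before the first logged evaluation at step 100.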
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Base model: sentence-transformers/all-MiniLM-L6-v2