This is a sentence-transformers model finetuned from dunzhang/stella_en_400M_v5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: NewModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 1024, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
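The Pooling module above has pooling_mode_mean_tokens set to True, so the token vectors are averaged into a single sentence vector. A minimal, library-free sketch of masked mean pooling is shown here; the helper name `mean_pool`, the attention-mask convention (1 = real token, 0 = padding) and the toy 2-dimensional vectors are assumptions for illustration, not the library's own code.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only real tokens; padding positions carry mask value 0.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three tokens of dimension 2; the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model the pooled vector has 1024 dimensions and is then passed through the Dense layer listed as module (2).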
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Title: \nText: what was the average for "other" loans held in 2012 and 2011?',
'Title: \nText: LOANS HELD FOR SALE Table 15: Loans Held For Sale\n| In millions | December 312012 | December 312011 |\n| Commercial mortgages at fair value | $772 | $843 |\n| Commercial mortgages at lower of cost or market | 620 | 451 |\n| Total commercial mortgages | 1,392 | 1,294 |\n| Residential mortgages at fair value | 2,096 | 1,415 |\n| Residential mortgages at lower of cost or market | 124 | 107 |\n| Total residential mortgages | 2,220 | 1,522 |\n| Other | 81 | 120 |\n| Total | $3,693 | $2,936 |\nWe stopped originating commercial mortgage loans held for sale designated at fair value in 2008 and continue pursuing opportunities to reduce these positions at appropriate prices.\nAt December 31, 2012, the balance relating to these loans was $772 million, compared to $843 million at December 31, 2011.\nWe sold $32 million in unpaid principal balances of these commercial mortgage loans held for sale carried at fair value in 2012 and sold $25 million in 2011.',
'Title: \nText: Investments and Derivative Instruments (continued) Security Unrealized Loss Aging The following tables present the Company’s unrealized loss aging for AFS securities by type and length of time the security was in a continuous unrealized loss position.\n| | December 31, 2011 |\n| | Less Than 12 Months | 12 Months or More | Total |\n| | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized |\n| | Cost | Value | Losses | Cost | Value | Losses | Cost | Value | Losses |\n| ABS | $629 | $594 | $-35 | $1,169 | $872 | $-297 | $1,798 | $1,466 | $-332 |\n| CDOs | 81 | 59 | -22 | 2,709 | 2,383 | -326 | 2,790 | 2,442 | -348 |\n| CMBS | 1,297 | 1,194 | -103 | 2,144 | 1,735 | -409 | 3,441 | 2,929 | -512 |\n| Corporate [1] | 4,388 | 4,219 | -169 | 3,268 | 2,627 | -570 | 7,656 | 6,846 | -739 |\n| Foreign govt./govt. agencies | 218 | 212 | -6 | 51 | 47 | -4 | 269 | 259 | -10 |\n| Municipal | 299 | 294 | -5 | 627 | 560 | -67 | 926 | 854 | -72 |\n| RMBS | 415 | 330 | -85 | 1,206 | 835 | -371 | 1,621 | 1,165 | -456 |\n| U.S. Treasuries | 343 | 341 | -2 | — | — | — | 343 | 341 | -2 |\n| Total fixed maturities | 7,670 | 7,243 | -427 | 11,174 | 9,059 | -2,044 | 18,844 | 16,302 | -2,471 |\n| Equity securities | 167 | 138 | -29 | 439 | 265 | -174 | 606 | 403 | -203 |\n| Total securities in an unrealized loss | $7,837 | $7,381 | $-456 | $11,613 | $9,324 | $-2,218 | $19,450 | $16,705 | $-2,674 |\nDecember 31, 2010\n| | December 31, 2010 |\n| | Less Than 12 Months | 12 Months or More | Total |\n| | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized |\n| | Cost | Value | Losses | Cost | Value | Losses | Cost | Value | Losses |\n| ABS | $302 | $290 | $-12 | $1,410 | $1,026 | $-384 | $1,712 | $1,316 | $-396 |\n| CDOs | 321 | 293 | -28 | 2,724 | 2,274 | -450 | 3,045 | 2,567 | -478 |\n| CMBS | 556 | 530 | -26 | 3,962 | 3,373 | -589 | 4,518 | 3,903 | -615 |\n| Corporate | 5,533 | 5,329 | -199 | 4,017 | 3,435 | -548 | 9,550 | 8,764 | -747 |\n| Foreign govt./govt. agencies | 356 | 349 | -7 | 78 | 68 | -10 | 434 | 417 | -17 |\n| Municipal | 7,485 | 7,173 | -312 | 1,046 | 863 | -183 | 8,531 | 8,036 | -495 |\n| RMBS | 1,744 | 1,702 | -42 | 1,567 | 1,147 | -420 | 3,311 | 2,849 | -462 |\n| U.S. Treasuries | 2,436 | 2,321 | -115 | 158 | 119 | -39 | 2,594 | 2,440 | -154 |\n| Total fixed maturities | 18,733 | 17,987 | -741 | 14,962 | 12,305 | -2,623 | 33,695 | 30,292 | -3,364 |\n| Equity securities | 53 | 52 | -1 | 637 | 506 | -131 | 690 | 558 | -132 |\n| Total securities in an unrealized loss | $18,786 | $18,039 | $-742 | $15,599 | $12,811 | $-2,754 | $34,385 | $30,850 | $-3,496 |\n[1] Unrealized losses exclude the change in fair value of bifurcated embedded derivative features of certain securities.\nSubsequent changes in fair value are recorded in net realized capital gains (losses).\nAs of December 31, 2011, AFS securities in an unrealized loss position, comprised of 2,549 securities, primarily related to corporate securities within the financial services sector, CMBS, and RMBS which have experienced significant price deterioration.\nAs of December 31, 2011, 75% of these securities were depressed less than 20% of cost or amortized cost.\nThe decline in unrealized losses during 2011 was primarily attributable to a decline in interest rates, partially offset by credit spread widening.\nMost of the securities depressed for twelve months or more relate to structured securities with exposure to commercial and residential real estate, as well as certain floating rate corporate securities or those securities with greater than 10 years to maturity, concentrated in the financial services sector.\nCurrent market spreads continue to be significantly wider for structured securities with exposure to commercial and residential real estate, as compared to spreads at the security’s respective purchase date, largely due to the economic and market uncertainties regarding future performance of commercial and residential real estate.\nIn addition, the majority of securities have a floating-rate coupon referenced to a market index where rates have declined substantially.\nThe Company neither has an intention to sell nor does it expect to be required to sell the securities outlined above.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
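The similarity matrix can be used to rank passages for a query. As a library-free illustration, and assuming cosine similarity is the scoring function (the first metric family reported in the evaluation table), the sketch below scores toy vectors and sorts them; `cosine` and `rank_passages` are hypothetical helper names, and the 2-dimensional vectors stand in for real 1024-dimensional embeddings.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: Sequence[float], passage_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors: passage 1 points the same way as the query, passage 0 is orthogonal.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
order = rank_passages(query, passages)
print(order)  # [1, 2, 0]
```

In the example above, row 0 of `embeddings` is the query and rows 1 and 2 are the candidate passages, so the same ranking could be read from the first row of `similarities`.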
Information retrieval results on the Evaluate dataset, evaluated with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.3617 |
| cosine_accuracy@3 | 0.5194 |
| cosine_accuracy@5 | 0.6092 |
| cosine_accuracy@10 | 0.7015 |
| cosine_precision@1 | 0.3617 |
| cosine_precision@3 | 0.1788 |
| cosine_precision@5 | 0.1267 |
| cosine_precision@10 | 0.0752 |
| cosine_recall@1 | 0.331 |
| cosine_recall@3 | 0.4768 |
| cosine_recall@5 | 0.5614 |
| cosine_recall@10 | 0.6548 |
| cosine_ndcg@10 | 0.496 |
| cosine_mrr@10 | 0.4668 |
| cosine_map@100 | 0.4482 |
| dot_accuracy@1 | 0.3325 |
| dot_accuracy@3 | 0.5243 |
| dot_accuracy@5 | 0.5922 |
| dot_accuracy@10 | 0.6748 |
| dot_precision@1 | 0.3325 |
| dot_precision@3 | 0.1796 |
| dot_precision@5 | 0.1248 |
| dot_precision@10 | 0.0726 |
| dot_recall@1 | 0.3059 |
| dot_recall@3 | 0.4762 |
| dot_recall@5 | 0.5446 |
| dot_recall@10 | 0.6273 |
| dot_ndcg@10 | 0.4723 |
| dot_mrr@10 | 0.4422 |
| dot_map@100 | 0.4264 |
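
For readers unfamiliar with the metric names in the table, here is a small sketch of how accuracy@k, precision@k, recall@k and MRR@k are commonly defined for a single query. It is assumed, not confirmed by this card, that the evaluator uses these standard definitions and averages them over all queries; `ir_metrics_at_k` and the document ids are hypothetical.

```python
from typing import Dict, List, Set


def ir_metrics_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> Dict[str, float]:
    """Per-query retrieval metrics over the top-k ranked documents (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        raise ValueError("at least one relevant document is required")
    hits = [doc_id in relevant_ids for doc_id in ranked_ids[:k]]
    num_hits = sum(hits)
    # Reciprocal rank of the first relevant document inside the top k, else 0.
    reciprocal_rank = 0.0
    for rank, is_hit in enumerate(hits, start=1):
        if is_hit:
            reciprocal_rank = 1.0 / rank
            break
    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids),
        "mrr": reciprocal_rank,
    }


# Toy query: two relevant documents, one of which is retrieved at rank 2.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
metrics = ir_metrics_at_k(ranked, relevant, k=3)
print(metrics)
```

Under these definitions accuracy@1 and precision@1 coincide, which matches the table (0.3617 for cosine, 0.3325 for dot product).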
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |

Samples:

| sentence_0 | sentence_1 |
|---|---|
| Instruct: Given a web search query, retrieve relevant passages that answer the query. | Title: |
| Instruct: Given a web search query, retrieve relevant passages that answer the query. | Title: |
| Instruct: Given a web search query, retrieve relevant passages that answer the query. | Title: |
Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
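To make the two parameters concrete, the following is a minimal pure-Python sketch of a multiple-negatives ranking objective: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy rewards the matching pair while treating the other positives as in-batch negatives. The function names and toy vectors are assumptions for illustration; this is not the library's implementation.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors: Sequence[Sequence[float]],
             positives: Sequence[Sequence[float]],
             scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    if len(anchors) != len(positives) or not anchors:
        raise ValueError("need the same non-zero number of anchors and positives")
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])  # each anchor matches its positive
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # each anchor matches the wrong positive
print(aligned < swapped)
```

With the scale of 20.0 used here, a correctly matched batch gives a loss near zero, while the swapped batch gives a loss close to 20.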
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 2
- fp16: True
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

Training logs:

| Epoch | Step | Evaluate_cosine_map@100 |
|---|---|---|
| 0 | 0 | 0.2566 |
| 1.0 | 141 | 0.3931 |
| 2.0 | 282 | 0.4482 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}