SentenceTransformer based on BAAI/bge-m3
This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Full model architecture:
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
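
For intuition, here is a minimal sketch (not from the card) of what these three modules compute, using transformers and torch directly; it assumes the transformer weights load from the repo root, as sentence-transformers normally saves them:

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rjnClarke/bgem3-shakespeare_st_3")
encoder = AutoModel.from_pretrained("rjnClarke/bgem3-shakespeare_st_3")

batch = tokenizer(["To be, or not to be"], return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (1, seq_len, 1024)

# Pooling with pooling_mode_cls_token=True keeps only the [CLS] token
cls_embedding = token_embeddings[:, 0]  # (1, 1024)

# Normalize() L2-normalizes, so cosine similarity reduces to a dot product
sentence_embedding = torch.nn.functional.normalize(cls_embedding, p=2, dim=1)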
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("rjnClarke/bgem3-shakespeare_st_3")
# Run inference
sentences = [
'King Henry V is preparing for an expedition to France to seek revenge on the Dauphin for mocking him, and he urges his lords to quickly gather resources and support for the impending war.',
"That shall fly with them; for many a thousand widows\n Shall this his mock mock of their dear husbands; Mock mothers from their sons, mock castles down; And some are yet ungotten and unborn That shall have cause to curse the Dauphin's scorn. But this lies all within the will of God, To whom I do appeal; and in whose name, Tell you the Dauphin, I am coming on, To venge me as I may and to put forth My rightful hand in a well-hallow'd cause. So get you hence in peace; and tell the Dauphin His jest will savour but of shallow wit, When thousands weep more than did laugh at it. Convey them with safe conduct. Fare you well. Exeunt AMBASSADORS EXETER. This was a merry message. KING HENRY. We hope to make the sender blush at it. Therefore, my lords, omit no happy hour That may give furth'rance to our expedition; For we have now no thought in us but France, Save those to God, that run before our business. Therefore let our proportions for these wars Be soon collected, and all things thought upon That may with reasonable swiftness ad More feathers to our wings; for, God before, We'll chide this Dauphin at his father's door. Therefore let every man now task his thought That this fair action may on foot be brought. Exeunt\n",
"And that great minds, of partial indulgence\n To their benumbed wills, resist the same; There is a law in each well-order'd nation To curb those raging appetites that are Most disobedient and refractory. If Helen, then, be wife to Sparta's king- As it is known she is-these moral laws Of nature and of nations speak aloud To have her back return'd. Thus to persist In doing wrong extenuates not wrong, But makes it much more heavy. Hector's opinion Is this, in way of truth. Yet, ne'er the less, My spritely brethren, I propend to you In resolution to keep Helen still; For 'tis a cause that hath no mean dependence Upon our joint and several dignities. TROILUS. Why, there you touch'd the life of our design. Were it not glory that we more affected Than the performance of our heaving spleens, I would not wish a drop of Troyan blood Spent more in her defence. But, worthy Hector, She is a theme of honour and renown, A spur to valiant and magnanimous deeds, Whose present courage may beat down our foes, And fame in time to come canonize us; For I presume brave Hector would not lose So rich advantage of a promis'd glory As smiles upon the forehead of this action For the wide world's revenue. HECTOR. I am yours, You valiant offspring of great Priamus. I have a roisting challenge sent amongst The dull and factious nobles of the Greeks Will strike amazement to their drowsy spirits. I was advertis'd their great general slept,\n Whilst emulation in the army crept.\n This, I presume, will wake him. Exeunt\n",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
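
For the semantic-search use case mentioned above, here is a hedged sketch using the library's util.semantic_search helper; the corpus and query strings are invented for illustration:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("rjnClarke/bgem3-shakespeare_st_3")

corpus = [
    "King Henry rallies his nobles for the coming war in France.",
    "Troilus and Hector debate whether Helen should be returned to Sparta.",
]
query = "Who urges preparation for an expedition to France?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Returns, per query, a list of {"corpus_id": ..., "score": ...} ranked by score
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(round(hit["score"], 3), corpus[hit["corpus_id"]])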
Evaluated with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.3823 |
| cosine_accuracy@3 | 0.5235 |
| cosine_accuracy@5 | 0.5825 |
| cosine_accuracy@10 | 0.6564 |
| cosine_precision@1 | 0.3823 |
| cosine_precision@3 | 0.1745 |
| cosine_precision@5 | 0.1165 |
| cosine_precision@10 | 0.0656 |
| cosine_recall@1 | 0.3823 |
| cosine_recall@3 | 0.5235 |
| cosine_recall@5 | 0.5825 |
| cosine_recall@10 | 0.6564 |
| cosine_ndcg@10 | 0.5142 |
| cosine_mrr@10 | 0.4694 |
| cosine_map@100 | 0.4766 |
| dot_accuracy@1 | 0.3823 |
| dot_accuracy@3 | 0.5235 |
| dot_accuracy@5 | 0.5825 |
| dot_accuracy@10 | 0.6564 |
| dot_precision@1 | 0.3823 |
| dot_precision@3 | 0.1745 |
| dot_precision@5 | 0.1165 |
| dot_precision@10 | 0.0656 |
| dot_recall@1 | 0.3823 |
| dot_recall@3 | 0.5235 |
| dot_recall@5 | 0.5825 |
| dot_recall@10 | 0.6564 |
| dot_ndcg@10 | 0.5142 |
| dot_mrr@10 | 0.4694 |
| dot_map@100 | 0.4766 |
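
The cosine and dot rows are identical because the Normalize() module L2-normalizes every embedding, which makes dot product and cosine similarity coincide. As a hedged sketch of how such metrics can be reproduced with the evaluator (the queries, corpus, and relevance mapping below are placeholders, not the card's actual evaluation data):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("rjnClarke/bgem3-shakespeare_st_3")

# Placeholder data: id -> text for queries and corpus, plus a gold relevance map
queries = {"q1": "Who urges his lords to prepare for war with France?"}
corpus = {
    "d1": "Therefore let our proportions for these wars be soon collected",
    "d2": "I may dispose of him.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs)
metrics = evaluator(model)  # computes accuracy@k, precision@k, recall@k, NDCG, MRR, MAP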
Training dataset: an unnamed dataset with columns sentence_0 and sentence_1.

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |

Samples:

| sentence_0 | sentence_1 |
|---|---|
| Who is trying to convince Coriolanus to have mercy on Rome and its citizens? | Enter CORIOLANUS with AUFIDIUS CORIOLANUS. What's the matter? |
| The English nobility receive sad tidings of losses in France and the need for action. | Sad tidings bring I to you out of France, |
| What are the main locations where the characters are headed for battle? | I may dispose of him. |
Loss: MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
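
For context, a minimal sketch of constructing this loss with the listed parameters. MultipleNegativesRankingLoss treats the other pairs in a batch as negatives, so it only needs (anchor, positive) pairs:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("BAAI/bge-m3")

# scale=20.0 multiplies the cosine similarities before the softmax cross-entropy
loss = losses.MultipleNegativesRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)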
Non-default hyperparameters:

- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

Training logs:

| Epoch | Step | Training Loss | cosine_map@100 |
|---|---|---|---|
| 0.3864 | 500 | 0.5974 | - |
| 0.7728 | 1000 | 0.5049 | - |
| 1.0 | 1294 | - | 0.4475 |
| 1.1592 | 1500 | 0.4202 | - |
| 1.5456 | 2000 | 0.2689 | - |
| 1.9320 | 2500 | 0.2452 | - |
| 2.0 | 2588 | - | 0.4758 |
| 2.3184 | 3000 | 0.1700 | - |
| 2.7048 | 3500 | 0.1301 | - |
| 3.0 | 3882 | - | 0.4766 |
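
A hedged reconstruction of how such a run can be set up with the sentence-transformers v3 trainer, using the non-default hyperparameters above; the training pairs are placeholders, since the card's dataset is unnamed:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-m3")

# Placeholder (sentence_0, sentence_1) pairs standing in for the real data
train_dataset = Dataset.from_dict({
    "sentence_0": ["Summary of a scene."],
    "sentence_1": ["The matching passage of play text."],
})

args = SentenceTransformerTrainingArguments(
    output_dir="bgem3-shakespeare_st_3",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoids duplicate in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=losses.MultipleNegativesRankingLoss(model, scale=20.0),
)
trainer.train()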
BibTeX (Sentence Transformers):

@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
BibTeX (MultipleNegativesRankingLoss):

@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: BAAI/bge-m3