This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
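The Pooling module above has only `pooling_mode_mean_tokens` enabled, and it is followed by `Normalize()`. As a rough illustration of what those two steps do, here is a minimal sketch in plain Python. The helper names `mean_pool` and `l2_normalize` and the toy vectors are hypothetical, chosen for this example; the library itself operates on batched tensors.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 0.0], [0.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
sentence_vector = l2_normalize(pooled)
```

Because the output is unit-length, cosine similarity between two sentence vectors reduces to a dot product.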
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Chandar/sv-subject-based-all-MiniLM-L6-v2")
# Run inference
sentences = [
'with\tthe\torigin\tof\tthe\tcoal\tformed\tduring\tthe\tcarboniferous\tepoch,\ttwo\tor\tthree\nconsiderations\tsuggest\tthemselves.\nIn\tthe\tfirst\tplace,\tthe\tgreat\tphantom\tof\tgeological\ttime\trises\tbefore\tthe\tstudent\tof\nthis,\tas\tof\tall\tother,\tfragments\tof\tthe\thistory\tof\tour\tearth—\tspringing\nirrepressibly\tout\tof\tthe\tfacts,\tlike\tthe\tDjin\tfrom\tthe\tjar\twhich\tthe\tfishermen\tso\nincautiously\topened;\tand\tlike\tthe\tDjin\tagain,\tbeing\tvaporous,\tshifting,\tand\nindefinable,\tbut\tunmistakably\tgigantic.\tHowever\tmodest\tthe\tbases\tof\tone\'s\ncalculation\tmay\tbe,\tthe\tminimum\tof\ttime\tassignable\tto\tthe\tcoal\tperiod\tremains\nsomething\tstupendous.\nPrincipal\tDawson\tis\tthe\tlast\tperson\tlikely\tto\tbe\tguilty\tof\texaggeration\tin\tthis\nmatter,\tand\tit\twill\tbe\twell\tto\tconsider\twhat\the\thas\tto\tsay\tabout\tit:—\n"The\trate\tof\taccumulation\tof\tcoal\twas\tvery\tslow.\tThe\tclimate\tof\tthe\tperiod,\tin\nthe\tnorthern\ttemperate\tzone,\twas\tof\tsuch\ta\tcharacter\tthat\tthe\ttrue\tconifers\tshow\nrings\tof\tgrowth,\tnot\tlarger,\tnor\tmuch\tless\tdistinct,\tthan\tthose\tof\tmany\tof\ttheir\nmodern\tcongeners.\tThe\t\nSigillarioe\n\tand\t\nCalamites\n\twere\tnot,\tas\toften\tsupposed,\ncomposed\twholly,\tor\teven\tprincipally,\tof\tlax\tand\tsoft\ttissues,\tor\tnecessarily\nshort-lived.\tThe\tformer\thad,\tit\tis\ttrue,\ta\tvery\tthick\tinner\tbark;\tbut\ttheir\tdense\nwoody\taxis,\ttheir\tthick\tand\tnearly\timperishable\touter\tbark,\tand\ttheir\tscanty\tand\nrigid\tfoliage,\twould\tindicate\tno\tvery\trapid\tgrowth\tor\tdecay.\tIn\tthe\tcase\tof\tthe\nSigillarioe\n,\tthe\tvariations\tin\tthe\tleaf-scars\tin\tdifferent\tparts\tof\tthe\ttrunk,\tthe\nintercalation\tof\tnew\tridges\tat\tthe\tsurface\trepresenting\tthat\tof\tnew\twoody\twedges\nin\tthe\taxis,\tthe\ttransverse\tmarks\tleft\tby\tthe\tstages\tof\tupward\tgrowth,\tall\tindicate\nthat\tseveral\tyears\tmust\thave\tbeen\trequired\tfor\tthe\tgrowth\tof\tstems\tof\tmoderate\nsize.\tThe\tenormous\troots\tof\tthese\ttrees,\tand\tthe\tcondition\tof\tthe\tcoal-swamps,\nmust\thave\texempted\tthem\tfrom\tthe\tdanger\tof\tbeing\toverthrown\tby\tviolence.\nThey\tprobably\tfell\tin\tsuccessive\tgenerations\tfrom\tnatural\tdecay;\tand\tmaking\nevery\tallowance\tfor\tother\tmaterials,\twe\tmay\tsafely\tassert\tthat\tevery\tfoot\tof\nthickness\tof\tpure\tbituminous\tcoal\timplies\tthe\tquiet\tgrowth\tand\tfall\tof\tat\tleast\nfifty\tgenerations\tof\t\nSigillarioe\n,\tand\ttherefore\tan\tundisturbed\tcondition\tof\tforest\ngrowth\tenduring\tthrough\tmany\tcenturies.\tFurther,\tthere\tis\tevidence\tthat\tan\nimmense\tamount\tof\tloose\tparenchymatous\ttissue,\tand\teven\tof\twood,\tperished\tby\ndecay,\tand\twe\tdo\tnot\tknow\tto\twhat\textent\teven\tthe\tmost\tdurable\ttissues\tmay\nhave\tdisappeared\tin\tthis\tway;\tso\tthat,\tin\tmany\tcoal-seams,\twe\tmay\thave\tonly\ta\nvery\tsmall\tpart\tof\tthe\tvegetable\tmatter\tproduced."\nUndoubtedly\tthe\tforce\tof\tthese\treflections\tis\tnot\tdiminished\twhen\tthe',
'31 \n \n \n2.Chapter Two:………………………………………………………….. Causes of Aging \n \n \n There are many types of free radicals and the most related to the \nbiological process are those which derived from oxygen: the Reactive \nOxygen Species ( ROS ). These ROS include superoxide anion , peroxide \nand hydro radicals . ROS are produced in vivo within the mitochondria \nduring electron tra nsport chain. They are also produced as intermediate \nproducts in different enzymatic reactions and by different physiological \nprocesses such as: \n\uf0a7 Phagocytic activity of white blood cells, specifically neu trophils. \nNeutrophils generate ROS during phagocytic activity in order to kill the \ninvading pathogens as a host defense mechanism. \n \n\uf0a7 When the cells are exposed to abnormal conditions -such as hypoxia \nor hperoxia -produce ROS. Some drugs have the ability to induce the \ncells to produce ROS due to their oxidizing effect. \n \n\uf0a7 An exposure to radiation may induce the biological systems to \nproduce ROS.',
'THEINHERITANCEOFDURATION177\npreferredtostatetheconclusionintermsofdeath,rates,\nasitwasoriginallystatedbyPearson,becauseofthe\nbearingithasuponagreatdealofthepublichealth\npropagandasolooselyflungabout.Itneedonlybere-\nmemberedthatthereisaperfectlydefinitefunctional\nrelationbetweendeathrateandaveragedurationoflife\ninanapproximatelystablepopulationgroup,expres-\nsiblebyanequation,inordertoseethatanyconclusion\nastotherelativeinfluenceofheredityandenvironment\nuponthegeneraldeathratemustapplywithequalforce\ntothedurationoflife.\nTHESELECTIVEDEATHBATEINMAN\nIfthedurationo; lifewereinheriteditwouldlogical-\nlybeexpectedthatsomeportionofthedeathratemust\nbeselectiveincharacter.Forinheritanceofduration\noflifecanonlymeanthatwhenapersondiesisinpart\ndeterminedbythatindividual\'sbiologicalconstitutionor\nmakeup.Andequallyitisobviousthatindividualsof\nweakandunsoundconstitutionmust,ontheaverage,\ndieearlierthanthoseofstrong,sound,andvigorouscon-\nstitution."Whenceitfollowsthatthechancesofleaving\noffspringwillbegreaterforthoseofsoundconstitution\nthanfortheweaklings.Themathematicaldiscussion\nwhichhasjustbeengivenindicatesthatfromone-half\ntothree-fourthsofthedeathrateisselectiveinchar-\nacter,becausethatproportionisdeterminedbyhereditary\nfactors.Justinproportionashereditydetermines\nthedeathrate,soisthemortalityselective.Therealityof\nthefactofaselectivedeathrateinmancanbeeasily\nshowngraphically.\nInFigure44areseenthegraphsofsomedatafrom\nEuropeanroyalfamilies,wherenoneglectofchildren,\n12',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
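Since the model ends in `Normalize()`, its embeddings are unit-length, so a dot product gives the cosine similarity directly. The following is a minimal sketch of a top-k lookup over precomputed embeddings. The `top_k` helper and the 2-dimensional toy vectors are hypothetical stand-ins for real 384-dimensional outputs of `model.encode`.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k(query, corpus, k=2):
    """Return (index, score) pairs for the k corpus vectors most similar to query."""
    scored = [(i, dot(query, vec)) for i, vec in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Unit-length toy vectors standing in for sentence embeddings.
corpus = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query = [0.8, 0.6]
ranking = top_k(query, corpus, k=2)
for index, score in ranking:
    print(index, round(score, 2))
```

For larger corpora a vectorized or indexed search would likely be preferable; this loop is only meant to show the scoring rule.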
Evaluated with BinaryClassificationEvaluator:
| Metric | Value |
|---|---|
| cosine_accuracy | 0.7934 |
| cosine_accuracy_threshold | 0.1476 |
| cosine_f1 | 0.6493 |
| cosine_f1_threshold | 0.1067 |
| cosine_precision | 0.6543 |
| cosine_recall | 0.6444 |
| cosine_ap | 0.7384 |
| cosine_mcc | 0.4793 |
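To make the thresholded metrics concrete, here is a minimal sketch of how accuracy, precision, recall, and F1 follow from cosine scores, binary labels, and a cutoff. The `binary_metrics` helper and the toy scores are hypothetical, and treating a score equal to the threshold as positive is an assumption rather than the evaluator's documented behavior. The last lines check that the precision and recall reported above reproduce the reported F1.

```python
def binary_metrics(scores, labels, threshold):
    """Classify pairs as similar when score >= threshold (assumed convention)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy scores and labels, evaluated at the reported accuracy threshold.
toy = binary_metrics([0.9, 0.2, 0.05, 0.4], [1, 0, 0, 1], threshold=0.1476)

# Consistency check on the table: F1 is the harmonic mean of precision and recall.
reported_precision, reported_recall = 0.6543, 0.6444
f1_from_table = (2 * reported_precision * reported_recall
                 / (reported_precision + reported_recall))
```

Note that the table lists separate thresholds for accuracy (0.1476) and F1 (0.1067), so the two metrics are not measured at the same cutoff.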
Columns: sentence1, sentence2, and label

| | sentence1 | sentence2 | label |
|---|---|---|---|
| type | string | string | int |

Samples:

| sentence1 | sentence2 | label |
|---|---|---|
| with the origin of the coal formed during the carboniferous epoch, two or three | organic coenzymes to catalyze its specific chemical reaction. Therefore, enzyme function is, in part, | 1 |
| with the origin of the coal formed during the carboniferous epoch, two or three | Infertility | 1 |
| with the origin of the coal formed during the carboniferous epoch, two or three | Figure 18.13The honeycreeper birds illustrate adaptive radiation. From one original species of bird, | 1 |
CoSENTLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "pairwise_cos_sim"
}
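As a rough illustration of the objective, the sketch below follows my understanding of CoSENT: for every two training pairs where one has a higher label than the other, the loss penalizes the lower-labeled pair for scoring a higher cosine similarity, using `log(1 + sum(exp(scale * (s_low - s_high))))`. The function name `cosent_loss` and the toy inputs are hypothetical; the library computes the same kind of quantity on batched tensors, with `pairwise_cos_sim` supplying one cosine score per (sentence1, sentence2) row.

```python
import math

def cosent_loss(cos_sims, labels, scale=20.0):
    """CoSENT-style ranking loss over per-pair cosine scores (illustrative sketch)."""
    # The leading 0.0 contributes exp(0) = 1, the "1 +" inside the logarithm.
    terms = [0.0]
    for s_high, y_high in zip(cos_sims, labels):
        for s_low, y_low in zip(cos_sims, labels):
            if y_high > y_low:
                # Positive when the lower-labeled pair outscores the higher one.
                terms.append(scale * (s_low - s_high))
    # Numerically stable log-sum-exp.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

well_ordered = cosent_loss([0.9, 0.1], [1, 0])   # similar pair scores higher
mis_ordered = cosent_loss([0.1, 0.9], [1, 0])    # similar pair scores lower
tied_labels = cosent_loss([0.5, 0.5], [1, 1])    # no label ordering to enforce
```

With `scale` at 20.0, a correctly ordered batch contributes almost nothing, a badly ordered one is penalized roughly linearly in the score gap, and a batch whose labels are all equal contributes zero.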
Non-default hyperparameters:
per_device_train_batch_size: 16
per_device_eval_batch_size: 32
learning_rate: 2e-05
weight_decay: 0.01
max_steps: 2000

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 32
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 3.0
max_steps: 2000
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | cosine_ap |
|---|---|---|---|
| -1 | -1 | - | 0.7384 |
| 0.0071 | 500 | 0.5524 | - |
| 0.0142 | 1000 | 0.0016 | - |
| 0.0213 | 1500 | 0.0004 | - |
| 0.0285 | 2000 | 0.0001 | - |
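The epoch column shows that the 2000-step cap ended training long before the configured 3 epochs. The back-of-the-envelope sketch below estimates the steps per epoch and the approximate number of training pairs from the logged rows. It assumes a single device and relies on `gradient_accumulation_steps: 1` from the hyperparameters; because the logged epoch values are rounded to four decimals, the results are only rough estimates, not reported figures.

```python
# (epoch, step) rows copied from the training log table.
logged = [(0.0071, 500), (0.0142, 1000), (0.0213, 1500), (0.0285, 2000)]
batch_size = 16  # per_device_train_batch_size; single device is an assumption

# Each row gives one estimate of how many optimizer steps make up an epoch.
estimates = [step / epoch for epoch, step in logged]
steps_per_epoch = sum(estimates) / len(estimates)

# One step consumes one batch when gradient accumulation is 1.
approx_pairs = steps_per_epoch * batch_size
fraction_of_epoch = logged[-1][1] / steps_per_epoch

print(f"~{steps_per_epoch:,.0f} steps per epoch")
print(f"~{approx_pairs:,.0f} training pairs")
print(f"~{fraction_of_epoch:.1%} of one epoch seen")
```

Under these assumptions the run covered a few percent of a single pass over the data.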
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@online{kexuefm-8847,
title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
author={Su Jianlin},
year={2022},
month={Jan},
url={https://kexue.fm/archives/8847},
}