This is a sentence-transformers model finetuned from BAAI/bge-small-en-v1.5. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
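The architecture above pools with the CLS token (`pooling_mode_cls_token: True`) and then applies `Normalize()`. As a rough illustration of what those two stages do to the transformer output, here is a minimal NumPy sketch. The function name `cls_pool_and_normalize` and the random toy input are assumptions made for illustration; this is not the library's implementation.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, dim) token embeddings to (batch, dim) unit vectors."""
    # CLS pooling: keep only the embedding of the first token of each sequence.
    cls = token_embeddings[:, 0, :]
    # Normalize: scale each vector to unit L2 length (guarding against zero norm).
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)

# Toy stand-in for transformer output: 2 sequences, 5 tokens, 384 dimensions.
rng = np.random.default_rng(0)
toy_tokens = rng.normal(size=(2, 5, 384))
sentence_embeddings = cls_pool_and_normalize(toy_tokens)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, cosine similarity between two embeddings reduces to a plain dot product.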
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sucharush/bge_MNR")
# Run inference
sentences = [
'Represent this question for retrieving relevant documents: Does low 25-Hydroxyvitamin D Level be Associated with Peripheral Arterial Disease in Type 2 Diabetes Patients?',
'Patients with type 2 diabetes have an increased risk of atherosclerosis and vascular disease. Vitamin D deficiency is associated with vascular disease and is prevalent in diabetes patients. We undertook this study to determine the association between 25-hydroxyvitamin D (25[OH]D) levels and prevalence of peripheral arterial disease (PAD) in type 2 diabetes patients. A total of 1028 type 2 diabetes patients were recruited at Nanjing Medical University Affiliated Nanjing Hospital from November 2011 to October 2013. PAD was defined as an ankle-brachial index (ABI)\xa0<\xa00.9. Cardiovascular risk factors (blood pressure, HbA1c, lipid profile), comorbidities, carotid intima-media thickness (IMT) and 25(OH)D were assessed. Overall prevalence of PAD and of decreased 25(OH)D (<30\xa0ng/mL) were 20.1% (207/1028) and 54.6% (561/1028), respectively. PAD prevalence was higher in participants with decreased (23.9%) than in those with normal (15.6%) 25(OH)D (≥30\xa0ng/mL, p\xa0<0.01). Decreased 25(OH)D was associated with increased risk of PAD (odds ratio [OR], 1.69, 95% CI: 1.17-2.44, p\xa0<0.001) and PAD was significantly more likely to occur in participants ≥65\xa0years of age (OR, 2.56, 95% CI: 1.51 -4.48, vs. 1.21, 95% CI: 0.80-1.83, p-interaction\xa0=\xa00.027). After adjusting for known cardiovascular risk factors and potential confounding variables, the association of decreased 25(OH)D and PAD remained significant in patients <65\xa0years of age (OR, 1.55; 95% CI: 1.14-2.12, p\xa0=\xa00.006).',
'Based on the information provided, we only know the number of patients who died within the first year after the surgery. To determine the probability of a patient surviving at least two years, we would need additional information about the number of patients who died in the second year or survived beyond that.\n\nWithout this information, it is not possible to calculate the probability of a patient surviving at least two years after the surgery.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
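For retrieval, the example sentences in this card prefix each question with "Represent this question for retrieving relevant documents: " and leave the documents unprefixed. The sketch below shows how a query embedding could be ranked against document embeddings by cosine similarity. The helper names `add_query_prompt` and `rank_by_cosine` are hypothetical, and the three-dimensional vectors are toy stand-ins for real 384-dimensional embeddings.

```python
import numpy as np

# Prefix taken from the example questions in this card.
QUERY_PROMPT = "Represent this question for retrieving relevant documents: "

def add_query_prompt(question: str) -> str:
    """Prepend the query prompt unless the question already carries it."""
    return question if question.startswith(QUERY_PROMPT) else QUERY_PROMPT + question

def rank_by_cosine(query_emb: np.ndarray, doc_embs: np.ndarray):
    """Return document indices sorted by descending cosine similarity, with scores."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)
    return order, scores[order]

# Toy embeddings standing in for model.encode(...) output.
toy_query = np.array([1.0, 0.0, 0.0])
toy_docs = np.array([[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]])
order, scores = rank_by_cosine(toy_query, toy_docs)
print(order.tolist())
```

With the real model, `toy_query` and `toy_docs` would be replaced by `model.encode(add_query_prompt(question))` and `model.encode(documents)`.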
ir-eval, evaluated with main.LoggingEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.9241 |
| cosine_accuracy@3 | 0.9788 |
| cosine_accuracy@5 | 0.9906 |
| cosine_accuracy@10 | 0.9965 |
| cosine_precision@1 | 0.9241 |
| cosine_precision@3 | 0.3263 |
| cosine_precision@5 | 0.1981 |
| cosine_recall@1 | 0.9241 |
| cosine_recall@3 | 0.9788 |
| cosine_recall@5 | 0.9906 |
| cosine_ndcg@10 | 0.9635 |
| cosine_mrr@10 | 0.9525 |
| cosine_map@100 | 0.9526 |
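In the table, recall@k equals accuracy@k and precision@k is close to accuracy@k divided by k. That pattern is what one would expect if each query has exactly one relevant document, although the card does not state this explicitly. Under that assumption, the metrics can be sketched from the rank of the relevant document for each query. The function `ir_metrics` and the toy ranks are hypothetical and are not the evaluator's code.

```python
import math

def ir_metrics(ranks, k):
    """Compute retrieval metrics at cutoff k, assuming ONE relevant document per query.

    ranks: 1-based rank of the relevant document per query, or None if not retrieved.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one relevant item among the top k
        "recall": accuracy,         # one relevant item, so recall is hit or miss
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # The ideal DCG is 1 when there is a single relevant document.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

toy_ranks = [1, 1, 2, 4, None]
metrics_at_3 = ir_metrics(toy_ranks, k=3)
print(metrics_at_3)

# Consistency check against the reported values in the table.
print(round(0.9788 / 3, 4), round(0.9906 / 5, 4))
```

The last line reproduces the reported cosine_precision@3 (0.3263) and cosine_precision@5 (0.1981) from the accuracy values, which supports the one-relevant-document reading.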
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |

| sentence_0 | sentence_1 |
|---|---|
| Represent this question for retrieving relevant documents: Are elevated levels of pro-inflammatory oxylipins in older subjects normalized by flaxseed consumption? | Oxylipins, including eicosanoids, are highly bioactive molecules endogenously produced from polyunsaturated fatty acids. Oxylipins play a key role in chronic disease progression. It is possible, but unknown, if oxylipin concentrations change with the consumption of functional foods or differ with subject age. Therefore, in a parallel comparator trial, 20 healthy individuals were recruited into a younger (19-28years) or older (45-64years) age group (n=10/group). Participants ingested one muffin/day containing 30g of milled flaxseed (6g alpha-linolenic acid) for 4weeks. Plasma oxylipins were isolated through solid phase extraction, analyzed with HPLC-MS/MS targeted lipidomics, and quantified with the stable isotope dilution method. At baseline, the older group exhibited 13 oxylipins ≥2-fold the concentration of the younger group. Specifically, pro-inflammatory oxylipins 5-hydroxyeicosatetraenoic acid, 9,10,13-trihydroxyoctadecenoic acid, and 9,12,13-trihydroxyoctadecenoic acid were signi... |
| Represent this question for retrieving relevant documents: Find the isometries of the metric $ds^2 = dx^2 + dy^2$ over the rectangle $R=[0,a] \times [0,b]$, subject to the additional condition that any isometry $f$ maps $(0,0)$ to $(x_0, y_0)$. Find $x_0$ and $y_0$ such that the isometry $f$ is given by $f(x,y) = (x_0 + x, y_0 - y)$. | An isometry is a transformation that preserves the distance between points. In this case, we are looking for transformations that preserve the metric $ds^2 = dx^2 + dy^2$. Let's consider the transformation $f(x,y) = (x_0 + x, y_0 - y)$ and find the conditions on $x_0$ and $y_0$ for it to be an isometry. |
| Represent this question for retrieving relevant documents: Do two di-leucine motifs regulate trafficking and function of mouse ASIC2a? | Acid-sensing ion channels (ASICs) are proton-gated cation channels that mediate acid-induced responses in neurons. ASICs are important for mechanosensation, learning and memory, fear, pain, and neuronal injury. ASIC2a is widely expressed in the nervous system and modulates ASIC channel trafficking and activity in both central and peripheral systems. Here, to better understand mechanisms regulating ASIC2a, we searched for potential protein motifs that regulate ASIC2a trafficking. |

MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim"
}
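To make the two parameters concrete: this loss, following Henderson et al. (2017), treats each (sentence_0, sentence_1) pair in a batch as a positive and every other sentence_1 in the same batch as a negative. The cosine similarities are multiplied by `scale` and passed through a softmax cross-entropy whose target is the matching pair. The NumPy function below is a minimal sketch of that computation with hypothetical names and toy inputs; it is not the library's code.

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives ranking loss over (batch, dim) anchor/positive embeddings."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    # logits[i, j] = scale * cos_sim(anchor_i, positive_j)
    logits = scale * (a @ p.T)
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct "class" for anchor i is positive i, i.e. the diagonal.
    return float(-np.mean(np.diag(log_probs)))

toy = np.eye(4)
aligned_loss = mnr_loss(toy, toy)                          # every anchor matches its positive
misaligned_loss = mnr_loss(toy, np.roll(toy, 1, axis=0))   # every positive is wrong
print(aligned_loss, misaligned_loss)
```

A larger `scale` sharpens the softmax, so well-separated pairs drive the loss toward zero more quickly. Larger batches supply more in-batch negatives per anchor.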
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 1
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss | ir-eval_cosine_ndcg@10 |
|---|---|---|---|
| 0.1631 | 500 | 0.021 | 0.9523 |
| 0.3262 | 1000 | 0.0069 | 0.9600 |
| 0.4892 | 1500 | 0.0051 | 0.9593 |
| 0.6523 | 2000 | 0.0055 | 0.9605 |
| 0.8154 | 2500 | 0.0053 | 0.9638 |
| 0.9785 | 3000 | 0.0056 | 0.9634 |
| 1.0 | 3066 | - | 0.9635 |
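The Epoch column is simply the step count divided by the 3066 steps in the single training epoch. Combined with the reported batch size of 32, learning rate of 5e-05, and linear scheduler with no warmup, that allows a rough reconstruction of the training schedule. The sketch below assumes a single device (the card does not say how many were used) and uses the hypothetical helper `lr_at`. The pair count is an upper bound, because the last batch may be smaller.

```python
steps_per_epoch = 3066   # final step in the training log
batch_size = 32          # per_device_train_batch_size
peak_lr = 5e-05          # learning_rate
logged_steps = [500, 1000, 1500, 2000, 2500, 3000]

# Fraction of the epoch completed at each logged step.
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]
print(epochs)

# Upper bound on training pairs seen, ASSUMING one device and
# gradient_accumulation_steps = 1 (the latter is stated in the card).
max_pairs = steps_per_epoch * batch_size
print(max_pairs)

def lr_at(step: int) -> float:
    """Linear decay from peak_lr to 0 over the run, with zero warmup steps."""
    return peak_lr * max(0.0, (steps_per_epoch - step) / steps_per_epoch)

print(lr_at(0), lr_at(1533), lr_at(3066))
```

The computed epoch fractions match the Epoch column of the training log.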
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: BAAI/bge-small-en-v1.5