Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks ([arXiv:1908.10084](https://arxiv.org/abs/1908.10084))
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
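This summary can be reproduced locally once the library is installed as shown below; a minimal sketch, assuming only that the model loads from the Hub:

```python
from sentence_transformers import SentenceTransformer

# Loading the model and printing it reproduces the summary above.
model = SentenceTransformer("GaniduA/bge-finetuned-olscience")
print(model)                 # SentenceTransformer((0): Transformer(...), (1): Pooling(...))
print(model.max_seq_length)  # 512
```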
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("GaniduA/bge-finetuned-olscience")

# Run inference
sentences = [
    'Discuss the principles and process of electrolysis, including the conventions adopted in electrolysis.',
    'The development of artificial intelligence has significantly impacted the tech industry, leading to advancements in machine learning and natural language processing.',
    "In the movie 'Inception', directed by Christopher Nolan, the plot revolves around a skilled thief who is given a chance at redemption if he can successfully perform inception by planting an idea into someone's subconscious.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
Evaluation with `BinaryClassificationEvaluator` on the `eval` set:

| Metric | Value |
|---|---|
| cosine_accuracy | 1.0 |
| cosine_accuracy_threshold | 0.0571 |
| cosine_f1 | 1.0 |
| cosine_f1_threshold | 0.0571 |
| cosine_precision | 1.0 |
| cosine_recall | 1.0 |
| cosine_ap | 1.0 |
| cosine_mcc | 1.0 |
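As a rough sketch of how such metrics are computed, the library's `BinaryClassificationEvaluator` takes labeled sentence pairs. The pair below is taken from the training samples further down; everything else is illustrative:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("GaniduA/bge-finetuned-olscience")

# One labeled pair for illustration; a real evaluation set has many pairs.
sentences1 = ["Describe the operation of a photodiode in optical sensing."]
sentences2 = [
    "A photodiode converts light into an electrical current by generating "
    "electron-hole pairs when exposed to light, used in optical sensing and "
    "communication applications."
]
labels = [1]  # 1 = similar, 0 = dissimilar

evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="eval")
results = evaluator(model)  # metrics such as the cosine accuracy/F1/AP reported above
```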
The training data consists of sentence pairs with columns `sentence_0`, `sentence_1`, and `label`:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| How does the reaction of zinc with copper sulfate demonstrate a single displacement reaction? | Julius Caesar crossed the Rubicon River in 49 BC, which led to a chain of events culminating in the Roman Civil War. | 0.0 |
| How do you investigate the effect of tightening a screw on the moment of force required to rotate a stick? | Explore the depths of the ocean with a team of deep-sea divers searching for mythical sea creatures and undiscovered shipwrecks. | 0.0 |
| Describe the operation of a photodiode in optical sensing. | A photodiode converts light into an electrical current by generating electron-hole pairs when exposed to light, used in optical sensing and communication applications. | 1.0 |
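A minimal sketch of packing such rows into a `datasets.Dataset` with the same column names; the row shown is the positive sample from the table, and the `datasets` library is assumed to be installed:

```python
from datasets import Dataset

train_dataset = Dataset.from_dict({
    "sentence_0": ["Describe the operation of a photodiode in optical sensing."],
    "sentence_1": [
        "A photodiode converts light into an electrical current by generating "
        "electron-hole pairs when exposed to light, used in optical sensing and "
        "communication applications."
    ],
    "label": [1.0],  # 1.0 = matching pair, 0.0 = unrelated pair
})
```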
The model was trained with `CosineSimilarityLoss` with these parameters: `{"loss_fct": "torch.nn.modules.loss.MSELoss"}`, i.e. the mean squared error between the predicted cosine similarity of a pair and its label.
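Putting the loss together with the non-default hyperparameters listed below, a minimal training sketch; the output directory and the `train_dataset` variable are assumptions, not from this card:

```python
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
loss = CosineSimilarityLoss(model)  # MSE between predicted cosine similarity and label

args = SentenceTransformerTrainingArguments(
    output_dir="bge-finetuned-olscience",  # assumed path
    num_train_epochs=2,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    fp16=True,
    eval_strategy="steps",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # e.g. the Dataset sketched above
    eval_dataset=train_dataset,   # illustration only; use a held-out split
    loss=loss,
)
trainer.train()
```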
The following non-default hyperparameters were used:

- `eval_strategy`: steps
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `num_train_epochs`: 2
- `fp16`: True
- `multi_dataset_batch_sampler`: round_robin

All hyperparameters:

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `tp_size`: 0
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

Training logs:

| Epoch | Step | Training Loss | eval_cosine_ap |
|---|---|---|---|
| 0.0366 | 20 | - | 0.9892 |
| 0.0731 | 40 | - | 0.9978 |
| 0.1097 | 60 | - | 0.9989 |
| 0.1463 | 80 | - | 0.9997 |
| 0.1828 | 100 | - | 0.9999 |
| 0.2194 | 120 | - | 0.9998 |
| 0.2559 | 140 | - | 0.9998 |
| 0.2925 | 160 | - | 0.9998 |
| 0.3291 | 180 | - | 0.9998 |
| 0.3656 | 200 | - | 0.9999 |
| 0.4022 | 220 | - | 0.9998 |
| 0.4388 | 240 | - | 0.9999 |
| 0.4753 | 260 | - | 1.0000 |
| 0.5119 | 280 | - | 1.0000 |
| 0.5484 | 300 | - | 1.0000 |
| 0.5850 | 320 | - | 1.0000 |
| 0.6216 | 340 | - | 1.0000 |
| 0.6581 | 360 | - | 1.0000 |
| 0.6947 | 380 | - | 1.0000 |
| 0.7313 | 400 | - | 1.0000 |
| 0.7678 | 420 | - | 1.0000 |
| 0.8044 | 440 | - | 1.0000 |
| 0.8410 | 460 | - | 1.0000 |
| 0.8775 | 480 | - | 1.0000 |
| 0.9141 | 500 | 0.0199 | 1.0000 |
| 0.9506 | 520 | - | 1.0000 |
| 0.9872 | 540 | - | 1.0000 |
| 1.0000 | 547 | - | 1.0000 |
| 1.0238 | 560 | - | 1.0000 |
| 1.0603 | 580 | - | 1.0000 |
| 1.0969 | 600 | - | 1.0000 |
| 1.1335 | 620 | - | 1.0000 |
| 1.1700 | 640 | - | 1.0000 |
| 1.2066 | 660 | - | 1.0000 |
| 1.2431 | 680 | - | 1.0000 |
| 1.2797 | 700 | - | 1.0000 |
| 1.3163 | 720 | - | 1.0000 |
| 1.3528 | 740 | - | 1.0000 |
| 1.3894 | 760 | - | 1.0000 |
| 1.4260 | 780 | - | 1.0000 |
| 1.4625 | 800 | - | 1.0000 |
| 1.4991 | 820 | - | 1.0000 |
| 1.5356 | 840 | - | 1.0000 |
| 1.5722 | 860 | - | 1.0000 |
| 1.6088 | 880 | - | 1.0000 |
| 1.6453 | 900 | - | 1.0000 |
| 1.6819 | 920 | - | 1.0000 |
| 1.7185 | 940 | - | 1.0000 |
| 1.7550 | 960 | - | 1.0000 |
| 1.7916 | 980 | - | 1.0000 |
| 1.8282 | 1000 | 0.0012 | 1.0000 |
| 1.8647 | 1020 | - | 1.0000 |
| 1.9013 | 1040 | - | 1.0000 |
| 1.9378 | 1060 | - | 1.0000 |
| 1.9744 | 1080 | - | 1.0000 |
| 2.0000 | 1094 | - | 1.0000 |
This model builds on Sentence-BERT; BibTeX:

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
Base model: [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)