Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (paper: arXiv:1908.10084)
This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
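The Pooling module above uses mean pooling over token embeddings (padding excluded via the attention mask), and the final Normalize() module L2-normalizes the result. As a rough illustration only — a toy sketch with made-up 3-dimensional embeddings, not the model's real 384-dimensional output — the pooling step works like this:

```python
import torch

def mean_pool(token_embeddings, attention_mask):
    # Expand the mask to the embedding dimension and average
    # only over real (non-padding) tokens.
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Toy batch: 1 sentence, 4 tokens (the last is padding), embedding dim 3.
emb = torch.tensor([[[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [9.0, 9.0, 9.0]]])   # padding row is ignored by the mask
mask = torch.tensor([[1, 1, 1, 0]])

pooled = mean_pool(emb, mask)
# L2-normalize, as the model's final Normalize() module does.
normalized = torch.nn.functional.normalize(pooled, p=2, dim=1)
print(pooled)      # mean of the three real tokens: [[1/3, 1/3, 1/3]]
print(normalized)  # unit-length vector
```

Because the padding row is masked out, the `9.0` values never affect the sentence embedding.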
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")

# Run inference
sentences = [
    'experienced professional skilled in excel, future, president. heavy own politics goal smile. during benefit eight beat pick allow test break. this dark why later gun.',
    'data analyst needed with experience in data cleaning, power bi, sql. agreement meet coach team production concern. politics happy challenge challenge want.',
    'data analyst needed with experience in sql, data cleaning, tableau. movie lead so those moment blue. outside work tree pick man fear administration strong.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.1204, -0.1657],
#         [-0.1204,  1.0000,  0.8512],
#         [-0.1657,  0.8512,  1.0000]])
```
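Since the model's final Normalize() module L2-normalizes every embedding, the similarity computation reduces to a plain dot product between normalized vectors, which is exactly cosine similarity. A small NumPy sketch with toy 3-dimensional vectors (not real embeddings) makes the equivalence concrete:

```python
import numpy as np

# Toy 3-dim "embeddings"; the real model outputs 384-dim vectors.
a = np.array([0.2, 0.5, 0.8])
b = np.array([0.9, 0.1, 0.4])

# L2-normalize, as the model's Normalize() module does.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)

cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot_of_normalized = np.dot(a_n, b_n)
print(np.isclose(cosine, dot_of_normalized))  # True
```

This is why pre-normalized embeddings are convenient for large-scale semantic search: a fast dot-product index gives cosine rankings for free.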
The training dataset has columns sentence_0, sentence_1, and label:

|  | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| sentence_0 | sentence_1 | label |
|---|---|---|
| proficient in problem solving, git, agile, unit testing, data structures, with mid-level experience in the field. holds a phd degree. holds certifications such as microsoft certified azure developer associate. skilled in delivering results and adapting to dynamic environments. | as a software engineer, you will leverage your advanced programming skills to develop cutting-edge software solutions that shape the future of technology. you will work on complex coding challenges, create robust systems, and contribute to innovative projects that require deep technical knowledge and analytical skills. this role requires a keen eye for logic and problem-solving, typically suited for individuals who enjoy working independently and thrive in high-tech environments. the role is perfect for someone with a strong interest in software development, system design, and engineering principles. your work will directly impact the success of major technological products and services. | 1.0 |
| proficient in policy analysis, sustainability, urban development, urban design, zoning laws, with senior-level experience in the field. holds a phd degree. holds certifications such as geographic information systems gis certificate. skilled in delivering results and adapting to dynamic environments. | an urban planner is responsible for designing and developing land use plans and policies that promote sustainable growth and improve the quality of life in urban areas. you will analyze demographics, economic trends, and environmental factors to make recommendations for city development, zoning, and infrastructure projects. the role involves working with government agencies, architects, and developers to create plans that balance urban growth with environmental sustainability. a deep understanding of zoning laws, transportation systems, and social factors is necessary to ensure that urban spaces are functional, efficient, and equitable. | 0.0 |
| proficient in critical thinking, medical terminology, patient care, surgical skills, clinical research, with mid-level experience in the field. holds a masters degree. holds certifications such as basic life support bls. skilled in delivering results and adapting to dynamic environments. | diagnose and treat illnesses, prescribe medication, and provide ongoing patient care. work in various specialties, including surgery, pediatrics, or internal medicine. perform physical exams, order tests, and interpret medical results. collaborate with other healthcare providers to ensure comprehensive care for patients. requires medical expertise, empathy, and strong communication skills. must stay updated on the latest medical research and treatments. | 1.0 |
The model was trained with CosineSimilarityLoss with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```
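Conceptually, CosineSimilarityLoss computes the cosine similarity between the two sentence embeddings of a pair and regresses it onto the float label with MSE. The following is a minimal sketch of that formulation with toy 2-dimensional embeddings — an illustration of the idea, not the library's exact implementation:

```python
import torch
import torch.nn.functional as F

def cosine_similarity_loss(emb_a, emb_b, labels):
    # Predicted similarity in [-1, 1], regressed onto the float labels with MSE.
    preds = F.cosine_similarity(emb_a, emb_b, dim=1)
    return F.mse_loss(preds, labels)

# Toy pairs: the first pair is identical (cosine 1), the second orthogonal (cosine 0).
emb_a = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
emb_b = torch.tensor([[1.0, 0.0], [1.0, 0.0]])
labels = torch.tensor([1.0, 0.0])  # 1.0 = matching pair, 0.0 = non-matching

print(cosine_similarity_loss(emb_a, emb_b, labels))  # tensor(0.)
```

With labels of 1.0 for matching resume/job pairs and 0.0 for non-matching ones, as in the samples above, this loss pushes matching pairs toward cosine similarity 1 and non-matching pairs toward 0.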
Non-default hyperparameters:

- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss |
|---|---|---|
| 0.25 | 500 | 0.1856 |
| 0.5 | 1000 | 0.1724 |
| 0.75 | 1500 | 0.1714 |
| 1.0 | 2000 | 0.1666 |
| 1.25 | 2500 | 0.1595 |
| 1.5 | 3000 | 0.159 |
| 1.75 | 3500 | 0.1613 |
| 2.0 | 4000 | 0.157 |
| 2.25 | 4500 | 0.154 |
| 2.5 | 5000 | 0.1541 |
| 2.75 | 5500 | 0.1511 |
| 3.0 | 6000 | 0.1547 |
| 3.25 | 6500 | 0.1502 |
| 3.5 | 7000 | 0.1469 |
| 3.75 | 7500 | 0.149 |
| 4.0 | 8000 | 0.1473 |
| 4.25 | 8500 | 0.1437 |
| 4.5 | 9000 | 0.1441 |
| 4.75 | 9500 | 0.1409 |
| 5.0 | 10000 | 0.1463 |
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```