How to use B0ketto/tmp_trainer with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("B0ketto/tmp_trainer")
sentences = [
"Enforcement of minor traffic offenses leads to the discovery of more serious crimes.",
"Western culture has created independent women who are strong on their own and do not need the protection or support of their husband. This reduces the subjugation of women.",
"Philando Castile, stopped for a broken taillight, was shot seven times and killed trying to comply with the officer's request for identification.",
"The children will have several older / more mature stepmothers."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from B0ketto/tmp_trainer. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
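The Pooling and Normalize modules listed above can be illustrated without loading the model. The following is a minimal pure-Python sketch of mean pooling followed by L2 normalization, not the library's implementation; the function names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are assumptions made for illustration (the model itself works with 768 dimensions).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# Hypothetical toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 2.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)          # mean of the first two tokens
sentence_embedding = l2_normalize(pooled)  # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the final module normalizes every embedding to unit length, the dot product of two embeddings equals their cosine similarity.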
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("B0ketto/tmp_trainer")
# Run inference
sentences = [
'For children, it is bad to grow up in a polygamous family.',
'Polygamous families tend to have more children.',
'This threatens the idea of true democracy.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
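For semantic search, the similarity scores are typically used to rank candidate sentences against a query. The sketch below shows that ranking step on its own, in pure Python, with hand-written toy vectors standing in for the output of `model.encode`; the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical and not part of the sentence-transformers API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, corpus_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(idx, cosine_similarity(query_vec, vec))
              for idx, vec in enumerate(corpus_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy stand-ins for embeddings (real ones have 768 dimensions).
toy_query = [1.0, 0.0, 0.0]
toy_corpus = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # close to the query
    [-1.0, 0.0, 0.0],  # opposite direction
]

ranking = rank_by_similarity(toy_query, toy_corpus)
print([idx for idx, _ in ranking])
```

With real embeddings, `toy_query` and `toy_corpus` would be replaced by rows of the array returned by `model.encode`.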
Columns: sentence1, sentence2, and label

| | sentence1 | sentence2 | label |
|---|---|---|---|
| type | string | string | int |
| sentence1 | sentence2 | label |
|---|---|---|
| Public opinion favors euthanasia which suggests some support for a right to die. | Europeans generally support euthanasia. For example, more than 70% of citizens of Spain, Germany, France and Britain are in favor. | 1 |
| Public opinion favors euthanasia which suggests some support for a right to die. | In the US, support for assisted suicide has risen to 69% acceptance rate in the last few decades. | 1 |
| Public opinion favors euthanasia which suggests some support for a right to die. | The young and healthy that are asked in polls cannot imagine a situation of disability. This, so the criticism goes, blurs their image of euthanasia. | 0 |
ContrastiveLoss with these parameters:
{
"distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
"margin": 0.5,
"size_average": true
}
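To make these parameters concrete, here is a small pure-Python sketch of a contrastive loss in the form introduced by Hadsell et al., using cosine distance, a margin of 0.5, and averaging over the batch. It is an illustration under those assumptions, not the library's code; `cosine_distance` and `contrastive_loss` are hypothetical helper names, and the vectors are toy values.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, so identical directions give 0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def contrastive_loss(pairs, labels, margin=0.5, size_average=True):
    """Pull label-1 pairs together; push label-0 pairs at least `margin` apart."""
    losses = []
    for (a, b), label in zip(pairs, labels):
        d = cosine_distance(a, b)
        positive_term = label * d ** 2                         # penalize distant positives
        negative_term = (1 - label) * max(margin - d, 0.0) ** 2  # penalize close negatives
        losses.append(0.5 * (positive_term + negative_term))
    total = sum(losses)
    return total / len(losses) if size_average else total

same = ([1.0, 0.0], [1.0, 0.0])        # distance 0
orthogonal = ([1.0, 0.0], [0.0, 1.0])  # distance 1

loss_easy = contrastive_loss([same, orthogonal], [1, 0])  # both pairs already satisfied
loss_hard = contrastive_loss([orthogonal, same], [1, 0])  # both pairs violated
print(loss_easy, loss_hard)
```

A negative pair contributes nothing once its distance exceeds the margin, which is why only negatives closer than 0.5 in cosine distance affect training under this setting.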
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3.0
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0609 | 500 | 0.0256 |
| 0.1218 | 1000 | 0.0257 |
| 0.1826 | 1500 | 0.0263 |
| 0.2435 | 2000 | 0.0291 |
| 0.3044 | 2500 | 0.0276 |
| 0.3653 | 3000 | 0.0304 |
| 0.4262 | 3500 | 0.0297 |
| 0.4870 | 4000 | 0.0332 |
| 0.5479 | 4500 | 0.0330 |
| 0.6088 | 5000 | 0.0328 |
| 0.6697 | 5500 | 0.0328 |
| 0.7305 | 6000 | 0.0331 |
| 0.7914 | 6500 | 0.0321 |
| 0.8523 | 7000 | 0.0326 |
| 0.9132 | 7500 | 0.0329 |
| 0.9741 | 8000 | 0.0318 |
| 1.0349 | 8500 | 0.0323 |
| 1.0958 | 9000 | 0.0321 |
| 1.1567 | 9500 | 0.0321 |
| 1.2176 | 10000 | 0.0322 |
| 1.2785 | 10500 | 0.0321 |
| 1.3393 | 11000 | 0.0317 |
| 1.4002 | 11500 | 0.0317 |
| 1.4611 | 12000 | 0.0315 |
| 1.5220 | 12500 | 0.0318 |
| 1.5829 | 13000 | 0.0319 |
| 1.6437 | 13500 | 0.0315 |
| 1.7046 | 14000 | 0.0313 |
| 1.7655 | 14500 | 0.0294 |
| 1.8264 | 15000 | 0.0292 |
| 1.8873 | 15500 | 0.0278 |
| 1.9481 | 16000 | 0.0286 |
| 2.0090 | 16500 | 0.0274 |
| 2.0699 | 17000 | 0.0273 |
| 2.1308 | 17500 | 0.0270 |
| 2.1916 | 18000 | 0.0271 |
| 2.2525 | 18500 | 0.0265 |
| 2.3134 | 19000 | 0.0262 |
| 2.3743 | 19500 | 0.0254 |
| 2.4352 | 20000 | 0.0255 |
| 2.4960 | 20500 | 0.0256 |
| 2.5569 | 21000 | 0.0252 |
| 2.6178 | 21500 | 0.0246 |
| 2.6787 | 22000 | 0.0251 |
| 2.7396 | 22500 | 0.0238 |
| 2.8004 | 23000 | 0.0250 |
| 2.8613 | 23500 | 0.0247 |
| 2.9222 | 24000 | 0.0252 |
| 2.9831 | 24500 | 0.0237 |
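The epoch and step columns in the log allow a rough estimate of the training set size, which the card does not state directly. The sketch below does that arithmetic from three rows of the table. It assumes a single training device (`num_devices = 1` is not given in the card) and relies on the rounded epoch values shown above, so the result is approximate.

```python
# (epoch, step) pairs copied from the training log
log_rows = [(0.0609, 500), (1.0349, 8500), (2.9831, 24500)]

# Values from the hyperparameter list
per_device_train_batch_size = 8
gradient_accumulation_steps = 1
num_train_epochs = 3

# Assumption: one device; the card does not say how many were used.
num_devices = 1

# Each row gives an independent estimate of optimizer steps per epoch.
estimates = [step / epoch for epoch, step in log_rows]
steps_per_epoch = round(sum(estimates) / len(estimates))

samples_per_step = per_device_train_batch_size * gradient_accumulation_steps * num_devices
approx_training_pairs = steps_per_epoch * samples_per_step
approx_total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, approx_training_pairs, approx_total_steps)
```

Under these assumptions the log is consistent with roughly 8,200 steps per epoch, which corresponds to a training set on the order of 65,000 sentence pairs.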
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@inproceedings{hadsell2006dimensionality,
author={Hadsell, R. and Chopra, S. and LeCun, Y.},
booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
title={Dimensionality Reduction by Learning an Invariant Mapping},
year={2006},
volume={2},
number={},
pages={1735-1742},
doi={10.1109/CVPR.2006.100}
}