How to use swardiantara/bert-tiny-sst5-full-fixed-cosine with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("swardiantara/bert-tiny-sst5-full-fixed-cosine")
sentences = [
"a stirring , funny and finally transporting re-imagining of beauty and the beast and 1930s horror films",
"like mike is a slight and uninventive movie : like the exalted michael jordan referred to in the title , many can aspire but none can equal .",
"i 've had more interesting -- and , dare i say , thematically complex -- bowel movements than this long-on-the-shelf , point-and-shoot exercise in gimmicky crime drama .",
"the name says it all ."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from google/bert_uncased_L-2_H-128_A-2 on the generator dataset. It maps sentences and paragraphs to a 128-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
(1): Pooling({'embedding_dimension': 128, 'pooling_mode': 'mean', 'include_prompt': True})
)
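
The architecture printout above ends in a Pooling module with `pooling_mode: mean`. As a rough illustration of what mean pooling does, the sketch below averages the token vectors of one sentence while skipping padding positions. It is a plain-Python toy, not the library's implementation; the function name `mean_pool` and the 4-dimensional input are made up for the example (the real model produces 128-dimensional token embeddings).

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padding positions."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: 3 token positions, hidden size 4 (hypothetical values).
tokens = [
    [1.0, 1.0, 1.0, 1.0],
    [3.0, 3.0, 3.0, 3.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position, ignored via the mask
]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 2.0, 2.0, 2.0]
```

The padded third vector does not affect the result, which is why the attention mask matters when sentences in a batch have different lengths.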
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("swardiantara/bert-tiny-sst5-full-fixed-cosine")
# Run inference
sentences = [
'a stirring , funny and finally transporting re-imagining of beauty and the beast and 1930s horror films',
"... feels as if -lrb- there 's -rrb- a choke leash around your neck so director nick cassavetes can give it a good , hard yank whenever he wants you to feel something .",
"what with the incessant lounge music playing in the film 's background , you may mistake love liza for an adam sandler chanukah song .",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 128]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.2197, 0.2653],
# [0.2197, 1.0000, 0.2309],
# [0.2653, 0.2309, 1.0000]])
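
The similarity matrix printed above is symmetric with ones on the diagonal, which is what cosine similarity produces. Assuming `model.similarity` is using cosine similarity here, the following plain-Python sketch shows the same computation on toy 2-dimensional vectors, plus a simple ranking step of the kind used for retrieval. The helper names (`cosine_similarity`, `similarity_matrix`, `rank_by_similarity`) and the toy vectors are hypothetical and are not part of the sentence-transformers API.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, analogous to the 3x3 tensor above."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


def rank_by_similarity(query, corpus):
    """Indices of corpus vectors, most similar to the query first."""
    scores = [cosine_similarity(query, doc) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Toy embeddings standing in for model.encode(...) output (hypothetical values).
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = similarity_matrix(toy)
ranking = rank_by_similarity([1.0, 0.2], toy)
print(ranking)  # [0, 2, 1]
```

For real use, the vectors would come from `model.encode` and the corpus embeddings would normally be computed once and reused across queries.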
Columns: text_a, text_b, and label

| | text_a | text_b | label |
|---|---|---|---|
| type | string | string | list |
| modality | text | text | |
| details | | | |

| text_a | text_b | label |
|---|---|---|
| a stirring , funny and finally transporting re-imagining of beauty and the beast and 1930s horror films | apparently reassembled from the cutting-room floor of any given daytime soap . | [0.0, 0.75] |
| a stirring , funny and finally transporting re-imagining of beauty and the beast and 1930s horror films | they presume their audience wo n't sit still for a sociology lesson , however entertainingly presented , so they trot out the conventional science-fiction elements of bug-eyed monsters and futuristic women in skimpy clothes . | [0.0, 0.75] |
| a stirring , funny and finally transporting re-imagining of beauty and the beast and 1930s horror films | the entire movie is filled with deja vu moments . | [0.0, 0.5] |

Loss: main.OrdinalProxyContrastiveLoss

Non-default hyperparameters:

- per_device_train_batch_size: 1024
- num_train_epochs: 10
- learning_rate: 2e-05
- load_best_model_at_end: True

All hyperparameters:

- per_device_train_batch_size: 1024
- num_train_epochs: 10
- max_steps: -1
- learning_rate: 2e-05
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_steps: 0
- optim: adamw_torch
- optim_args: None
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- optim_target_modules: None
- gradient_accumulation_steps: 1
- average_tokens_across_devices: True
- max_grad_norm: 1.0
- label_smoothing_factor: 0.0
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- use_liger_kernel: False
- liger_kernel_config: None
- use_cache: False
- neftune_noise_alpha: None
- torch_empty_cache_steps: None
- auto_find_batch_size: False
- log_on_each_node: True
- logging_nan_inf_filter: True
- include_num_input_tokens_seen: no
- log_level: passive
- log_level_replica: warning
- disable_tqdm: False
- project: huggingface
- trackio_space_id: None
- trackio_bucket_id: None
- trackio_static_space_id: None
- per_device_eval_batch_size: 8
- prediction_loss_only: True
- eval_on_start: False
- eval_do_concat_batches: True
- eval_use_gather_object: False
- eval_accumulation_steps: None
- include_for_metrics: []
- batch_eval_metrics: False
- save_only_model: False
- save_on_each_node: False
- enable_jit_checkpoint: False
- push_to_hub: False
- hub_private_repo: None
- hub_model_id: None
- hub_strategy: every_save
- hub_always_push: False
- hub_revision: None
- load_best_model_at_end: True
- ignore_data_skip: False
- restore_callback_states_from_checkpoint: False
- full_determinism: False
- seed: 42
- data_seed: None
- use_cpu: False
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- dataloader_prefetch_factor: None
- remove_unused_columns: True
- label_names: None
- train_sampling_strategy: random
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- ddp_static_graph: None
- ddp_backend: None
- ddp_timeout: 1800
- fsdp: None
- fsdp_config: None
- deepspeed: None
- debug: []
- skip_memory_metrics: True
- do_predict: False
- resume_from_checkpoint: None
- warmup_ratio: None
- local_rank: -1
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0140 | 500 | 0.0465 |
| 0.0281 | 1000 | 0.0441 |
| 0.0421 | 1500 | 0.0425 |
| 0.0561 | 2000 | 0.0409 |
| 0.0701 | 2500 | 0.0389 |
| 0.0842 | 3000 | 0.0367 |
| 0.0982 | 3500 | 0.0345 |
| 0.1122 | 4000 | 0.0327 |
| 0.1263 | 4500 | 0.0307 |
| 0.1403 | 5000 | 0.0281 |
| 0.1543 | 5500 | 0.0242 |
| 0.1683 | 6000 | 0.0199 |
| 0.1824 | 6500 | 0.0162 |
| 0.1964 | 7000 | 0.0131 |
| 0.2104 | 7500 | 0.0106 |
| 0.2245 | 8000 | 0.0084 |
| 0.2385 | 8500 | 0.0067 |
| 0.2525 | 9000 | 0.0053 |
| 0.2665 | 9500 | 0.0042 |
| 0.2806 | 10000 | 0.0034 |
| 0.2946 | 10500 | 0.0028 |
| 0.3086 | 11000 | 0.0023 |
| 0.3227 | 11500 | 0.0019 |
| 0.3367 | 12000 | 0.0017 |
| 0.3507 | 12500 | 0.0014 |
| 0.3647 | 13000 | 0.0012 |
| 0.3788 | 13500 | 0.0011 |
| 0.3928 | 14000 | 0.0010 |
| 0.4068 | 14500 | 0.0008 |
| 0.4209 | 15000 | 0.0007 |
| 0.4349 | 15500 | 0.0006 |
| 0.4489 | 16000 | 0.0006 |
| 0.4629 | 16500 | 0.0005 |
| 0.4770 | 17000 | 0.0005 |
| 0.4910 | 17500 | 0.0004 |
| 0.5050 | 18000 | 0.0004 |
| 0.5191 | 18500 | 0.0004 |
| 0.5331 | 19000 | 0.0003 |
| 0.5471 | 19500 | 0.0003 |
| 0.5612 | 20000 | 0.0003 |
| 0.5752 | 20500 | 0.0003 |
| 0.5892 | 21000 | 0.0002 |
| 0.6032 | 21500 | 0.0002 |
| 0.6173 | 22000 | 0.0002 |
| 0.6313 | 22500 | 0.0002 |
| 0.6453 | 23000 | 0.0002 |
| 0.6594 | 23500 | 0.0002 |
| 0.6734 | 24000 | 0.0002 |
| 0.6874 | 24500 | 0.0001 |
| 0.7014 | 25000 | 0.0001 |
| 0.7155 | 25500 | 0.0001 |
| 0.7295 | 26000 | 0.0001 |
| 0.7435 | 26500 | 0.0001 |
| 0.7576 | 27000 | 0.0001 |
| 0.7716 | 27500 | 0.0001 |
| 0.7856 | 28000 | 0.0001 |
| 0.7996 | 28500 | 0.0001 |
| 0.8137 | 29000 | 0.0001 |
| 0.8277 | 29500 | 0.0001 |
| 0.8417 | 30000 | 0.0001 |
| 0.8558 | 30500 | 0.0001 |
| 0.8698 | 31000 | 0.0001 |
| 0.8838 | 31500 | 0.0001 |
| 0.8978 | 32000 | 0.0001 |
| 0.9119 | 32500 | 0.0001 |
| 0.9259 | 33000 | 0.0001 |
| 0.9399 | 33500 | 0.0001 |
| 0.9540 | 34000 | 0.0001 |
| 0.9680 | 34500 | 0.0001 |
| 0.9820 | 35000 | 0.0001 |
| 0.9960 | 35500 | 0.0000 |
| 1.0 | 35641 | - |
| 1.0101 | 36000 | 0.0000 |
| 1.0241 | 36500 | 0.0000 |
| 1.0381 | 37000 | 0.0000 |
| 1.0522 | 37500 | 0.0000 |
| 1.0662 | 38000 | 0.0000 |
| 1.0802 | 38500 | 0.0000 |
| 1.0942 | 39000 | 0.0000 |
| 1.1083 | 39500 | 0.0000 |
| 1.1223 | 40000 | 0.0000 |
| 1.1363 | 40500 | 0.0000 |
| 1.1504 | 41000 | 0.0000 |
| 1.1644 | 41500 | 0.0000 |
| 1.1784 | 42000 | 0.0000 |
| 1.1924 | 42500 | 0.0000 |
| 1.2065 | 43000 | 0.0000 |
| 1.2205 | 43500 | 0.0000 |
| 1.2345 | 44000 | 0.0000 |
| 1.2486 | 44500 | 0.0000 |
| 1.2626 | 45000 | 0.0000 |
| 1.2766 | 45500 | 0.0000 |
| 1.2906 | 46000 | 0.0000 |
| 1.3047 | 46500 | 0.0000 |
| 1.3187 | 47000 | 0.0000 |
| 1.3327 | 47500 | 0.0000 |
| 1.3468 | 48000 | 0.0000 |
| 1.3608 | 48500 | 0.0000 |
| 1.3748 | 49000 | 0.0000 |
| 1.3888 | 49500 | 0.0000 |
| 1.4029 | 50000 | 0.0000 |
| 1.4169 | 50500 | 0.0000 |
| 1.4309 | 51000 | 0.0000 |
| 1.4450 | 51500 | 0.0000 |
| 1.4590 | 52000 | 0.0000 |
| 1.4730 | 52500 | 0.0000 |
| 1.4871 | 53000 | 0.0000 |
| 1.5011 | 53500 | 0.0000 |
| 1.5151 | 54000 | 0.0000 |
| 1.5291 | 54500 | 0.0000 |
| 1.5432 | 55000 | 0.0000 |
| 1.5572 | 55500 | 0.0000 |
| 1.5712 | 56000 | 0.0000 |
| 1.5853 | 56500 | 0.0000 |
| 1.5993 | 57000 | 0.0000 |
| 1.6133 | 57500 | 0.0000 |
| 1.6273 | 58000 | 0.0000 |
| 1.6414 | 58500 | 0.0000 |
| 1.6554 | 59000 | 0.0000 |
| 1.6694 | 59500 | 0.0000 |
| 1.6835 | 60000 | 0.0000 |
| 1.6975 | 60500 | 0.0000 |
| 1.7115 | 61000 | 0.0000 |
| 1.7255 | 61500 | 0.0000 |
| 1.7396 | 62000 | 0.0000 |
| 1.7536 | 62500 | 0.0000 |
| 1.7676 | 63000 | 0.0000 |
| 1.7817 | 63500 | 0.0000 |
| 1.7957 | 64000 | 0.0000 |
| 1.8097 | 64500 | 0.0000 |
| 1.8237 | 65000 | 0.0000 |
| 1.8378 | 65500 | 0.0000 |
| 1.8518 | 66000 | 0.0000 |
| 1.8658 | 66500 | 0.0000 |
| 1.8799 | 67000 | 0.0000 |
| 1.8939 | 67500 | 0.0000 |
| 1.9079 | 68000 | 0.0000 |
| 1.9219 | 68500 | 0.0000 |
| 1.9360 | 69000 | 0.0000 |
| 1.9500 | 69500 | 0.0000 |
| 1.9640 | 70000 | 0.0000 |
| 1.9781 | 70500 | 0.0000 |
| 1.9921 | 71000 | 0.0000 |
| 2.0 | 71282 | - |
| 2.0061 | 71500 | 0.0000 |
| 2.0201 | 72000 | 0.0000 |
| 2.0342 | 72500 | 0.0000 |
| 2.0482 | 73000 | 0.0000 |
| 2.0622 | 73500 | 0.0000 |
| 2.0763 | 74000 | 0.0000 |
| 2.0903 | 74500 | 0.0000 |
| 2.1043 | 75000 | 0.0000 |
| 2.1183 | 75500 | 0.0000 |
| 2.1324 | 76000 | 0.0000 |
| 2.1464 | 76500 | 0.0000 |
| 2.1604 | 77000 | 0.0000 |
| 2.1745 | 77500 | 0.0000 |
| 2.1885 | 78000 | 0.0000 |
| 2.2025 | 78500 | 0.0000 |
| 2.2165 | 79000 | 0.0000 |
| 2.2306 | 79500 | 0.0000 |
| 2.2446 | 80000 | 0.0000 |
| 2.2586 | 80500 | 0.0000 |
| 2.2727 | 81000 | 0.0000 |
| 2.2867 | 81500 | 0.0000 |
| 2.3007 | 82000 | 0.0000 |
| 2.3147 | 82500 | 0.0000 |
| 2.3288 | 83000 | 0.0000 |
| 2.3428 | 83500 | 0.0000 |
| 2.3568 | 84000 | 0.0000 |
| 2.3709 | 84500 | 0.0000 |
| 2.3849 | 85000 | 0.0000 |
| 2.3989 | 85500 | 0.0000 |
| 2.4130 | 86000 | 0.0000 |
| 2.4270 | 86500 | 0.0000 |
| 2.4410 | 87000 | 0.0000 |
| 2.4550 | 87500 | 0.0000 |
| 2.4691 | 88000 | 0.0000 |
| 2.4831 | 88500 | 0.0000 |
| 2.4971 | 89000 | 0.0000 |
| 2.5112 | 89500 | 0.0000 |
| 2.5252 | 90000 | 0.0000 |
| 2.5392 | 90500 | 0.0000 |
| 2.5532 | 91000 | 0.0000 |
| 2.5673 | 91500 | 0.0000 |
| 2.5813 | 92000 | 0.0000 |
| 2.5953 | 92500 | 0.0000 |
| 2.6094 | 93000 | 0.0000 |
| 2.6234 | 93500 | 0.0000 |
| 2.6374 | 94000 | 0.0000 |
| 2.6514 | 94500 | 0.0000 |
| 2.6655 | 95000 | 0.0000 |
| 2.6795 | 95500 | 0.0000 |
| 2.6935 | 96000 | 0.0000 |
| 2.7076 | 96500 | 0.0000 |
| 2.7216 | 97000 | 0.0000 |
| 2.7356 | 97500 | 0.0000 |
| 2.7496 | 98000 | 0.0000 |
| 2.7637 | 98500 | 0.0000 |
| 2.7777 | 99000 | 0.0000 |
| 2.7917 | 99500 | 0.0000 |
| 2.8058 | 100000 | 0.0000 |
| 2.8198 | 100500 | 0.0000 |
| 2.8338 | 101000 | 0.0000 |
| 2.8478 | 101500 | 0.0000 |
| 2.8619 | 102000 | 0.0000 |
| 2.8759 | 102500 | 0.0000 |
| 2.8899 | 103000 | 0.0000 |
| 2.9040 | 103500 | 0.0000 |
| 2.9180 | 104000 | 0.0000 |
| 2.9320 | 104500 | 0.0000 |
| 2.9460 | 105000 | 0.0000 |
| 2.9601 | 105500 | 0.0000 |
| 2.9741 | 106000 | 0.0000 |
| 2.9881 | 106500 | 0.0000 |
| 3.0 | 106923 | - |
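
The Epoch column in the training log above appears to be the step number divided by the number of steps in one epoch (35641, the step logged at epoch 1.0). The short sketch below reproduces a few logged values and derives a rough upper bound on the number of training pairs per epoch. The bound assumes a single training device, which the card does not state; the helper name `epoch_at` is made up for illustration.

```python
STEPS_PER_EPOCH = 35641  # step count logged at epoch 1.0 in the table
BATCH_SIZE = 1024        # per_device_train_batch_size from the hyperparameters


def epoch_at(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch for a given optimizer step, rounded like the log."""
    return round(step / steps_per_epoch, 4)


print(epoch_at(500))     # 0.014  (logged as 0.0140)
print(epoch_at(36000))   # 1.0101
print(epoch_at(106923))  # 3.0

# Upper bound on pairs seen per epoch, ASSUMING one device and
# gradient_accumulation_steps = 1; the last batch may be smaller.
max_pairs_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE
print(max_pairs_per_epoch)  # 36496384
```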
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model
google/bert_uncased_L-2_H-128_A-2