Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
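The Pooling module above uses mean pooling (pooling_mode_mean_tokens: True): the sentence embedding is the average of the Transformer's token embeddings, with padding positions excluded via the attention mask. A minimal sketch of that step in NumPy (the embeddings and mask below are toy values, not real model outputs):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, dim) output of the Transformer module
    attention_mask:   (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)  # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)                 # sum over real tokens only
    count = np.clip(mask.sum(), a_min=1e-9, a_max=None)            # avoid divide-by-zero
    return summed / count

# Toy example: 3 tokens of dimension 2, last token is padding
emb = np.array([[1.0, 2.0], [3.0, 4.0], [99.0, 99.0]])
mask = np.array([1, 1, 0])
print(mean_pool(emb, mask))  # [2. 3.] — the padding token is excluded
```

In the real model, dim is 384 and the pooled vector is what `model.encode` returns for each sentence.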
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("along26/all-MiniLM-L6-v2_multilingual_malaysian-v9")
# Run inference
sentences = [
'According to Bintulu police chief ACP Zailanni Amit, the late Bermau Bagu, 70, was in the garden before being shot by the suspect, his brother-in-law, who was also hunting in the garden.',
'Nitih ku Ketuai Polis Pelilih Bintulu, ACP Zailanni Amit, rambau penusah nya nyadi, niang ti benama Bermau Bagu, 70 taun benung ba kebun nya sebedau kena timbak suspek, ipar niang empu, ke bela ngasu dalam kandang kebun nya.',
'What is finance generally?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.1941, 0.9593],
# [0.1941, 1.0000, 0.1978],
# [0.9593, 0.1978, 1.0000]])
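`model.similarity` computes cosine similarity by default, which is why the diagonal of the matrix above is exactly 1.0. For already-computed embeddings the same matrix can be reproduced by hand; a sketch on toy vectors (not the actual 384-dimensional embeddings):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of `embeddings`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)  # (n, 1) row norms
    normalized = embeddings / norms                            # unit-length rows
    return normalized @ normalized.T                           # dot products of unit vectors

# Toy 3 x 4 "embeddings"; each row plays the role of one encoded sentence
vecs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [1.0, 1.0, 0.0, 0.0]])
print(cosine_similarity_matrix(vecs).round(4))
```

Each entry is the dot product of two unit-normalized rows, so self-similarity is always 1.0 and the matrix is symmetric, matching the structure of the tensor printed above.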
The training dataset has three columns: sentence_0, sentence_1, and sentence_2.

| | sentence_0 | sentence_1 | sentence_2 |
|---|---|---|---|
| type | string | string | string |
| details | | | |
| sentence_0 | sentence_1 | sentence_2 |
|---|---|---|
| Why have some analysts suggested that the outcome of Najib Razak's corruption trial could have significant implications for Malaysia's political landscape and the future of its democracy? | Mengapa sesetengah penganalisis mencadangkan bahawa keputusan perbicaraan rasuah Najib Razak boleh memberi implikasi yang besar kepada landskap politik Malaysia dan masa depan demokrasinya? | A warming climate can significantly affect the emergence times of certain insect species, as many insects are ectothermic and rely on external environmental conditions to regulate their body temperature. This means that their development, reproduction, and behavior are closely linked to temperature and other climatic factors. As global temperatures rise, these changes can lead to shifts in the timing of insect emergence, which can have cascading effects on ecosystem interactions and services. |
| Corruption can have a significant impact on economic development and social inequality in Malaysia. | Rasuah boleh memberi kesan yang ketara kepada pembangunan ekonomi dan ketidaksamaan sosial di Malaysia. | "What are the specific mechanisms through which immunoglobulins act to neutralize antigens and prevent infections?" |
| He, who is also Minister of Public Health, Housing and Local Government Councils, said there were 302,243 Sarawakians at risk or recipients who should be more than 60 years old and had the second dose of COVID-19 on April | Iya ti mega Menteri Pengerai Mensia Mayuh, Pengawa Berumah enggau Kaunsil Kandang Menua madahka, bisi 302,243 rayat Sarawak ti bisi risiko tauka penerima ke patut beumur lebih 60 taun merima tuchuk kedua dos penyungkak kedua COVID-19 berengkah kena 12 April tu tadi. |
TripletLoss with these parameters:
{
"distance_metric": "TripletDistanceMetric.EUCLIDEAN",
"triplet_margin": 5
}
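With Euclidean distance and a margin of 5, the loss for one triplet is L = max(0, d(anchor, positive) − d(anchor, negative) + margin): the anchor (an English sentence) must sit at least 5 units closer to its translation than to the unrelated negative before the loss reaches zero. A small sketch with toy vectors (not real embeddings):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=5.0):
    """Euclidean-distance triplet loss, matching the configuration above."""
    d_pos = np.linalg.norm(anchor - positive)  # distance to the matching translation
    d_neg = np.linalg.norm(anchor - negative)  # distance to the unrelated sentence
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([1.0, 0.0])   # close to the anchor
n = np.array([10.0, 0.0])  # far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied (1 + 5 <= 10)
```

During training the loss is nonzero whenever the negative is not yet margin-far beyond the positive, which is what pushes translations together and unrelated sentences apart in the embedding space.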
Non-default hyperparameters:

- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 4
- fp16: True
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters: 
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0380 | 500 | 4.7509 |
| 0.0761 | 1000 | 2.033 |
| 0.1141 | 1500 | 1.5296 |
| 0.1522 | 2000 | 1.3832 |
| 0.1902 | 2500 | 1.3084 |
| 0.2283 | 3000 | 1.3416 |
| 0.2663 | 3500 | 1.3118 |
| 0.3043 | 4000 | 1.3306 |
| 0.3424 | 4500 | 1.2771 |
| 0.3804 | 5000 | 1.2593 |
| 0.4185 | 5500 | 1.2278 |
| 0.4565 | 6000 | 1.207 |
| 0.4946 | 6500 | 1.1735 |
| 0.5326 | 7000 | 1.1842 |
| 0.5706 | 7500 | 1.1501 |
| 0.6087 | 8000 | 1.1562 |
| 0.6467 | 8500 | 1.1422 |
| 0.6848 | 9000 | 1.1229 |
| 0.7228 | 9500 | 1.0865 |
| 0.7609 | 10000 | 1.1094 |
| 0.7989 | 10500 | 1.0848 |
| 0.8369 | 11000 | 1.0957 |
| 0.8750 | 11500 | 1.0564 |
| 0.9130 | 12000 | 1.0688 |
| 0.9511 | 12500 | 0.9947 |
| 0.9891 | 13000 | 1.048 |
| 1.0272 | 13500 | 1.0183 |
| 1.0652 | 14000 | 1.0139 |
| 1.1032 | 14500 | 1.0291 |
| 1.1413 | 15000 | 1.001 |
| 1.1793 | 15500 | 0.9803 |
| 1.2174 | 16000 | 0.9874 |
| 1.2554 | 16500 | 0.9895 |
| 1.2935 | 17000 | 0.9721 |
| 1.3315 | 17500 | 0.9689 |
| 1.3696 | 18000 | 0.9622 |
| 1.4076 | 18500 | 0.9234 |
| 1.4456 | 19000 | 0.9039 |
| 1.4837 | 19500 | 0.9223 |
| 1.5217 | 20000 | 0.9091 |
| 1.5598 | 20500 | 0.9377 |
| 1.5978 | 21000 | 0.9174 |
| 1.6359 | 21500 | 0.9039 |
| 1.6739 | 22000 | 0.9009 |
| 1.7119 | 22500 | 0.8912 |
| 1.7500 | 23000 | 0.9378 |
| 1.7880 | 23500 | 0.9056 |
| 1.8261 | 24000 | 0.8748 |
| 1.8641 | 24500 | 0.8869 |
| 1.9022 | 25000 | 0.8972 |
| 1.9402 | 25500 | 0.8856 |
| 1.9782 | 26000 | 0.87 |
| 2.0163 | 26500 | 0.869 |
| 2.0543 | 27000 | 0.8255 |
| 2.0924 | 27500 | 0.8421 |
| 2.1304 | 28000 | 0.8196 |
| 2.1685 | 28500 | 0.8292 |
| 2.2065 | 29000 | 0.8374 |
| 2.2445 | 29500 | 0.8101 |
| 2.2826 | 30000 | 0.8329 |
| 2.3206 | 30500 | 0.8073 |
| 2.3587 | 31000 | 0.8015 |
| 2.3967 | 31500 | 0.8221 |
| 2.4348 | 32000 | 0.7914 |
| 2.4728 | 32500 | 0.7768 |
| 2.5108 | 33000 | 0.8036 |
| 2.5489 | 33500 | 0.7825 |
| 2.5869 | 34000 | 0.7981 |
| 2.6250 | 34500 | 0.779 |
| 2.6630 | 35000 | 0.7965 |
| 2.7011 | 35500 | 0.783 |
| 2.7391 | 36000 | 0.7748 |
| 2.7771 | 36500 | 0.7962 |
| 2.8152 | 37000 | 0.7782 |
| 2.8532 | 37500 | 0.7611 |
| 2.8913 | 38000 | 0.7877 |
| 2.9293 | 38500 | 0.757 |
| 2.9674 | 39000 | 0.7789 |
| 3.0054 | 39500 | 0.7745 |
| 3.0434 | 40000 | 0.7471 |
| 3.0815 | 40500 | 0.7299 |
| 3.1195 | 41000 | 0.7119 |
| 3.1576 | 41500 | 0.7199 |
| 3.1956 | 42000 | 0.7318 |
| 3.2337 | 42500 | 0.7446 |
| 3.2717 | 43000 | 0.7316 |
| 3.3097 | 43500 | 0.7534 |
| 3.3478 | 44000 | 0.704 |
| 3.3858 | 44500 | 0.7005 |
| 3.4239 | 45000 | 0.713 |
| 3.4619 | 45500 | 0.7492 |
| 3.5000 | 46000 | 0.7337 |
| 3.5380 | 46500 | 0.7025 |
| 3.5760 | 47000 | 0.753 |
| 3.6141 | 47500 | 0.7378 |
| 3.6521 | 48000 | 0.7242 |
| 3.6902 | 48500 | 0.7123 |
| 3.7282 | 49000 | 0.7277 |
| 3.7663 | 49500 | 0.7272 |
| 3.8043 | 50000 | 0.7094 |
| 3.8423 | 50500 | 0.7074 |
| 3.8804 | 51000 | 0.7162 |
| 3.9184 | 51500 | 0.6984 |
| 3.9565 | 52000 | 0.693 |
| 3.9945 | 52500 | 0.7026 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}