Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: arXiv:1908.10084
This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
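The `Pooling` module above is configured for mean pooling (`pooling_mode_mean_tokens: True`): token embeddings from the transformer are averaged over non-padding positions to produce one 384-dimensional sentence vector. A minimal NumPy sketch of that operation on toy tensors (not tied to this checkpoint's internals):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over the sequence axis, ignoring padding positions."""
    mask = attention_mask[:, :, None].astype(float)   # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)    # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)    # (batch, 1), avoid divide-by-zero
    return summed / counts

# Toy tensors: batch of 2, sequence length 3, embedding dim 4.
emb = np.ones((2, 3, 4))
mask = np.array([[1, 1, 1], [1, 1, 0]])  # second sequence has one pad token
pooled = mean_pool(emb, mask)
print(pooled.shape)  # (2, 4)
```

Padding tokens are masked out before the average, so sequences of different lengths produce comparable embeddings.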
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("along26/all-MiniLM-L6-v2_multilingual_malaysian")
# Run inference
sentences = [
"What is the intensity of light transmitted through two polarizers with their axes at an angle of 45 degrees to each other, if the intensity of the incident light is 12 W/m² and the polarizer absorbs 50% of the light perpendicular to its axis? Use Malus' Law to solve the problem.",
'Apakah keamatan cahaya yang dihantar melalui dua polarizer dengan paksinya pada sudut 45 darjah antara satu sama lain, jika keamatan cahaya kejadian ialah 12 W/m² dan polarizer menyerap 50% cahaya berserenjang dengan paksinya? Gunakan Hukum Malus untuk menyelesaikan masalah.',
'What role did the opposition parties and civil society organizations play in exposing the 1MDB scandal and holding Najib Razak accountable?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.7535, 0.9687],
# [-0.7535, 1.0000, -0.7616],
# [ 0.9687, -0.7616, 1.0000]])
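`model.similarity` defaults to cosine similarity for this model family. For semantic search the same idea is applied between one query embedding and a corpus of embeddings, then the corpus is ranked by score. A self-contained sketch with small hypothetical 4-d vectors standing in for real 384-d outputs of `model.encode(...)`:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

# Hypothetical embeddings; real ones come from model.encode(...).
query = np.array([[1.0, 0.0, 1.0, 0.0]])
corpus = np.array([
    [1.0, 0.1, 0.9, 0.0],   # nearly parallel to the query
    [0.0, 1.0, 0.0, 1.0],   # orthogonal to the query
])
scores = cosine_sim(query, corpus)[0]
ranking = np.argsort(-scores)  # best match first
print(ranking)  # [0 1]
```

Because cosine similarity ignores vector length, it ranks by direction only, which is the usual choice for sentence embeddings.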
Training dataset columns (sentence_0, sentence_1, and sentence_2):

| | sentence_0 | sentence_1 | sentence_2 |
|---|---|---|---|
| type | string | string | string |
| sentence_0 | sentence_1 | sentence_2 |
|---|---|---|
| What are the four main functions of the human liver, and how is its unique anatomic structure suited to perform these functions? | Apakah empat fungsi utama hati manusia, dan bagaimanakah struktur anatomi uniknya sesuai untuk melaksanakan fungsi ini? | Why is the Malaysian government not doing enough to address the rising cost of living and income inequality? |
| Changing the temperature affects the equilibrium constant (Kc) and the formation of the Fe(SCN)2+ complex ion from Fe3+ and SCN- ions according to Le Chatelier's principle. Le Chatelier's principle states that if a system at equilibrium is subjected to a change in temperature, pressure, or concentration of reactants or products, the system will adjust its position to counteract the change and re-establish equilibrium. | Menukar suhu memberi kesan kepada pemalar keseimbangan (Kc) dan pembentukan ion kompleks Fe(SCN)2+ daripada ion Fe3+ dan SCN- mengikut prinsip Le Chatelier. Prinsip Le Chatelier menyatakan bahawa jika sistem pada keseimbangan tertakluk kepada perubahan suhu, tekanan, atau kepekatan bahan tindak balas atau produk, sistem akan menyesuaikan kedudukannya untuk mengatasi perubahan dan mewujudkan semula keseimbangan. | Why does Malaysia have one of the highest income disparities in the world, with a significant portion of the population living in poverty despite being a middle-income country? |
| The use of laws like the Official Secrets Act (OSA) and Sedition Act in Malaysia has been criticized for stifling free speech and discouraging whistleblowing in the country's anti-corruption efforts. | Penggunaan undang-undang seperti Akta Rahsia Rasmi (OSA) dan Akta Hasutan di Malaysia telah dikritik kerana menyekat kebebasan bersuara dan menghalang pemberi maklumat dalam usaha anti-rasuah negara. | How do the surface properties of metal catalysts influence the selectivity and activity of the oxidation reaction of hydrocarbons? |
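Each training example in the table pairs an English anchor (sentence_0) with its Malay translation (sentence_1, the positive) and an unrelated English sentence (sentence_2, the negative). A small sketch of that data contract with abridged, hypothetical rows — TripletLoss consumes the three columns positionally as (anchor, positive, negative):

```python
# Hypothetical triplet rows mirroring the examples above (sentences abridged).
triplets = [
    {
        "sentence_0": "What are the four main functions of the human liver?",
        "sentence_1": "Apakah empat fungsi utama hati manusia?",
        "sentence_2": "Why is the cost of living in Malaysia rising?",
    },
]

# Columns are read in order: anchor, positive, negative.
for row in triplets:
    anchor, positive, negative = row["sentence_0"], row["sentence_1"], row["sentence_2"]
    assert all(isinstance(s, str) for s in (anchor, positive, negative))
print(list(triplets[0]))  # ['sentence_0', 'sentence_1', 'sentence_2']
```

Training then pulls the English anchor and its Malay translation together in embedding space while pushing the unrelated sentence away.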
TripletLoss with these parameters:

{
"distance_metric": "TripletDistanceMetric.EUCLIDEAN",
"triplet_margin": 5
}
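With Euclidean distance and a margin of 5, the loss for one triplet is max(0, d(anchor, positive) − d(anchor, negative) + margin): it is zero once the negative is at least 5 units farther from the anchor than the positive. An illustrative NumPy sketch of the per-triplet computation (not the library's internals):

```python
import numpy as np

def triplet_loss(anchor: np.ndarray, positive: np.ndarray,
                 negative: np.ndarray, margin: float = 5.0) -> float:
    """max(0, d(a, p) - d(a, n) + margin) with Euclidean distance."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([1.0, 0.0])   # distance 1 from the anchor
n = np.array([3.0, 0.0])   # distance 3 from the anchor
print(triplet_loss(a, p, n))  # 1 - 3 + 5 = 3.0
```

Pushing the negative out to distance 6 or beyond drives this triplet's loss to zero, which is why the training loss below collapses toward 0 once the margin is satisfied for most triplets.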
Non-default hyperparameters:

- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 10
- fp16: True
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters: 
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0761 | 500 | 4.9612 |
| 0.1522 | 1000 | 0.4336 |
| 0.2282 | 1500 | 0.0468 |
| 0.3043 | 2000 | 0.0045 |
| 0.3804 | 2500 | 0.0033 |
| 0.4565 | 3000 | 0.001 |
| 0.5326 | 3500 | 0.0012 |
| 0.6086 | 4000 | 0.0006 |
| 0.6847 | 4500 | 0.0002 |
| 0.7608 | 5000 | 0.0009 |
| 0.8369 | 5500 | 0.0009 |
| 0.9130 | 6000 | 0.0004 |
| 0.9890 | 6500 | 0.0007 |
| 1.0651 | 7000 | 0.0003 |
| 1.1412 | 7500 | 0.0011 |
| 1.2173 | 8000 | 0.0008 |
| 1.2934 | 8500 | 0.0 |
| 1.3694 | 9000 | 0.0003 |
| 1.4455 | 9500 | 0.0004 |
| 1.5216 | 10000 | 0.0 |
| 1.5977 | 10500 | 0.0003 |
| 1.6738 | 11000 | 0.0 |
| 1.7498 | 11500 | 0.0002 |
| 1.8259 | 12000 | 0.0003 |
| 1.9020 | 12500 | 0.0004 |
| 1.9781 | 13000 | 0.0005 |
| 2.0542 | 13500 | 0.0001 |
| 2.1302 | 14000 | 0.0 |
| 2.2063 | 14500 | 0.0 |
| 2.2824 | 15000 | 0.0 |
| 2.3585 | 15500 | 0.0004 |
| 2.4346 | 16000 | 0.0002 |
| 2.5107 | 16500 | 0.0003 |
| 2.5867 | 17000 | 0.001 |
| 2.6628 | 17500 | 0.0 |
| 2.7389 | 18000 | 0.0005 |
| 2.8150 | 18500 | 0.0003 |
| 2.8911 | 19000 | 0.0 |
| 2.9671 | 19500 | 0.0001 |
| 3.0432 | 20000 | 0.0 |
| 3.1193 | 20500 | 0.0 |
| 3.1954 | 21000 | 0.0003 |
| 3.2715 | 21500 | 0.0 |
| 3.3475 | 22000 | 0.0003 |
| 3.4236 | 22500 | 0.0 |
| 3.4997 | 23000 | 0.0 |
| 3.5758 | 23500 | 0.0 |
| 3.6519 | 24000 | 0.0 |
| 3.7279 | 24500 | 0.0 |
| 3.8040 | 25000 | 0.0003 |
| 3.8801 | 25500 | 0.0003 |
| 3.9562 | 26000 | 0.0 |
| 4.0323 | 26500 | 0.0 |
| 4.1083 | 27000 | 0.0 |
| 4.1844 | 27500 | 0.0 |
| 4.2605 | 28000 | 0.0002 |
| 4.3366 | 28500 | 0.0 |
| 4.4127 | 29000 | 0.0 |
| 4.4887 | 29500 | 0.0003 |
| 4.5648 | 30000 | 0.0 |
| 4.6409 | 30500 | 0.0003 |
| 4.7170 | 31000 | 0.0 |
| 4.7931 | 31500 | 0.0 |
| 4.8691 | 32000 | 0.0005 |
| 4.9452 | 32500 | 0.0 |
| 5.0213 | 33000 | 0.0 |
| 5.0974 | 33500 | 0.0 |
| 5.1735 | 34000 | 0.0 |
| 5.2495 | 34500 | 0.0003 |
| 5.3256 | 35000 | 0.0 |
| 5.4017 | 35500 | 0.0 |
| 5.4778 | 36000 | 0.0 |
| 5.5539 | 36500 | 0.0 |
| 5.6299 | 37000 | 0.0 |
| 5.7060 | 37500 | 0.0 |
| 5.7821 | 38000 | 0.0001 |
| 5.8582 | 38500 | 0.0009 |
| 5.9343 | 39000 | 0.0 |
| 6.0103 | 39500 | 0.0 |
| 6.0864 | 40000 | 0.0 |
| 6.1625 | 40500 | 0.0 |
| 6.2386 | 41000 | 0.0004 |
| 6.3147 | 41500 | 0.0 |
| 6.3907 | 42000 | 0.0 |
| 6.4668 | 42500 | 0.0 |
| 6.5429 | 43000 | 0.0 |
| 6.6190 | 43500 | 0.0003 |
| 6.6951 | 44000 | 0.0 |
| 6.7712 | 44500 | 0.0 |
| 6.8472 | 45000 | 0.0003 |
| 6.9233 | 45500 | 0.0 |
| 6.9994 | 46000 | 0.0 |
| 7.0755 | 46500 | 0.0 |
| 7.1516 | 47000 | 0.0 |
| 7.2276 | 47500 | 0.0 |
| 7.3037 | 48000 | 0.0003 |
| 7.3798 | 48500 | 0.0 |
| 7.4559 | 49000 | 0.0003 |
| 7.5320 | 49500 | 0.0 |
| 7.6080 | 50000 | 0.0003 |
| 7.6841 | 50500 | 0.0 |
| 7.7602 | 51000 | 0.0 |
| 7.8363 | 51500 | 0.0 |
| 7.9124 | 52000 | 0.0 |
| 7.9884 | 52500 | 0.0 |
| 8.0645 | 53000 | 0.0 |
| 8.1406 | 53500 | 0.0003 |
| 8.2167 | 54000 | 0.0 |
| 8.2928 | 54500 | 0.0 |
| 8.3688 | 55000 | 0.0 |
| 8.4449 | 55500 | 0.0 |
| 8.5210 | 56000 | 0.0 |
| 8.5971 | 56500 | 0.0 |
| 8.6732 | 57000 | 0.0 |
| 8.7492 | 57500 | 0.0003 |
| 8.8253 | 58000 | 0.0003 |
| 8.9014 | 58500 | 0.0 |
| 8.9775 | 59000 | 0.0 |
| 9.0536 | 59500 | 0.0 |
| 9.1296 | 60000 | 0.0 |
| 9.2057 | 60500 | 0.0 |
| 9.2818 | 61000 | 0.0 |
| 9.3579 | 61500 | 0.0 |
| 9.4340 | 62000 | 0.0 |
| 9.5100 | 62500 | 0.0 |
| 9.5861 | 63000 | 0.0006 |
| 9.6622 | 63500 | 0.0 |
| 9.7383 | 64000 | 0.0 |
| 9.8144 | 64500 | 0.0 |
| 9.8904 | 65000 | 0.0 |
| 9.9665 | 65500 | 0.0006 |
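The non-default hyperparameters above map directly onto `SentenceTransformerTrainingArguments`. A minimal training sketch under the assumption that a triplet dataset with `sentence_0`/`sentence_1`/`sentence_2` columns is already loaded (the `output` path and `train_dataset` variable are placeholders, not from this card):

```python
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import TripletLoss, TripletDistanceMetric

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Loss configuration as listed above: Euclidean distance, margin 5.
loss = TripletLoss(
    model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)

# Non-default hyperparameters from the card.
args = SentenceTransformerTrainingArguments(
    output_dir="output",                      # placeholder path
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=10,
    fp16=True,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,              # assumed: triplet-column Dataset
    loss=loss,
)
trainer.train()
```

This is a configuration sketch, not a reproduction script: it omits data loading and any evaluation setup.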
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Base model: sentence-transformers/all-MiniLM-L6-v2