How to use codersan/newfa_e5base2 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("codersan/newfa_e5base2")
sentences = [
"نمونه هایی از تئوری های توطئه ها که به نظر می رسد درست است؟",
"آیا نظریه های توطئه ای وجود دارد که احتمالاً صادق است؟نظریه های توطئه ای که معلوم شد درست است؟",
"بازیگران پانتومیم در حال اجرا بر روی صحنه هستند.",
"چرا میل الکترون فلورین کمتر از کلر است ، در حالی که فلورین الکترونگاتیو ترین عنصر است؟"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from intfloat/multilingual-e5-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
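
The architecture above applies mean pooling over token embeddings (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. The following is a minimal pure-Python sketch of those two steps, not the library's implementation. The helper names `mean_pool` and `l2_normalize` and the 4-dimensional toy vectors are assumptions made for illustration; the real model works on 768-dimensional tensors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in summed]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


# Toy input: three token vectors of dimension 4; the last token is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.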
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("codersan/newfa_e5base2")
# Run inference
sentences = [
'مرزهای صفحه چیست؟برخی از انواع چیست؟',
'مرزهای صفحه چیست؟',
'اتانول چند ایزومر دارد؟',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
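
For semantic search, the pairwise scores are typically reduced to a ranked list of the closest corpus entries for each query. The sketch below shows that ranking step in pure Python. The toy 3-dimensional vectors stand in for the output of `model.encode`, and the helper names `normalize`, `dot`, and `top_k` are hypothetical. It assumes unit-length embeddings, which matches the `Normalize()` layer of this model, so a dot product is used as the cosine similarity.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, corpus_vecs, k=2):
    """Return (corpus_index, score) pairs for the k most similar vectors."""
    scored = [(dot(query_vec, vec), idx) for idx, vec in enumerate(corpus_vecs)]
    scored.sort(reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy stand-ins for sentence embeddings (assumption: not real model output).
corpus = [normalize(v) for v in ([1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0])]
query = normalize([1.0, 0.1, 0.0])

results = top_k(query, corpus, k=2)
for idx, score in results:
    print(idx, round(score, 4))
```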
Training dataset columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |

| anchor | positive |
|---|---|
| گاو یونجه می خورد | گاو در حال چریدن است |
| ماشینی به شکلی خطرناک از روی دختری میپرد. | دختر با بیاحتیاطی روی ماشین میپرد. |
| چگونه می توانم کارتهای هدیه iTunes رایگان را در هند دریافت کنم؟ | چگونه می توانم کارتهای هدیه iTunes رایگان دریافت کنم؟ |

MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
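
To make the two parameters concrete: with in-batch negatives, each anchor is scored against every positive in the batch using cosine similarity, the scores are multiplied by `scale`, and a cross-entropy loss treats the anchor's own positive as the correct class. The following is a minimal pure-Python sketch of that idea under those assumptions, not the library's code. The function names `cos_sim` and `mnr_loss` and the 2-dimensional toy vectors are illustrative only.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and all other positives in the batch act as negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs (assumption: vectors are not real embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive matches its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive matches the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The loss is close to zero when every anchor is most similar to its own positive and grows large when another item in the batch scores higher. This is also why the `no_duplicates` batch sampler matters: a duplicate positive in the same batch would be wrongly treated as a negative.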
Non-default hyperparameters:

- per_device_train_batch_size: 32
- learning_rate: 2e-05
- weight_decay: 0.01
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0224 | 100 | 0.0821 |
| 0.0448 | 200 | 0.0455 |
| 0.0671 | 300 | 0.0408 |
| 0.0895 | 400 | 0.0461 |
| 0.1119 | 500 | 0.0418 |
| 0.1343 | 600 | 0.0449 |
| 0.1567 | 700 | 0.0314 |
| 0.1791 | 800 | 0.0252 |
| 0.2014 | 900 | 0.0254 |
| 0.2238 | 1000 | 0.0341 |
| 0.2462 | 1100 | 0.0239 |
| 0.2686 | 1200 | 0.0308 |
| 0.2910 | 1300 | 0.0415 |
| 0.3133 | 1400 | 0.0386 |
| 0.3357 | 1500 | 0.027 |
| 0.3581 | 1600 | 0.0369 |
| 0.3805 | 1700 | 0.0346 |
| 0.4029 | 1800 | 0.0301 |
| 0.4252 | 1900 | 0.03 |
| 0.4476 | 2000 | 0.0179 |
| 0.4700 | 2100 | 0.035 |
| 0.4924 | 2200 | 0.0327 |
| 0.5148 | 2300 | 0.033 |
| 0.5372 | 2400 | 0.0272 |
| 0.5595 | 2500 | 0.0318 |
| 0.5819 | 2600 | 0.025 |
| 0.6043 | 2700 | 0.023 |
| 0.6267 | 2800 | 0.0294 |
| 0.6491 | 2900 | 0.0337 |
| 0.6714 | 3000 | 0.0274 |
| 0.6938 | 3100 | 0.0223 |
| 0.7162 | 3200 | 0.0384 |
| 0.7386 | 3300 | 0.0217 |
| 0.7610 | 3400 | 0.032 |
| 0.7833 | 3500 | 0.0309 |
| 0.8057 | 3600 | 0.024 |
| 0.8281 | 3700 | 0.0273 |
| 0.8505 | 3800 | 0.0245 |
| 0.8729 | 3900 | 0.0268 |
| 0.8953 | 4000 | 0.0322 |
| 0.9176 | 4100 | 0.0271 |
| 0.9400 | 4200 | 0.0316 |
| 0.9624 | 4300 | 0.0179 |
| 0.9848 | 4400 | 0.0294 |
| 1.0072 | 4500 | 0.0283 |
| 1.0295 | 4600 | 0.0171 |
| 1.0519 | 4700 | 0.017 |
| 1.0743 | 4800 | 0.0197 |
| 1.0967 | 4900 | 0.0215 |
| 1.1191 | 5000 | 0.02 |
| 1.1415 | 5100 | 0.0144 |
| 1.1638 | 5200 | 0.015 |
| 1.1862 | 5300 | 0.0084 |
| 1.2086 | 5400 | 0.0115 |
| 1.2310 | 5500 | 0.0143 |
| 1.2534 | 5600 | 0.0129 |
| 1.2757 | 5700 | 0.0165 |
| 1.2981 | 5800 | 0.0168 |
| 1.3205 | 5900 | 0.0233 |
| 1.3429 | 6000 | 0.0156 |
| 1.3653 | 6100 | 0.0207 |
| 1.3876 | 6200 | 0.0149 |
| 1.4100 | 6300 | 0.0134 |
| 1.4324 | 6400 | 0.0108 |
| 1.4548 | 6500 | 0.0118 |
| 1.4772 | 6600 | 0.0173 |
| 1.4996 | 6700 | 0.0171 |
| 1.5219 | 6800 | 0.0168 |
| 1.5443 | 6900 | 0.0144 |
| 1.5667 | 7000 | 0.0111 |
| 1.5891 | 7100 | 0.0117 |
| 1.6115 | 7200 | 0.0122 |
| 1.6338 | 7300 | 0.0143 |
| 1.6562 | 7400 | 0.0151 |
| 1.6786 | 7500 | 0.0152 |
| 1.7010 | 7600 | 0.012 |
| 1.7234 | 7700 | 0.0177 |
| 1.7457 | 7800 | 0.0172 |
| 1.7681 | 7900 | 0.016 |
| 1.7905 | 8000 | 0.0141 |
| 1.8129 | 8100 | 0.0112 |
| 1.8353 | 8200 | 0.011 |
| 1.8577 | 8300 | 0.0132 |
| 1.8800 | 8400 | 0.0127 |
| 1.9024 | 8500 | 0.0188 |
| 1.9248 | 8600 | 0.0196 |
| 1.9472 | 8700 | 0.0106 |
| 1.9696 | 8800 | 0.0108 |
| 1.9919 | 8900 | 0.0172 |
| 2.0143 | 9000 | 0.0116 |
| 2.0367 | 9100 | 0.0089 |
| 2.0591 | 9200 | 0.0096 |
| 2.0815 | 9300 | 0.0142 |
| 2.1038 | 9400 | 0.0112 |
| 2.1262 | 9500 | 0.0103 |
| 2.1486 | 9600 | 0.0077 |
| 2.1710 | 9700 | 0.0082 |
| 2.1934 | 9800 | 0.0066 |
| 2.2158 | 9900 | 0.0106 |
| 2.2381 | 10000 | 0.0072 |
| 2.2605 | 10100 | 0.0085 |
| 2.2829 | 10200 | 0.0085 |
| 2.3053 | 10300 | 0.015 |
| 2.3277 | 10400 | 0.0113 |
| 2.3500 | 10500 | 0.0118 |
| 2.3724 | 10600 | 0.0123 |
| 2.3948 | 10700 | 0.0071 |
| 2.4172 | 10800 | 0.0087 |
| 2.4396 | 10900 | 0.0056 |
| 2.4620 | 11000 | 0.0091 |
| 2.4843 | 11100 | 0.0116 |
| 2.5067 | 11200 | 0.0123 |
| 2.5291 | 11300 | 0.0108 |
| 2.5515 | 11400 | 0.0078 |
| 2.5739 | 11500 | 0.0072 |
| 2.5962 | 11600 | 0.0084 |
| 2.6186 | 11700 | 0.0066 |
| 2.6410 | 11800 | 0.0115 |
| 2.6634 | 11900 | 0.0088 |
| 2.6858 | 12000 | 0.008 |
| 2.7081 | 12100 | 0.0095 |
| 2.7305 | 12200 | 0.0108 |
| 2.7529 | 12300 | 0.0113 |
| 2.7753 | 12400 | 0.0086 |
| 2.7977 | 12500 | 0.0096 |
| 2.8201 | 12600 | 0.0093 |
| 2.8424 | 12700 | 0.0076 |
| 2.8648 | 12800 | 0.006 |
| 2.8872 | 12900 | 0.0124 |
| 2.9096 | 13000 | 0.0131 |
| 2.9320 | 13100 | 0.0103 |
| 2.9543 | 13200 | 0.0063 |
| 2.9767 | 13300 | 0.0067 |
| 2.9991 | 13400 | 0.0117 |
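
The training log can be related to the scheduler settings listed earlier (`lr_scheduler_type: linear`, `warmup_steps: 0`, `learning_rate: 2e-05`, `num_train_epochs: 3`). The sketch below estimates the number of steps per epoch from the last logged row and evaluates a linear-decay learning rate at a given step. The step counts are estimates inferred from the table, not reported figures, and `linear_lr` is a hypothetical helper that follows the usual linear warmup and decay formula rather than calling the training library.

```python
# Values copied from the hyperparameter list and the last row of the log table.
BASE_LR = 2e-05
NUM_EPOCHS = 3
LAST_STEP, LAST_EPOCH = 13400, 2.9991

# Estimate (assumption): steps per epoch inferred from the logged epoch fraction.
steps_per_epoch = round(LAST_STEP / LAST_EPOCH)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Linear warmup to base_lr, then linear decay to zero at `total` steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(steps_per_epoch, total_steps)
for step in (0, 4500, 9000, 13400):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls steadily across the run, which is consistent with the training loss flattening out in the third epoch.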
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: intfloat/multilingual-e5-base