Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: arXiv:1908.10084
This is a sentence-transformers model finetuned from google-bert/bert-base-cased on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
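The Pooling module above is configured for mean pooling (`pooling_mode_mean_tokens: True`): the sentence embedding is the average of the token embeddings, counting only non-padding tokens. A minimal numpy sketch with toy numbers (not the model's actual embeddings):

```python
import numpy as np

# Toy token embeddings: batch of 1 sequence, 4 token positions, dim 3.
token_embeddings = np.array([[[1.0, 2.0, 3.0],
                              [3.0, 2.0, 1.0],
                              [5.0, 5.0, 5.0],    # padding position
                              [7.0, 7.0, 7.0]]])  # padding position
attention_mask = np.array([[1, 1, 0, 0]])  # 1 = real token, 0 = padding

# Mean pooling: sum masked embeddings, divide by the number of real tokens.
mask = attention_mask[..., None].astype(float)    # (batch, seq, 1)
summed = (token_embeddings * mask).sum(axis=1)    # (batch, dim)
counts = np.clip(mask.sum(axis=1), 1e-9, None)    # avoid divide-by-zero
sentence_embedding = summed / counts

print(sentence_embedding)  # [[2. 2. 2.]]
```

Padding positions are excluded by the attention mask, so varying sequence lengths in a batch do not skew the average.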
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Jimmy-Ooi/TTM_1000_12_10_0.0001_AdamW")
# Run inference
sentences = [
    'Nc1ccc(C(=O)N2CCN(Cc3ccc(F)cc3)CC2)c(N)c1',
    'CCCCC(=O)NC(=S)Nc1ccc([N+](=O)[O-])cc1',
    'Cc1ccc(C(=O)N2CCN(Cc3ccc(F)cc3)CC2)cc1',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.7396, -0.6053],
#         [ 0.7396,  1.0000,  0.0397],
#         [-0.6053,  0.0397,  1.0000]])
```
The dataset columns are premise, hypothesis, and label:

| | premise | hypothesis | label |
|---|---|---|---|
| type | string | string | int |

Loss: SoftmaxLoss
Example samples:

| premise | hypothesis | label |
|---|---|---|
| NC(=S)N/N=C/c1ccccc1Cl | CN(C)c1ccc(C2CC(c3cc4ccccc4o3)=NN2C(=O)Nc2ccc(Cl)cc2)c2ccccc12 | 2 |
| CC(C)C@HC(=O)NCc1cc(=O)c(O)c[nH]1 | COc1cc([C@H]2Oc3ccc([C@H]4Oc5cc(O)cc(O)c5C(=O)[C@@H]4O)cc3O[C@@H]2CO)ccc1O | 0 |
| Cc1ccc(S(=O)(=O)Oc2ccccc2/N=N/c2ccc(O)cc2O)cc1 | CC[C@]1(O)CC[C@H]2[C@@H]3CCC4=CC(=O)C=C[C@]4(C)[C@H]3CC[C@@]21C | 2 |
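The SoftmaxLoss used here follows the SBERT paper's classification objective: the two sentence embeddings u and v are combined as (u, v, |u − v|), passed through a linear layer, and scored with cross-entropy against the pair's label. A minimal numpy sketch with random toy weights (the hypothetical head weights W are illustrative, not the trained ones):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, num_labels = 768, 3

# Toy sentence embeddings for one premise/hypothesis pair.
u = rng.standard_normal(dim)
v = rng.standard_normal(dim)

# SoftmaxLoss features: concatenate u, v, and |u - v| (3 * dim total).
features = np.concatenate([u, v, np.abs(u - v)])

# Hypothetical linear classification head followed by a softmax over labels.
W = rng.standard_normal((num_labels, 3 * dim)) * 0.01
logits = W @ features
probs = np.exp(logits - logits.max())
probs /= probs.sum()

label = 2  # the pair's class label, as in the samples above
loss = -np.log(probs[label])  # cross-entropy for this pair
print(probs.shape, float(loss))
```

At inference time the classification head is discarded; only the encoder and pooling layer are used to produce embeddings.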
Loss: SoftmaxLoss

Non-default hyperparameters:

- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- weight_decay: 0.001
- num_train_epochs: 10
- warmup_steps: 100
- fp16: True
- optim: adamw_torch

All hyperparameters:

- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.001
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 100
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: True
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0340 | 100 | 1.0284 |
| 0.0680 | 200 | 0.6905 |
| 0.1020 | 300 | 0.6582 |
| 0.1360 | 400 | 0.6328 |
| 0.1700 | 500 | 0.6171 |
| 0.2039 | 600 | 0.6074 |
| 0.2379 | 700 | 0.5843 |
| 0.2719 | 800 | 0.5956 |
| 0.3059 | 900 | 0.5776 |
| 0.3399 | 1000 | 0.5919 |
| 0.3739 | 1100 | 0.5756 |
| 0.4079 | 1200 | 0.5730 |
| 0.4419 | 1300 | 0.5725 |
| 0.4759 | 1400 | 0.5568 |
| 0.5099 | 1500 | 0.5684 |
| 0.5438 | 1600 | 0.5621 |
| 0.5778 | 1700 | 0.5623 |
| 0.6118 | 1800 | 0.5586 |
| 0.6458 | 1900 | 0.5585 |
| 0.6798 | 2000 | 0.5539 |
| 0.7138 | 2100 | 0.5485 |
| 0.7478 | 2200 | 0.5442 |
| 0.7818 | 2300 | 0.5419 |
| 0.8158 | 2400 | 0.5488 |
| 0.8498 | 2500 | 0.5369 |
| 0.8838 | 2600 | 0.5417 |
| 0.9177 | 2700 | 0.5364 |
| 0.9517 | 2800 | 0.5493 |
| 0.9857 | 2900 | 0.5419 |
| 1.0197 | 3000 | 0.5480 |
| 1.0537 | 3100 | 0.5403 |
| 1.0877 | 3200 | 0.5336 |
| 1.1217 | 3300 | 0.5397 |
| 1.1557 | 3400 | 0.5225 |
| 1.1897 | 3500 | 0.5296 |
| 1.2237 | 3600 | 0.5351 |
| 1.2576 | 3700 | 0.5361 |
| 1.2916 | 3800 | 0.5366 |
| 1.3256 | 3900 | 0.5359 |
| 1.3596 | 4000 | 0.5479 |
| 1.3936 | 4100 | 0.5280 |
| 1.4276 | 4200 | 0.5338 |
| 1.4616 | 4300 | 0.5297 |
| 1.4956 | 4400 | 0.5404 |
| 1.5296 | 4500 | 0.5355 |
| 1.5636 | 4600 | 0.5296 |
| 1.5976 | 4700 | 0.5333 |
| 1.6315 | 4800 | 0.5217 |
| 1.6655 | 4900 | 0.5243 |
| 1.6995 | 5000 | 0.5425 |
| 1.7335 | 5100 | 0.5275 |
| 1.7675 | 5200 | 0.5354 |
| 1.8015 | 5300 | 0.5284 |
| 1.8355 | 5400 | 0.5405 |
| 1.8695 | 5500 | 0.5246 |
| 1.9035 | 5600 | 0.5291 |
| 1.9375 | 5700 | 0.5209 |
| 1.9714 | 5800 | 0.5210 |
| 2.0054 | 5900 | 0.5248 |
| 2.0394 | 6000 | 0.5290 |
| 2.0734 | 6100 | 0.5166 |
| 2.1074 | 6200 | 0.5171 |
| 2.1414 | 6300 | 0.5165 |
| 2.1754 | 6400 | 0.5165 |
| 2.2094 | 6500 | 0.5144 |
| 2.2434 | 6600 | 0.5211 |
| 2.2774 | 6700 | 0.5147 |
| 2.3114 | 6800 | 0.5173 |
| 2.3453 | 6900 | 0.5201 |
| 2.3793 | 7000 | 0.5246 |
| 2.4133 | 7100 | 0.5112 |
| 2.4473 | 7200 | 0.5287 |
| 2.4813 | 7300 | 0.5105 |
| 2.5153 | 7400 | 0.5233 |
| 2.5493 | 7500 | 0.5214 |
| 2.5833 | 7600 | 0.5240 |
| 2.6173 | 7700 | 0.5208 |
| 2.6513 | 7800 | 0.5241 |
| 2.6852 | 7900 | 0.5206 |
| 2.7192 | 8000 | 0.5142 |
| 2.7532 | 8100 | 0.5203 |
| 2.7872 | 8200 | 0.5168 |
| 2.8212 | 8300 | 0.5169 |
| 2.8552 | 8400 | 0.5168 |
| 2.8892 | 8500 | 0.5159 |
| 2.9232 | 8600 | 0.5053 |
| 2.9572 | 8700 | 0.5190 |
| 2.9912 | 8800 | 0.5199 |
| 3.0252 | 8900 | 0.5033 |
| 3.0591 | 9000 | 0.5092 |
| 3.0931 | 9100 | 0.4983 |
| 3.1271 | 9200 | 0.5166 |
| 3.1611 | 9300 | 0.5157 |
| 3.1951 | 9400 | 0.5129 |
| 3.2291 | 9500 | 0.5060 |
| 3.2631 | 9600 | 0.5127 |
| 3.2971 | 9700 | 0.5082 |
| 3.3311 | 9800 | 0.5073 |
| 3.3651 | 9900 | 0.5092 |
| 3.3990 | 10000 | 0.5149 |
| 3.4330 | 10100 | 0.5049 |
| 3.4670 | 10200 | 0.5181 |
| 3.5010 | 10300 | 0.5031 |
| 3.5350 | 10400 | 0.5078 |
| 3.5690 | 10500 | 0.5178 |
| 3.6030 | 10600 | 0.5053 |
| 3.6370 | 10700 | 0.5135 |
| 3.6710 | 10800 | 0.5096 |
| 3.7050 | 10900 | 0.5143 |
| 3.7390 | 11000 | 0.5016 |
| 3.7729 | 11100 | 0.5138 |
| 3.8069 | 11200 | 0.5081 |
| 3.8409 | 11300 | 0.5022 |
| 3.8749 | 11400 | 0.5075 |
| 3.9089 | 11500 | 0.5092 |
| 3.9429 | 11600 | 0.5092 |
| 3.9769 | 11700 | 0.5142 |
| 4.0109 | 11800 | 0.5074 |
| 4.0449 | 11900 | 0.5104 |
| 4.0789 | 12000 | 0.5007 |
| 4.1128 | 12100 | 0.5057 |
| 4.1468 | 12200 | 0.5093 |
| 4.1808 | 12300 | 0.4931 |
| 4.2148 | 12400 | 0.5054 |
| 4.2488 | 12500 | 0.4965 |
| 4.2828 | 12600 | 0.5045 |
| 4.3168 | 12700 | 0.5095 |
| 4.3508 | 12800 | 0.5076 |
| 4.3848 | 12900 | 0.5053 |
| 4.4188 | 13000 | 0.5053 |
| 4.4528 | 13100 | 0.5019 |
| 4.4867 | 13200 | 0.5062 |
| 4.5207 | 13300 | 0.5033 |
| 4.5547 | 13400 | 0.5100 |
| 4.5887 | 13500 | 0.5000 |
| 4.6227 | 13600 | 0.5006 |
| 4.6567 | 13700 | 0.5106 |
| 4.6907 | 13800 | 0.5018 |
| 4.7247 | 13900 | 0.5045 |
| 4.7587 | 14000 | 0.5095 |
| 4.7927 | 14100 | 0.5010 |
| 4.8266 | 14200 | 0.4983 |
| 4.8606 | 14300 | 0.5008 |
| 4.8946 | 14400 | 0.5031 |
| 4.9286 | 14500 | 0.5064 |
| 4.9626 | 14600 | 0.5058 |
| 4.9966 | 14700 | 0.5019 |
| 5.0306 | 14800 | 0.5015 |
| 5.0646 | 14900 | 0.5063 |
| 5.0986 | 15000 | 0.5041 |
| 5.1326 | 15100 | 0.4993 |
| 5.1666 | 15200 | 0.5007 |
| 5.2005 | 15300 | 0.5030 |
| 5.2345 | 15400 | 0.5069 |
| 5.2685 | 15500 | 0.4998 |
| 5.3025 | 15600 | 0.5117 |
| 5.3365 | 15700 | 0.4992 |
| 5.3705 | 15800 | 0.5061 |
| 5.4045 | 15900 | 0.5034 |
| 5.4385 | 16000 | 0.5080 |
| 5.4725 | 16100 | 0.4992 |
| 5.5065 | 16200 | 0.5021 |
| 5.5404 | 16300 | 0.4950 |
| 5.5744 | 16400 | 0.4974 |
| 5.6084 | 16500 | 0.5040 |
| 5.6424 | 16600 | 0.4953 |
| 5.6764 | 16700 | 0.5058 |
| 5.7104 | 16800 | 0.4965 |
| 5.7444 | 16900 | 0.5035 |
| 5.7784 | 17000 | 0.5002 |
| 5.8124 | 17100 | 0.4952 |
| 5.8464 | 17200 | 0.4986 |
| 5.8804 | 17300 | 0.4951 |
| 5.9143 | 17400 | 0.4929 |
| 5.9483 | 17500 | 0.4989 |
| 5.9823 | 17600 | 0.4993 |
| 6.0163 | 17700 | 0.4936 |
| 6.0503 | 17800 | 0.5018 |
| 6.0843 | 17900 | 0.5017 |
| 6.1183 | 18000 | 0.4989 |
| 6.1523 | 18100 | 0.4924 |
| 6.1863 | 18200 | 0.4948 |
| 6.2203 | 18300 | 0.5039 |
| 6.2542 | 18400 | 0.5000 |
| 6.2882 | 18500 | 0.4964 |
| 6.3222 | 18600 | 0.4981 |
| 6.3562 | 18700 | 0.4995 |
| 6.3902 | 18800 | 0.4942 |
| 6.4242 | 18900 | 0.5018 |
| 6.4582 | 19000 | 0.4903 |
| 6.4922 | 19100 | 0.4867 |
| 6.5262 | 19200 | 0.5014 |
| 6.5602 | 19300 | 0.4963 |
| 6.5942 | 19400 | 0.4971 |
| 6.6281 | 19500 | 0.4957 |
| 6.6621 | 19600 | 0.4990 |
| 6.6961 | 19700 | 0.4994 |
| 6.7301 | 19800 | 0.4995 |
| 6.7641 | 19900 | 0.4978 |
| 6.7981 | 20000 | 0.5013 |
| 6.8321 | 20100 | 0.4970 |
| 6.8661 | 20200 | 0.4938 |
| 6.9001 | 20300 | 0.4954 |
| 6.9341 | 20400 | 0.4910 |
| 6.9680 | 20500 | 0.5043 |
| 7.0020 | 20600 | 0.4965 |
| 7.0360 | 20700 | 0.5033 |
| 7.0700 | 20800 | 0.5012 |
| 7.1040 | 20900 | 0.4954 |
| 7.1380 | 21000 | 0.4977 |
| 7.1720 | 21100 | 0.4882 |
| 7.2060 | 21200 | 0.4957 |
| 7.2400 | 21300 | 0.5027 |
| 7.2740 | 21400 | 0.4963 |
| 7.3080 | 21500 | 0.4941 |
| 7.3419 | 21600 | 0.4990 |
| 7.3759 | 21700 | 0.4936 |
| 7.4099 | 21800 | 0.4992 |
| 7.4439 | 21900 | 0.4987 |
| 7.4779 | 22000 | 0.4924 |
| 7.5119 | 22100 | 0.4956 |
| 7.5459 | 22200 | 0.4881 |
| 7.5799 | 22300 | 0.4873 |
| 7.6139 | 22400 | 0.4910 |
| 7.6479 | 22500 | 0.4934 |
| 7.6818 | 22600 | 0.4935 |
| 7.7158 | 22700 | 0.4904 |
| 7.7498 | 22800 | 0.4946 |
| 7.7838 | 22900 | 0.4926 |
| 7.8178 | 23000 | 0.4934 |
| 7.8518 | 23100 | 0.5024 |
| 7.8858 | 23200 | 0.4925 |
| 7.9198 | 23300 | 0.4964 |
| 7.9538 | 23400 | 0.4941 |
| 7.9878 | 23500 | 0.4993 |
| 8.0218 | 23600 | 0.4933 |
| 8.0557 | 23700 | 0.4959 |
| 8.0897 | 23800 | 0.4913 |
| 8.1237 | 23900 | 0.5027 |
| 8.1577 | 24000 | 0.4911 |
| 8.1917 | 24100 | 0.4930 |
| 8.2257 | 24200 | 0.4963 |
| 8.2597 | 24300 | 0.4928 |
| 8.2937 | 24400 | 0.4924 |
| 8.3277 | 24500 | 0.4995 |
| 8.3617 | 24600 | 0.4957 |
| 8.3956 | 24700 | 0.4871 |
| 8.4296 | 24800 | 0.4894 |
| 8.4636 | 24900 | 0.4978 |
| 8.4976 | 25000 | 0.4936 |
| 8.5316 | 25100 | 0.4898 |
| 8.5656 | 25200 | 0.4927 |
| 8.5996 | 25300 | 0.5016 |
| 8.6336 | 25400 | 0.4883 |
| 8.6676 | 25500 | 0.5005 |
| 8.7016 | 25600 | 0.4898 |
| 8.7356 | 25700 | 0.4923 |
| 8.7695 | 25800 | 0.4973 |
| 8.8035 | 25900 | 0.4881 |
| 8.8375 | 26000 | 0.4898 |
| 8.8715 | 26100 | 0.4888 |
| 8.9055 | 26200 | 0.4908 |
| 8.9395 | 26300 | 0.4960 |
| 8.9735 | 26400 | 0.4918 |
| 9.0075 | 26500 | 0.4892 |
| 9.0415 | 26600 | 0.4951 |
| 9.0755 | 26700 | 0.4905 |
| 9.1094 | 26800 | 0.4882 |
| 9.1434 | 26900 | 0.4857 |
| 9.1774 | 27000 | 0.4972 |
| 9.2114 | 27100 | 0.4915 |
| 9.2454 | 27200 | 0.4907 |
| 9.2794 | 27300 | 0.4932 |
| 9.3134 | 27400 | 0.4953 |
| 9.3474 | 27500 | 0.4848 |
| 9.3814 | 27600 | 0.4836 |
| 9.4154 | 27700 | 0.4903 |
| 9.4494 | 27800 | 0.4988 |
| 9.4833 | 27900 | 0.4850 |
| 9.5173 | 28000 | 0.4924 |
| 9.5513 | 28100 | 0.4953 |
| 9.5853 | 28200 | 0.5011 |
| 9.6193 | 28300 | 0.4921 |
| 9.6533 | 28400 | 0.4914 |
| 9.6873 | 28500 | 0.4879 |
| 9.7213 | 28600 | 0.4904 |
| 9.7553 | 28700 | 0.4910 |
| 9.7893 | 28800 | 0.5001 |
| 9.8232 | 28900 | 0.4934 |
| 9.8572 | 29000 | 0.4837 |
| 9.8912 | 29100 | 0.5025 |
| 9.9252 | 29200 | 0.4973 |
| 9.9592 | 29300 | 0.4856 |
| 9.9932 | 29400 | 0.4999 |
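The non-default hyperparameters reported above map onto the sentence-transformers v3+ training API roughly as follows. This is a hedged configuration sketch, not the training script: `output_dir` is a placeholder, and the dataset loading and loss construction are omitted.

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Sketch of the reported non-default settings; output_dir is a placeholder.
args = SentenceTransformerTrainingArguments(
    output_dir="output",                 # placeholder path
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=5e-5,                  # the library default, listed above
    weight_decay=0.001,
    num_train_epochs=10,
    warmup_steps=100,
    fp16=True,
    optim="adamw_torch",
)
```

Together with a SoftmaxLoss head, these arguments would be passed to a SentenceTransformerTrainer to reproduce a comparable run.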
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
Base model: google-bert/bert-base-cased