Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
How to use LamaDiab/FinetunningMiniLM-V18Data-256ConstantBATCH-SemanticEngine with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LamaDiab/FinetunningMiniLM-V18Data-256ConstantBATCH-SemanticEngine")
sentences = [
    "gerber baby food fruits apples bananas & cereal",
    "world of sweets puzzle",
    "baby food",
    "baby food",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```

This is a sentence-transformers model finetuned from LamaDiab/v4MiniLM-V18Data-256ConstantBATCH-SemanticEngine. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
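Because the model's final module L2-normalizes every embedding, semantic search over this vector space reduces to a dot product between the query embedding and the corpus embeddings. A minimal sketch with stand-in vectors (in practice `query` and `corpus` would come from `model.encode(...)`; the helper name `top_k` is illustrative, not part of the library):

```python
import numpy as np

def top_k(query_emb: np.ndarray, corpus_embs: np.ndarray, k: int = 2):
    """Return indices of the k most similar corpus entries plus all scores.

    Assumes embeddings are already L2-normalized (as this model's
    Normalize() module guarantees), so dot product == cosine similarity.
    """
    scores = corpus_embs @ query_emb          # (n_corpus,)
    return np.argsort(-scores)[:k], scores    # descending by similarity

# Stand-in 384-dim unit vectors in place of real sentence embeddings.
rng = np.random.default_rng(1)
corpus = rng.standard_normal((3, 384))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
query = corpus[0] + 0.1 * rng.standard_normal(384)  # near corpus entry 0
query /= np.linalg.norm(query)

idx, scores = top_k(query, corpus)
print(idx[0])  # 0 — the nearest corpus entry
```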
Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
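The Pooling and Normalize modules above are simple to reproduce by hand: mean-pool the transformer's token embeddings over non-padding positions, then L2-normalize. A sketch with hypothetical token embeddings standing in for the BertModel output at module (0):

```python
import numpy as np

# Hypothetical token embeddings for one sentence: 4 tokens x 384 dims
# (in the real model these come from the BertModel at module (0)).
rng = np.random.default_rng(0)
token_embeddings = rng.standard_normal((4, 384))
attention_mask = np.array([1, 1, 1, 0])  # last position is padding

# Module (1): mean pooling over non-padding tokens only.
mask = attention_mask[:, None]                   # (4, 1)
mean_pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()

# Module (2): L2-normalize, so dot products equal cosine similarities.
sentence_embedding = mean_pooled / np.linalg.norm(mean_pooled)

print(sentence_embedding.shape)  # (384,)
```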
First install the Sentence Transformers library:

```shell
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("LamaDiab/FinetunningMiniLM-V18Data-256ConstantBATCH-SemanticEngine")
# Run inference
sentences = [
    'italian dolce provolone',
    'experience the authentic taste of italy with our italian dolce provolone. indulge in its creamy texture, delicate flavors, and versatility in both simple and sophisticated culinary creations.',
    'trident - gum strawberry flavor - 5 per pack',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8072, 0.1772],
#         [0.8072, 1.0000, 0.2218],
#         [0.1772, 0.2218, 1.0000]])
```
Evaluated with TripletEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 0.9706 |
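TripletEvaluator's cosine_accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is more cosine-similar to its positive than to its negative. A minimal re-implementation with toy 2-dimensional stand-in embeddings (the function name is illustrative, not the library's API):

```python
import numpy as np

def triplet_cosine_accuracy(anchors, positives, negatives):
    """Fraction of triplets with cos(anchor, positive) > cos(anchor, negative)."""
    def row_cos(a, b):
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return (a * b).sum(axis=1)
    return float(np.mean(row_cos(anchors, positives) > row_cos(anchors, negatives)))

# Toy example: the first triplet is ranked correctly, the second is inverted.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.9, 0.1], [1.0, 0.1]])  # second "positive" is far from its anchor
n = np.array([[0.0, 1.0], [0.1, 1.0]])  # second "negative" is close to its anchor
print(triplet_cosine_accuracy(a, p, n))  # 0.5
```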
Training dataset columns: anchor, positive, and itemCategory (all strings). Examples:

| anchor | positive | itemCategory |
|---|---|---|
| mango nos nos small | milk chocolate ganache cake | sweet |
| lux soap creamy perfection 165 gm | soap | hand soap |
| grey deo original | classic deodrant | women's deodorant |
Loss: MultipleNegativesSymmetricRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
```
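MultipleNegativesSymmetricRankingLoss treats the i-th positive in a batch as the only correct match for the i-th anchor (all other in-batch positives act as negatives), multiplies the cosine-similarity matrix by `scale`, and averages cross-entropy over both ranking directions (anchor→positive and positive→anchor). A rough numpy sketch of that idea, not the library's implementation:

```python
import numpy as np

def symmetric_mnr_loss(anchors, positives, scale=20.0):
    # Cosine-similarity matrix between all anchors and all in-batch positives.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # (batch, batch); correct match on the diagonal

    def cross_entropy(logits):
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))  # target class for row i is column i

    # Symmetric: rank positives given anchors AND anchors given positives.
    return 0.5 * (cross_entropy(scores) + cross_entropy(scores.T))

rng = np.random.default_rng(0)
anchors = rng.standard_normal((4, 384))
loss_random = symmetric_mnr_loss(anchors, rng.standard_normal((4, 384)))
loss_matched = symmetric_mnr_loss(anchors, anchors)  # perfectly aligned pairs
print(loss_matched < loss_random)  # True
```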
Evaluation dataset columns: anchor, positive, negative, and itemCategory (all strings). Examples:

| anchor | positive | negative | itemCategory |
|---|---|---|---|
| pilot mechanical pencil progrex h-127 - 0.7 mm | office supplies | scary halloween skull mask | pencil |
| superior drawing marker -pen - set of 12 colors - 2 nib | superior | coloring and writing book 21 x 29.7 cm 100 gsm 18 pages number subtraction ma4014 | marker |
| first person singular author: haruki murakami | haruki murakami book | buried secrets | literature and fiction |
Loss: MultipleNegativesSymmetricRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
```
Non-default training hyperparameters (all other Trainer arguments were left at their defaults):

```
eval_strategy: steps
per_device_train_batch_size: 256
per_device_eval_batch_size: 256
learning_rate: 2e-05
weight_decay: 0.001
num_train_epochs: 4
warmup_ratio: 0.2
fp16: True
dataloader_num_workers: 1
dataloader_prefetch_factor: 2
dataloader_persistent_workers: True
push_to_hub: True
hub_model_id: LamaDiab/FinetunningMiniLM-V18Data-256ConstantBATCH-SemanticEngine
hub_strategy: all_checkpoints
```

Training log:

| Epoch | Step | Training Loss | Validation Loss | cosine_accuracy |
|---|---|---|---|---|
| 0.0004 | 1 | 1.2042 | - | - |
| 0.3626 | 1000 | 1.1885 | 0.3903 | 0.9712 |
| 0.7252 | 2000 | 0.8207 | 0.3788 | 0.9699 |
| 1.0877 | 3000 | 0.7257 | 0.3854 | 0.9716 |
| 1.4500 | 4000 | 0.9126 | 0.3814 | 0.9709 |
| 1.8123 | 5000 | 0.8911 | 0.3810 | 0.9711 |
| 2.1746 | 6000 | 0.8678 | 0.3830 | 0.9716 |
| 2.5370 | 7000 | 0.8551 | 0.3813 | 0.9721 |
| 2.8993 | 8000 | 0.8576 | 0.3842 | 0.9715 |
| 3.2616 | 9000 | 0.8325 | 0.3848 | 0.9713 |
| 3.6239 | 10000 | 0.8453 | 0.3842 | 0.9710 |
| 3.9862 | 11000 | 0.8170 | 0.3848 | 0.9706 |
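The non-default hyperparameters listed above map directly onto `SentenceTransformerTrainingArguments`. A sketch of how a comparable run could be set up (the dataset file name is a placeholder, not taken from this card, and the actual training data is not published here):

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesSymmetricRankingLoss

# Base checkpoint named in this card; "train.csv" is a placeholder path.
model = SentenceTransformer("LamaDiab/v4MiniLM-V18Data-256ConstantBATCH-SemanticEngine")
train_dataset = load_dataset("csv", data_files="train.csv")["train"]

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    num_train_epochs=4,
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    learning_rate=2e-5,
    weight_decay=0.001,
    warmup_ratio=0.2,
    fp16=True,
    eval_strategy="steps",
    dataloader_num_workers=1,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesSymmetricRankingLoss(model, scale=20.0),
)
trainer.train()
```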
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
Base model: sentence-transformers/all-MiniLM-L6-v2