metadata
base_model: sentence-transformers/all-mpnet-base-v2
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:281362
- loss:CachedMultipleNegativesRankingLoss
widget:
- source_sentence: >-
steel lock washer 8 x 144 2 mm zinc plated split 2004 bmw 325ci base
convertible miscellaneous hardware page 3 auveco 17397m769 automotive
sentences:
- >-
steel lock washer 8 x 144 2 mm zinc plated split 1993 bmw 318i base
sedan miscellaneous hardware page 3 auveco 17397m769 automotive
- >-
generac protector rg03624ansx standby generators liquidcooled reviews
ratings product discontinued discontinued electric directcom
696471617450 toolsandhomeimprovement
- >-
drive belt tensioner water pumpalternator 1994 bmw 325i base convertible
charging system battery page 6 note shock type hydraulic ina
11281717188m40 automotive
- source_sentence: >-
nokya hyper white front turn signal light bulbs 2010 toyota camry please
double check your bulbs to make sure we have the right replacement bulb
listed so there is arguably even an added benefit of increased safety
tooplease note then you almost have change corner lights too avoid ruining
benefits new headlights give look car also signal in style with these
nokya hyper white front turn signal bulbs instead ugly stock orange 1010
camry came while try be as accurate possible our listings custom front
definitely stand out more compared other could whole assembly a set in
case where are already changing headlight color nok52022pcs automotive
sentences:
- >-
datalogic accessories for readers codbc9180433 datalogic codstdp090
datalogic base stationcharger ethernet datalogic bc9180433
computersandaccessories
- >-
nokya hyper white front turn signal light bulbs 2010 toyota camry please
double check your bulbs to make sure we have the right replacement bulb
listed so there is arguably even an added benefit of increased safety
tooplease note then you almost have change corner lights too avoid
ruining benefits new headlights give look car also signal in style with
these nokya hyper white front turn signal bulbs instead ugly stock
orange 1010 camry came while try be as accurate possible our listings
custom front definitely stand out more compared other could whole
assembly a set in case where are already changing headlight color
nok52022pcs automotive
- >-
39400001 axor citterio wall mounted bath tub filler faucetnohtin
39034821 bathroom faucet tall and handle brushed sale appliance specials
and replacement parts axor citterio revives the opulence of water and
redefines the purity of space each arch angle and line weds clarity and
harmony evoking timeless classics that are mysterious yet somehow
familiar discover a new form of luxury with axor citterio axor
232848id39400001 toolsandhomeimprovement
- source_sentence: >-
canon pixus 865r cartridges for ink jet printers quillcom null
901tgbci6bkclo officeproducts
sentences:
- >-
smart racing products smartcamber digital camber gauge 2003 bmw 325ci
base convertible suspension upgrades performance page 7 pel1850070smrt
automotive valving option street comfort front spring 180mm 8kg rear
spring 135mm 10kg front pillowball pillowball w camber plates rear
pillowball n1 basic w top plates no camber plates valving option street
sport front spring 180mm 8kg rear spring 135mm 10kg front pillowball
pillowball w camber plates rear pillowball basic w top plates no camber
plates valving option track race front spring 180mm 10kg rear spring
140mm 10kg front pillowball pillowball w camber plates rear pillowball
basic w top plates no camber plates
- >-
datalogic cable for readers cod90a051903 datalogic cod90a051330
datalogic cable cab413 usb straight ibm pos mode datalogic 90a051903
computersandaccessories
- >-
canon pixus 865r cartridges for ink jet printers quillcom null
901tgbci6bkclo officeproducts
- source_sentence: >-
headlamp restoration kit sonax 2002 bmw 325i base wagon lights and lenses
page 7 note removes yellowing and haze of plastic headlight lenses
restoring likenew clarity one kit restores four headlights simple three
step process requires no polishing machine step one use the circular
sanding pad to gently remove stubborn headlight hazing step two use the
abrasive polish and application pad to gently remove sanding marks step
three use the towelette to apply a uv protective coating to maintain
headlight clarity contains qty 1 75 ml polish 4 sanding discs 5000 grit 2
application sponges 4 polishing cloths 2 moist cloths with sealant sonax
405941m941 automotive
sentences:
- >-
headlamp restoration kit sonax 1976 bmw 30si base sedan lights and
lenses page 2 note removes yellowing and haze of plastic headlight
lenses restoring likenew clarity one kit restores four headlights simple
three step process requires no polishing machine step one use the
circular sanding pad to gently remove stubborn headlight hazing step two
use the abrasive polish and application pad to gently remove sanding
marks step three use the towelette to apply a uv protective coating to
maintain headlight clarity contains qty 1 75 ml polish 4 sanding discs
5000 grit 2 application sponges 4 polishing cloths 2 moist cloths with
sealant sonax 405941m941 automotive
- >-
philips ultinon led lighting 2122w 43mm festoon white 1 piece 1996 bmw
318i base convertible lights and lenses page 3 phi2122ulwx11 automotive
- >-
canon pixma mx850 cartridges for ink jet printers quillcom trust genuine
canon cli8bk ink cartridges to provide outstanding print quality for all
your important photos and documentsunlike bargain replacement inks
original canon cli8bk ink cartridges are designed specifically to work
with canon printers for exceptional reliability and performancehave full
photolithography inkjet nozzle engineering 901cli8bk officeproducts
- source_sentence: >-
phone cable flat 4 wire solid silver 1000ft 26awg wire solid 1000ft phone
cable flat 4 wire solid silver 1000ft 26awg allows you to connect your
telephones faxes answering machines and most modems perfect for all your
custom installation projects 1000ft roll bulk phone cable flat cable
silver color 4 conductor 26 awg solid copper ul listed 815239013642
otherelectronics
sentences:
- >-
phone cable flat 4 wire solid silver 1000ft 26awg wire solid 1000ft
phone cable flat 4 wire solid silver 1000ft 26awg allows you to connect
your telephones faxes answering machines and most modems perfect for all
your custom installation projects 1000ft roll bulk phone cable flat
cable silver color 4 conductor 26 awg solid copper ul listed
815239013642 otherelectronics
- >-
soul black gb 2013 audi a4 allroad quattro canada market body middle
armrest front pr6e3gb fz period 1111 gb 8k0864207jtq8 automotive
- >-
flashlight streamlight stinger led 1970 bmw 1602 base coupe tools page 8
note compact and extremely powerful with 3 microprocessor controlled
intensity modes strobe mode and the latest in power led technology 6000
series machined aircraft aluminum with nonslip rubberized comfort grip
with antiroll rubber ring unbreakable polycarbonate lens with
scratchresistant coating oring sealed c4 led technology impervious to
shock with a 50000 hour lifetime includes qty 2 3cell 36 volt nicd subc
battery rechargeable upto 1000 times 1 piggy back chargerholder 1 120v
ac charge cord 1 12v dc charge cord 841 inch length 162 inch major
diameter 117 inch body diameter light output 350 lumens on high 175
lumens on medium 85 lumens on low streamlight blue 552480010m1272
toolsandhomeimprovement
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
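The Pooling and Normalize stages listed above can be illustrated with a small, dependency-free sketch. It assumes mean pooling over non-padded tokens followed by L2 normalization, as the configuration indicates; the function names `mean_pool` and `l2_normalize` and the toy vectors are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in summed]

def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input: 3 token vectors of width 2; the last position is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings reduces to their dot product.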
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'phone cable flat 4 wire solid silver 1000ft 26awg wire solid 1000ft phone cable flat 4 wire solid silver 1000ft 26awg allows you to connect your telephones faxes answering machines and most modems perfect for all your custom installation projects 1000ft roll bulk phone cable flat cable silver color 4 conductor 26 awg solid copper ul listed 815239013642 otherelectronics',
'phone cable flat 4 wire solid silver 1000ft 26awg wire solid 1000ft phone cable flat 4 wire solid silver 1000ft 26awg allows you to connect your telephones faxes answering machines and most modems perfect for all your custom installation projects 1000ft roll bulk phone cable flat cable silver color 4 conductor 26 awg solid copper ul listed 815239013642 otherelectronics',
'soul black gb 2013 audi a4 allroad quattro canada market body middle armrest front pr6e3gb fz period 1111 gb 8k0864207jtq8 automotive',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
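As a possible follow-up to the usage snippet, the sketch below picks the closest other item for each row of a similarity matrix. The helper `best_matches` is hypothetical, and the scores are made-up values shaped like the 3 x 3 matrix above, not real model output. With a real tensor from `model.similarity`, one option is to convert it with `similarities.tolist()` first.

```python
def best_matches(similarities):
    """For each row, return (index, score) of the most similar other item."""
    matches = []
    for i, row in enumerate(similarities):
        # Exclude the diagonal so an item is never matched with itself.
        candidates = [(score, j) for j, score in enumerate(row) if j != i]
        score, j = max(candidates)
        matches.append((j, score))
    return matches

# Hypothetical scores (illustration only).
toy_similarities = [
    [1.00, 0.98, 0.12],
    [0.98, 1.00, 0.10],
    [0.12, 0.10, 1.00],
]
print(best_matches(toy_similarities))
# [(1, 0.98), (0, 0.98), (0, 0.12)]
```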
Training Details
Training Dataset
Unnamed Dataset
- Size: 281,362 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 14 tokens; mean: 77.68 tokens; max: 384 tokens | min: 20 tokens; mean: 79.97 tokens; max: 384 tokens |
- Samples:

| anchor | positive |
|---|---|
| glue tamiya cement 40ml 12 johnn johnny herbert gb shunko models marking livery 120 scale lotus ford type 102d camel 11 tam20033 and tam20034 ref shkd310 decals markings f1 cars 90 years spotmodel derek warwick japan grand prix 1992 water slide decals assembly instructions for references tam20030 tamiya tam87003 automotive | glue tamiya cement 40ml shunko models marking livery 120 scale benetton ford b192 camel 19 20 michael schumacher de martin brundle gb fia formula 1 world championship 1992 water slide decals and assembly instructions for reference tam20036 ref shkd281 decals markings f1 cars 90 years spotmodel tamiya tam87003 automotive |
| hose clamp 29325 mm range 12 width spring type 1995 bmw 325i base sedan radiators page 3 mubea sc2932512m219 automotive | hose clamp 29325 mm range 12 width spring type bmw 7series e65 20022008 cooling system miscellaneous page 1 mubea sc2932512m219 automotive part 07129952131boe more info 760i 200406 760li 200308 part 11151726339m395 more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 16121180240m395 more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 16121180240boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 16121180242boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 32411156956m395 more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 32411156956boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 32411712735boe more info 745i and 745li 200205 760i 200406 760li 200308 alpina b7 200708 part 32416751127m9 more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 64218367179boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 07129952102boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 07129952123boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 12511309471boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 16121176918boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 alpina b7 200708 part 11631716970boe more info 745i and 745li 200205 750i and 750li 200608 760i 200406 760li 200308 |
| serial rj45 interlocking cable codak17463008 zebra europe qlrwp4t series lithium ion fast charger codat187373 zebra serial rj45 interlocking cable zebra ak17463008 computersandaccessories | zebra universal accessories other by totalbarcodecom zebra ak17463008 kit mod plug to 9pin db pc cable ak17463008 computersandaccessories |

- Loss: CachedMultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
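To make the loss parameters more concrete, here is a minimal sketch of an in-batch ranking objective that uses `scale` = 20.0 and cosine similarity. It is an illustration under stated assumptions, not the library's implementation: it shows only the plain multiple-negatives objective (each anchor's own positive is the target and the other positives in the batch act as negatives) and leaves out the gradient caching that the "Cached" variant adds to save memory. The function names and toy vectors are hypothetical.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy 2-D embeddings (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]
loss_aligned = in_batch_ranking_loss(anchors, aligned)
loss_swapped = in_batch_ranking_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The loss is close to zero when each anchor is nearest to its own positive and grows large when the pairs are swapped.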
Evaluation Dataset
Unnamed Dataset
- Size: 70,341 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 21 tokens; mean: 83.4 tokens; max: 384 tokens | min: 19 tokens; mean: 83.0 tokens; max: 384 tokens |
- Samples:

| anchor | positive |
|---|---|
| coolant antifreeze blue 1 liter 1996 bmw 318is base coupe radiators page 1 note approved for all bmw and mini engines concentrate for distilled water see part 55 7864 010 fuchs maintain fricofin 82142209769m865 automotive | coolant antifreeze blue 1 liter 1996 bmw 318is base coupe radiators page 1 note approved for all bmw and mini engines concentrate for distilled water see part 55 7864 010 genuine bmw 82142209769m9 automotive |
| sealing compound loctite rtv 5699 gray silicone gasket maker 80 ml tube and supplies page 2 1991 bmw 318i base convertible engine rebuilding kits tools note high performance and noncorrosive designed for high torque applications loctite 37464m258 automotive | sealing compound loctite rtv 5699 gray silicone gasket maker 80 ml tube and supplies page 2 1991 bmw 318i base convertible engine rebuilding kits tools note high performance and noncorrosive designed for high torque applications loctite 37464m258 automotive |
| lexmark remanufactured 18c2090 14 black ink cartridge lexmark x2630 cartridges 4inkjets remanlx14 officeproducts | remanufactured lexmark inkjet cartridge 18c2090 14 black ink lexmark z2320 ink cartridges and printer supplies inkcartridges remanlx14 officeproducts |

- Loss: CachedMultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- learning_rate: 1e-05
- num_train_epochs: 2
- warmup_ratio: 0.1
- fp16: True
- auto_find_batch_size: True
- batch_sampler: no_duplicates
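The interaction between `learning_rate`, `warmup_ratio`, and the `lr_scheduler_type: linear` entry in the full hyperparameter list can be sketched as follows. This is a simplified illustration, assuming the warmup length is the total step count times `warmup_ratio`, rounded up, followed by a linear decay to zero. The function name `linear_schedule_lr` and the run length of 1000 steps are hypothetical.

```python
import math

def linear_schedule_lr(step, total_steps, peak_lr=1e-05, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Hypothetical run length, chosen only for illustration.
total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, total_steps))
```

With these values the rate peaks at 1e-05 at step 100 and is back to half of that at step 550.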
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: True
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.1990 | 7000 | 0.0113 | 0.0031 |
| 0.3981 | 14000 | 0.0022 | 0.0019 |
| 0.5971 | 21000 | 0.0019 | 0.0012 |
| 0.7961 | 28000 | 0.0017 | 0.0012 |
| 0.9951 | 35000 | 0.0013 | 0.0011 |
| 1.1942 | 42000 | 0.0012 | 0.0008 |
| 1.3932 | 49000 | 0.0005 | 0.0008 |
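The Epoch column can be related to the Step column with a short calculation. The sketch below assumes an effective batch size of 8 (the listed `per_device_train_batch_size`, on a single device, with no gradient accumulation, and with `auto_find_batch_size` never having lowered it) and that the last partial batch is kept. Under those assumptions the computed fractions agree with the logged values. The helper names are hypothetical.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

def epoch_at_step(step, num_samples, batch_size):
    """Fraction of epochs completed after `step` optimizer steps."""
    return step / steps_per_epoch(num_samples, batch_size)

# 281,362 training samples; batch size 8 is an assumption (see above).
per_epoch = steps_per_epoch(281_362, 8)
print(per_epoch)
# 35171
for step in (7000, 35000, 49000):
    print(step, round(epoch_at_step(step, 281_362, 8), 4))
# 7000 0.199
# 35000 0.9951
# 49000 1.3932
```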
Framework Versions
- Python: 3.10.13
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.2.1
- Accelerate: 0.33.0
- Datasets: 2.21.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
CachedMultipleNegativesRankingLoss
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}