metadata
base_model: Alibaba-NLP/gte-large-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:269761
- loss:CachedMultipleNegativesRankingLoss
widget:
- source_sentence: >-
netgear ac1900 nighthawk smart wifi router netgear wireless broadband
routers cdwcom the netgear ac1900 r7000 nighthawk smart wifi router is
specially designed for gaming streaming and mobile devices with speeds up
to 1900 mbps and a 1 ghz dualcore processor this next generation wireless
router offers extreme speed with reduced lag and less buffering this
internet router comes with advanced features such as netgear genie remote
access readycloud openvpn and kwilt app support so you can manage your
network access a secure personal cloud access home network remotely and
share photos stored on the storage from anywherewifi router with 600 1300
mbps speeds for online gaming streaming and more1 ghz dualcore processor
and prioritized bandwidth for streaming videosreadycloud usb access for
secure cloud access to usb storage at anytimemanage home network and
provide guest access remotely using netgear genie computersandaccessories
sentences:
- >-
unirex s2 grease 40g tube bearing note recommended for high temperature
service in rolling bearings 1995 bmw 325i base convertible axles
bearings differential page 8 note recommended for high temperature
service in rolling bearings genuine bmw automotive
- >-
netgear nighthawk ac1900 dual band wifi gigabit router r7000 with open
source support compatible amazon echoalexa us netgear compatibleus
accelerate your wifi with net gear nighthawk enjoy the fastest wifi
currently available with speeds up to 1900 mbps and a powerful dual core
1ghz processor for extreme performance highpowered amplifiers external
antennas and beamforming improve range and reliability for up to 100
more wireless coverage features like dynamic qos prioritize streaming
and gaming creating a blazingfast lagfree wifi experience r7000 provides
an extensible design that enables service prioritization for data design
that delivers high availability scalability and for maximum flexibility
and priceperformance us netgearus computersandaccessories manufacturer
netgear brand netgear color black model upc 606449099812 item weight 345
pounds item size 1008 x 311 x 311 inches package weight 344 pounds
package size 1047 x 331 x 331 inches units in package 1
- >-
pads high performance ebc pads pads ebc notes rear set of 4 performance
pads ebc greenstuff price per set length mm 108 height mm 44 automotive
- source_sentence: 12v drill impact driver twin pack gtpddid12 toolsandhomeimprovement
sentences:
- >-
original ihip universal mlb licensed tampa bay devil rays noise
isolating earbuds 35mm navy blue white samsung galaxy tab 77
accessoriesgalaxy accessoriesclick now accessorygeekscom
cellphonesandaccessories
- >-
canon pixma mp160 combo pack genuine canon ink cartridges cartridges
inkrediblecouk combo pack contains 1 black 16ml and 1 colour 12ml
officeproducts
- >-
gmc 12v drill and impact driver twin pack pack 3233836 argos price
tracker pricehistorycouk gmc toolsandhomeimprovement date price 02
august 2017 10599 21 june 2017 9099 22 january 2016 9999 we started
tracking this product on 22 january 2016
- source_sentence: >-
throttle housing assembly 2002 bmw 325ci base coupe intake system page 3
genuine bmw automotive
sentences:
- >-
2017 bmw i3 94 ah with range extender california 91307 2015 extender
lease special promotion on rex electric a for 35000 per month west hills
automotive
- >-
oil filter spin on type pc 201 style 1969 bmw 1602 base coupe oil
circulation page 1 mahle automotive
- >-
throttle housing assembly 2002 bmw 325ci base coupe intake system page 3
continental vdo automotive
- source_sentence: >-
bracket without bushing for control arm front right lower 1990 bmw 325i
base coupe suspension shocks springs page 6 note does not come w mounting
bushing front right meyle automotive
sentences:
- >-
bracket without bushing for control arm front right lower 1990 bmw 325i
base coupe suspension shocks springs page 6 note does not come w
mounting bushing front right meyle automotive
- >-
mohawk industries wsk120 oak golden engineered hardwood flooring 5 wide
planks 1969 sf carton wsk120 cork bamboo tile more anderson 96in base
shoe accessory sale price sq ft oak golden oak engineered hardwood
flooring 5 wide planks 1969 sf carton mohawk industries wsk120 oak
golden oak engineered hardwood flooring 5 wide planks 1969 sf carton
wsk120 instock mohawk industries toolsandhomeimprovement
- >-
brizo towel ring charlotte products at efaucetscom towel ring charlotte
collection toolsandhomeimprovement
- source_sentence: >-
alpinestars 140 holdall gear bag alpinestars fl yellowredanthracite radar
flight gloves 35618185392x comfortable glove lightweight customized fit
silicone grip patterning on fingers for improved riding control included
items 2 gloves made with 46 synthetic suede 35 polyester 19 polyamide care
instructions do not wash bleach tumble dry iron clean single layer clarino
palm is breathable and offers excellent feel the bikes controls reinforced
thumb construction increases durability gusset flexibility innovative
stretch insert in area hand movement lever reinforcements third fourth
added abrasion resistance convenient slipon design a secure singlepiece
fabric upper gives perforated ergonomic chassis reduced material result
supremely lightweight alpinestars automotive
sentences:
- >-
cover with spring and heater elementfor carburetor gb 1980 volkswagen
jetta united states market engine carburetor versions 1 b 3 jb ghgb 1357
gb automotive
- >-
alpinestars 140 holdall gear bag alpinestars fl yellowredanthracite
radar flight gloves 35618185392x comfortable glove lightweight
customized fit silicone grip patterning on fingers for improved riding
control included items 2 gloves made with 46 synthetic suede 35
polyester 19 polyamide care instructions do not wash bleach tumble dry
iron clean single layer clarino palm is breathable and offers excellent
feel the bikes controls reinforced thumb construction increases
durability gusset flexibility innovative stretch insert in area hand
movement lever reinforcements third fourth added abrasion resistance
convenient slipon design a secure singlepiece fabric upper gives
perforated ergonomic chassis reduced material result supremely
lightweight alpinestars automotive
- >-
td 8000k xenon hid kit high beam 0910 mercedes benz cl600 c216 h7 xenon
hid lighting is only available on high end luxury cars you can convert
your stock halogens to super bright too by just connecting a few plug
and play connections then mounting the ballast in secure spot but with
this mercedes cl600 low watt 8000k td hid high beam conversion kit
experience supreme brightness expanded field of vision also our wattage
systems are backed by full one year warrantyplease note will not work if
cl600s headlights came equipped factory lights unlike cheaper market
more consistently without fading out like coated bulbs dousually
mercedes installations probably most common upgrades performed increase
headlight cl600 producing certain temperatures technology that uses
xenon gas charged bulb combination an electronic regulate current going
through it the resulting light 35 be up 3 times brighter than
traditional halogen bulbs kits reliably produce truer colored we offer
conversion kit short for intensity discharge automotive
SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5
This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-large-en-v1.5
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
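The pooling configuration above enables only pooling_mode_cls_token, so the sentence embedding is the hidden state at the first ([CLS]) position rather than an average over tokens. A minimal pure-Python sketch of that difference follows; the helper names and the 3-token, 4-dimensional toy input are illustrative assumptions (the real model produces 1024-dimensional token vectors), not the library's internals.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector]) -> Vector:
    """Average all token vectors; shown only for contrast with CLS pooling."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    count = len(token_embeddings)
    return [sum(column) / count for column in zip(*token_embeddings)]


# Hypothetical toy input: 3 tokens, 4 dimensions each.
toy_tokens = [
    [0.1, 0.2, 0.3, 0.4],  # [CLS] position
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.8, 0.7, 0.6],
]

cls_embedding = cls_pool(toy_tokens)
mean_embedding = mean_pool(toy_tokens)
```

With CLS pooling the other token vectors influence the result only through the transformer's attention layers, not through the pooling step itself.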
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
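# Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder for this model's Hub ID).
# The base model, Alibaba-NLP/gte-large-en-v1.5, ships custom modeling code (NewModel),
# so loading typically requires trust_remote_code=True.
model = SentenceTransformer("sentence_transformers_model_id", trust_remote_code=True)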
# Run inference
sentences = [
'alpinestars 140 holdall gear bag alpinestars fl yellowredanthracite radar flight gloves 35618185392x comfortable glove lightweight customized fit silicone grip patterning on fingers for improved riding control included items 2 gloves made with 46 synthetic suede 35 polyester 19 polyamide care instructions do not wash bleach tumble dry iron clean single layer clarino palm is breathable and offers excellent feel the bikes controls reinforced thumb construction increases durability gusset flexibility innovative stretch insert in area hand movement lever reinforcements third fourth added abrasion resistance convenient slipon design a secure singlepiece fabric upper gives perforated ergonomic chassis reduced material result supremely lightweight alpinestars automotive',
'alpinestars 140 holdall gear bag alpinestars fl yellowredanthracite radar flight gloves 35618185392x comfortable glove lightweight customized fit silicone grip patterning on fingers for improved riding control included items 2 gloves made with 46 synthetic suede 35 polyester 19 polyamide care instructions do not wash bleach tumble dry iron clean single layer clarino palm is breathable and offers excellent feel the bikes controls reinforced thumb construction increases durability gusset flexibility innovative stretch insert in area hand movement lever reinforcements third fourth added abrasion resistance convenient slipon design a secure singlepiece fabric upper gives perforated ergonomic chassis reduced material result supremely lightweight alpinestars automotive',
'td 8000k xenon hid kit high beam 0910 mercedes benz cl600 c216 h7 xenon hid lighting is only available on high end luxury cars you can convert your stock halogens to super bright too by just connecting a few plug and play connections then mounting the ballast in secure spot but with this mercedes cl600 low watt 8000k td hid high beam conversion kit experience supreme brightness expanded field of vision also our wattage systems are backed by full one year warrantyplease note will not work if cl600s headlights came equipped factory lights unlike cheaper market more consistently without fading out like coated bulbs dousually mercedes installations probably most common upgrades performed increase headlight cl600 producing certain temperatures technology that uses xenon gas charged bulb combination an electronic regulate current going through it the resulting light 35 be up 3 times brighter than traditional halogen bulbs kits reliably produce truer colored we offer conversion kit short for intensity discharge automotive',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
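The similarity scores in the usage snippet are cosine similarities, matching the Similarity Function listed under Model Description. As a rough illustration of what that score means and how it can rank candidate product offers against a query offer, here is a self-contained sketch in plain Python. The function names and the 3-dimensional toy vectors are assumptions for illustration; real embeddings from this model have 1024 dimensions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for model.encode(...) output.
toy_query = [1.0, 0.0, 1.0]
toy_candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # partially aligned
]

ranking = rank_candidates(toy_query, toy_candidates)
```

Because cosine similarity ignores vector length, the second candidate scores highest even though its magnitude differs from the query's.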
Training Details
Training Dataset
Unnamed Dataset
- Size: 269,761 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 13 tokens, mean: 68.94 tokens, max: 1130 tokens | min: 12 tokens, mean: 70.35 tokens, max: 1149 tokens |
- Samples:

| anchor | positive |
|---|---|
| tripp lite 25u 4post open frame rack cabinet square holes 1000lb capacity open frame rack tripp 25u prices cnet tripp lite otherelectronics | tripp lite 25u 4post open frame rack cabinet square holes 1000lb capacity open frame rack tripp 25u specs cnet null tripp lite otherelectronics |
| headlamp restoration kit philips 2000 bmw 323ci base coupe lights and lenses page 6 note removes yellowing and haze of plastic headlight lenses restoring likenew condition and finish professional results in under 30 minutes can be used on headlights taillights turn signals and reflective lens covers with uv coating technology one kit restores two headlights contains qty 1 pretreatment 1 cleanerpolish 1 shine restorerpreserver 3 sandpaper 600 1500 2000 grit 10 applicator polish cloths 1 pair of vinyl gloves philips automotive | headlamp restoration kit philips 1996 bmw 318i base convertible lights and lenses page 6 note removes yellowing and haze of plastic headlight lenses restoring likenew condition and finish professional results in under 30 minutes can be used on headlights taillights turn signals and reflective lens covers with uv coating technology one kit restores two headlights contains qty 1 pretreatment 1 cleanerpolish 1 shine restorerpreserver 3 sandpaper 600 1500 2000 grit 10 applicator polish cloths 1 pair of vinyl gloves philips automotive |
| hose clamp 132146 mm range 12 width spring type 1991 bmw 325i base coupe cooling system miscellaneous page 1 mubea automotive | hose clamp 132146 mm range 12 width spring type 1994 bmw 325i base convertible cooling system miscellaneous page 1 mubea automotive |

- Loss: CachedMultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
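To make the loss parameters concrete: multiple negatives ranking loss treats every other positive in the batch as a negative for a given anchor, multiplies the cosine similarities by the scale (20.0 here), and applies cross-entropy with the matching positive as the target. The cached variant computes the same objective in memory-bounded chunks. The sketch below is a plain-Python illustration of the objective only, under those assumptions; it is not the library implementation, it has no gradient caching, and the function names and toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(
    anchors: Sequence[Sequence[float]],
    positives: Sequence[Sequence[float]],
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy over anchors, where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as an in-batch negative."""
    if len(anchors) != len(positives) or not anchors:
        raise ValueError("anchors and positives must be non-empty and equally long")
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp over the row of scaled similarities.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Hypothetical 2-dimensional toy batch with two (anchor, positive) pairs.
toy_anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = multiple_negatives_ranking_loss(toy_anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = multiple_negatives_ranking_loss(toy_anchors, [[0.0, 1.0], [1.0, 0.0]])
```

When each anchor matches its own positive the loss is close to zero; when the positives are swapped, each anchor is most similar to an in-batch negative and the loss is roughly equal to the scale.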
Evaluation Dataset
Unnamed Dataset
- Size: 67,441 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 11 tokens, mean: 74.02 tokens, max: 693 tokens | min: 15 tokens, mean: 74.68 tokens, max: 812 tokens |
- Samples:

| anchor | positive |
|---|---|
| bulb dashboard instruments with black socket base 12v 12w 1995 bmw 318ti hatchback lights and lenses page 3 genuine bmw automotive | bulb dashboard instruments with black socket base 12v 12w 1999 bmw 323is coupe gauges miscellaneous page 1 osramsylvania automotive |
| canon pixma mp282 high capacity black compatible ink cartridge ink volumeremanufactured pg512 black 18ml 1 cartridge 18ml officeproducts | canon pixma mp282 high capacity black compatible ink cartridge cartridges inkrediblecouk 1 black ink cartridge 18ml officeproducts |
| oring for camshaft position sensor 17 x 3 mm 2001 bmw 325i base wagon camshafts timing chains page 1 note 17 x 3mm uro automotive | oring for crankshaft sensor 17 x 3 mm 2000 bmw 323ci base coupe sensors page 5 note 17 x 3mm uro automotive |

- Loss: CachedMultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- learning_rate: 1e-05
- num_train_epochs: 2
- warmup_ratio: 0.1
- fp16: True
- auto_find_batch_size: True
- batch_sampler: no_duplicates
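For readers who want to reproduce a similar setup, the non-default values above could be passed to SentenceTransformerTrainingArguments from Sentence Transformers 3.x roughly as sketched here. The dictionary name, the helper function, and the output directory are hypothetical; the model card does not include the original training script, so this is an assumption about how the values were supplied rather than a record of it.

```python
# Non-default hyperparameters as listed in this model card.
NON_DEFAULT_ARGS = {
    "eval_strategy": "steps",
    "learning_rate": 1e-5,
    "num_train_epochs": 2,
    "warmup_ratio": 0.1,
    "fp16": True,  # mixed precision; generally needs a CUDA-capable GPU
    "auto_find_batch_size": True,
    "batch_sampler": "no_duplicates",  # avoids duplicate texts within a batch
}


def build_training_args(output_dir: str = "output/finetuned-model"):
    """Build training arguments; output_dir is a hypothetical path.

    The import is deferred so that inspecting NON_DEFAULT_ARGS does not
    require sentence-transformers to be installed.
    """
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(output_dir=output_dir, **NON_DEFAULT_ARGS)
```

The no_duplicates sampler matters for this loss in particular: with in-batch negatives, a repeated text in the same batch would be treated as a negative for its own pair.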
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: True
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.2076 | 7000 | 0.012 | 0.0057 |
| 0.4152 | 14000 | 0.0044 | 0.0040 |
| 0.6228 | 21000 | 0.0038 | 0.0040 |
| 0.8303 | 28000 | 0.0033 | 0.0028 |
| 1.0379 | 35000 | 0.002 | 0.0025 |
| 1.2455 | 42000 | 0.0012 | 0.0022 |
| 1.4531 | 49000 | 0.0008 | 0.0021 |
| 1.6607 | 56000 | 0.0005 | 0.0021 |
| 1.8683 | 63000 | 0.0004 | 0.0020 |
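The Epoch column in the table can be cross-checked against the dataset size and batch size reported earlier. The arithmetic sketch below assumes a single training device, that the per-device batch size of 8 was kept (auto_find_batch_size could in principle have lowered it), and that the no_duplicates sampler did not change the batch count noticeably. The linear warmup-then-decay formula is the usual shape of a linear scheduler with warmup_ratio 0.1, written out by hand rather than taken from the training run.

```python
import math

TRAIN_SAMPLES = 269_761   # training set size from this card
PER_DEVICE_BATCH = 8      # per_device_train_batch_size
NUM_EPOCHS = 2            # num_train_epochs
WARMUP_RATIO = 0.1        # warmup_ratio
PEAK_LR = 1e-5            # learning_rate

# dataloader_drop_last is False, so the final partial batch counts as a step.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / PER_DEVICE_BATCH)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def epoch_at(step: int) -> float:
    """Fractional epoch reached after the given number of optimizer steps."""
    return step / steps_per_epoch


def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


# Compare with the first and last rows of the training log.
first_logged_epoch = round(epoch_at(7000), 4)
last_logged_epoch = round(epoch_at(63000), 4)
```

Under these assumptions the computed epochs agree with the logged values, and the first evaluation at step 7000 falls just after the warmup phase ends.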
Framework Versions
- Python: 3.10.13
- Sentence Transformers: 3.0.1
- Transformers: 4.44.0
- PyTorch: 2.2.1
- Accelerate: 0.33.0
- Datasets: 2.21.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
CachedMultipleNegativesRankingLoss
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}