This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
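The Pooling module above has only `pooling_mode_cls_token` enabled, so the sentence embedding is the vector of the first (CLS) token rather than an average over tokens. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function names and the toy 4-dimensional vectors are hypothetical (the real model produces 1024-dimensional vectors).

```python
from typing import List


def cls_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Return the first token's vector as the sentence embedding (CLS pooling)."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Average all token vectors; shown only for contrast, this mode is disabled here."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    count = len(token_embeddings)
    return [sum(column) / count for column in zip(*token_embeddings)]


# Hypothetical per-token outputs for a three-token input (toy dimension 4).
toy_tokens = [
    [0.1, 0.2, 0.3, 0.4],  # CLS token
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.8, 0.7, 0.6],
]

cls_embedding = cls_pool(toy_tokens)
mean_embedding = mean_pool(toy_tokens)
print(cls_embedding)
```

With CLS pooling the other token vectors do not enter the sentence embedding directly; they influence it only through the Transformer's attention layers.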
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'b260iunvhp000i 768386242246 20 30 where number of lamps is 1 linear fluorescent ballasts and by wattage x 75w 99w bulbscom universal electronic ballast 120v to 277v for 2 f96t12 universal brand b260iunvhp000i toolsandhomeimprovement',
'b260iunvhp000i 768386242246 10 50 where length is 10 under 18 bulbscom electronic t12 linear fluorescent ballasts universal electronic ballast 120v to 277v for 2 f96t12 universal brand b260iunvhp000i toolsandhomeimprovement',
'danze 24 double towel bar danze products at efaucetscom towel bars bathroom accessories danze 24 double towel bar parma collection solid brass construction easy to install mounting hardware included matching faucet collection d446612bn toolsandhomeimprovement',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
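To make the 3 x 3 similarity output concrete, here is a small sketch of a pairwise cosine-similarity matrix in plain Python. It assumes the model's similarity function is cosine similarity (the loss parameters in this card use `cos_sim`); the helper names and the toy 3-dimensional vectors are hypothetical stand-ins for real 1024-dimensional embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Compare every embedding with every other one, giving an N x N matrix."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]


# Toy stand-ins for three sentence embeddings.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],  # close to the first vector
    [0.0, 0.0, 1.0],  # orthogonal to the first two
]

toy_similarities = similarity_matrix(toy_embeddings)
for row in toy_similarities:
    print([round(value, 3) for value in row])
```

The diagonal holds each vector compared with itself, so it is 1 up to floating-point error, and the matrix is symmetric.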
Columns: anchor and positive
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| clever lever extra giga punch scallop circle 35 inches clever wholesale darice this clever lever extra giga punch produces a clearcut scallop circle the craft punch is ideal for embellishing scrapbooks greeting cards invitations programs and many more paper crafts the scalloped circle is 35 inches in size 1 craft punch per package lvxgcp65 officeproducts | clever lever extra giga punch scallop circle 35 inches clever wholesale darice this clever lever extra giga punch produces a clearcut scallop circle the craft punch is ideal for embellishing scrapbooks greeting cards invitations programs and many more paper crafts the scalloped circle is 35 inches in size 1 craft punch per package lvxgcp65 officeproducts |
| strut front right shocks springs page 1 2002 bmw 325i base sedan suspension genuine bmw 31312282460boe automotive | strut front right shocks springs page 1 2002 bmw 325i base sedan suspension note only for cars with sport suspension and m sport package sachs 31312282460m10 automotive |
| herrold 40 drawer chest in dark walnutmango wood 792977257388 arreton 46quote in washed white oakantique brass sale home lighting fixtures lamps more online symbolizing achievement and rank the shield shape of this six drawer chest bears both historical and design significance built with craftsmens detail from dark walnutstained mango wood and mahogany veneers chest features curved sides smooth uttermost 25738upc792977257388 toolsandhomeimprovement | herrold 40 drawer chest in dark walnutmango wood 792977257388 malthus 31quote in aged parchmentreclaimed mahogany sale home lighting fixtures lamps more online symbolizing achievement and rank the shield shape of this six drawer chest bears both historical and design significance built with craftsmens detail from dark walnutstained mango wood and mahogany veneers chest features curved sides smooth uttermost 25738upc792977257388 toolsandhomeimprovement |
CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
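As a rough illustration of what these two parameters do, the sketch below computes a multiple-negatives ranking loss in plain Python: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by `scale`, and a cross-entropy is taken with the anchor's own positive as the correct class. This is a simplified sketch and not the library code; the cached variant (Gao et al., 2021, cited in this card) changes how gradients are computed so that large batches fit in memory, not the loss value. The function names and toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(
    anchors: Sequence[Sequence[float]],
    positives: Sequence[Sequence[float]],
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp over the scaled similarities.
        peak = max(logits)
        log_partition = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_partition - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs (hypothetical 2-dimensional embeddings).
toy_anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_positives = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
shuffled_positives = [[0.0, 1.0], [1.0, 0.0]]  # each anchor matches the wrong positive

aligned_loss = multiple_negatives_ranking_loss(toy_anchors, aligned_positives)
shuffled_loss = multiple_negatives_ranking_loss(toy_anchors, shuffled_positives)
print(aligned_loss, shuffled_loss)
```

A larger `scale` sharpens the softmax, so small differences in cosine similarity translate into larger differences in the loss.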
Columns: anchor and positive
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| retro 70s furniture set armchairs chairs and vector image furniture images over 41 000 retro 70s furniture set armchairs chairs and sofas vector illustration eps 8 vector image 14149273 officeproducts | retro 70s furniture set armchairs chairs and vector image setting images over 12 million retro 70s furniture set armchairs chairs and sofas vector illustration eps 8 vector image 14149273 officeproducts |
| hp designjet 70 cartridges for ink jet printers quillcom ink volume 130 mlthis cartridge is not compatible with hp designjet t620 24in photo printer hp photosmart pro b9180 printer hp photosmart pro b8850 photo printer hp photosmart pro b8800 photo printerfaderesistant color provides superior results and brilliant truetolife images that last for generations 901680441 officeproducts | hp designjet z2100 44 in cartridges for ink jet printers quillcom ink volume 130 mlthis cartridge is not compatible with hp designjet t620 24in photo printer hp photosmart pro b9180 printer hp photosmart pro b8850 photo printer hp photosmart pro b8800 photo printerfaderesistant color provides superior results and brilliant truetolife images that last for generations 901680441 officeproducts |
| suspension strut assembly shocks springs page 1 1996 bmw 318i base convertible suspension note front left w sport suspension front left bilstein touring class 22172518int automotive | suspension strut assembly shocks springs page 1 1997 bmw 318is base coupe suspension note front left w sport suspension front left bilstein touring class 22172518int automotive |
CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
Non-default hyperparameters:
eval_strategy: steps
learning_rate: 1e-05
num_train_epochs: 2
warmup_ratio: 0.1
fp16: True
auto_find_batch_size: True
batch_sampler: no_duplicates
All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 1e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: True
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
eval_use_gather_object: False
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
| Epoch | Step | Training Loss | loss |
|---|---|---|---|
| 0.1990 | 7000 | 0.0076 | 0.0027 |
| 0.3981 | 14000 | 0.0022 | 0.0019 |
| 0.5971 | 21000 | 0.0016 | 0.0013 |
| 0.7961 | 28000 | 0.0013 | 0.0011 |
| 0.9951 | 35000 | 0.0012 | 0.0008 |
| 1.1942 | 42000 | 0.0007 | 0.0007 |
| 1.3932 | 49000 | 0.0004 | 0.0009 |
| 1.5922 | 56000 | 0.0004 | 0.0007 |
| 1.7912 | 63000 | 0.0003 | 0.0006 |
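The hyperparameters listed earlier combine `learning_rate: 1e-05`, `lr_scheduler_type: linear`, and `warmup_ratio: 0.1`. The sketch below shows one common reading of that combination: the learning rate rises linearly from zero during the first 10% of steps, then decays linearly back to zero. It is an illustrative approximation, not the trainer's code, and `TOTAL_STEPS` is a hypothetical value since the card does not state the total number of training steps.

```python
import math


def linear_schedule_lr(
    step: int,
    total_steps: int,
    base_lr: float = 1e-5,
    warmup_ratio: float = 0.1,
) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down from base_lr to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # hypothetical; not taken from this card

schedule = {step: linear_schedule_lr(step, TOTAL_STEPS) for step in (0, 50, 100, 550, 1000)}
for step, lr in schedule.items():
    print(step, lr)
```

Under this reading the peak learning rate is reached exactly when warmup ends, and it is halved at the midpoint of the decay phase.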
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Base model: Alibaba-NLP/gte-large-en-v1.5