This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
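The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It assumes the token vectors and attention mask are already available as plain lists; the helper names `mean_pool` and `l2_normalize` are hypothetical and are not part of the library, which performs these steps on tensors internally.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
sentence_vector = l2_normalize(pooled)
print(pooled, sentence_vector)
```

In the real model the same two steps produce one unit-length 768-dimensional vector per input text.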
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'paint sealant sonax profiline polymer net shield 75 ml aerosol can 1994 bmw 318is base coupe miscellaneous page 24 note innovative surface protection based on hybrid polymers protects the paintwork by means of a resistant network made from organic and inorganic components can be applied quickly easily intensively freshens up paint color produces silky smooth with an outstanding drip off effect one 75 ml should complete average size car produced by sonax identifiers is 223000m941 category of automotive',
'paint sealant sonax profiline polymer net shield 75 ml aerosol can 1991 bmw 325i base convertible miscellaneous page 23 note innovative surface protection based on hybrid polymers protects the paintwork by means of a resistant network made from organic and inorganic components can be applied quickly easily intensively freshens up paint color produces silky smooth with an outstanding drip off effect one 75 ml should complete average size car produced by sonax identifiers is 223000m941 category of automotive',
'honeywell accessories for terminal cod99exmb12 honeywell cod871238012 honeywell dolphin 99ex mobile base vehicle kit charging cradle rs232 universal mounting bracket and 12v cigarette lighter power adapter produced by honeywell metrologic identifiers is 99exmb12 category of computersandaccessories',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
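As an illustration of what the similarity matrix contains, here is a small pure-Python sketch of pairwise cosine similarity. It assumes the model's similarity function is cosine similarity, which matches the `cos_sim` setting reported for the loss; the function name `cosine_similarity_matrix` and the toy vectors are made up for this example.

```python
import math

def cosine_similarity_matrix(vectors):
    """Return the matrix of cosine similarities between all pairs of vectors."""
    norms = [math.sqrt(sum(v * v for v in vec)) for vec in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, w)) / (norms[i] * norms[j])
            for j, w in enumerate(vectors)
        ]
        for i, u in enumerate(vectors)
    ]

# Toy 2-dimensional "embeddings" standing in for the 768-dimensional ones.
toy_embeddings = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
scores = cosine_similarity_matrix(toy_embeddings)
for row in scores:
    print(row)
```

Because the model ends with a Normalize module, its embeddings already have unit length, so for them the cosine similarity reduces to a plain dot product.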
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| honeywell hand held products dolphin 99509951 series mobile computer usb cable 6 ft 18m 80000355e usb cable 6 ft 18m identifiers is 80000355e category of computersandaccessories | hand held usb cable 6 ft hand ft 80000355e scanner accessories cdwcom hand held products is the leading provider of imagebased data collection solutions for mobile wireless and transaction processing applications to end users throughout world by investing in hhp products its customers are able reduce costs improve service position their companies future growth identifiers is 26121604 category of computersandaccessories |
| intake boot air mass sensor to throttle housing 1995 bmw 318i base convertible intake system page 2 note from 0994 produced by oem identifiers is 13711247829m58 category of automotive | intake boot air mass sensor to throttle housing 1995 bmw 318i base convertible intake system page 2 produced by crp identifiers is 13711247829int category of automotive |
| blue sky panorama with transparent clouds vector image sky images over 150 000 vector blue sky panorama with transparent clouds vector background image identifiers is 15266707 category of officeproducts | blue sky panorama with transparent clouds vector image images within landscapes nature over 55 000 vector blue sky panorama with transparent clouds vector background image identifiers is 15266707 category of officeproducts |
CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
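To make the two parameters concrete, the following is a minimal sketch of the in-batch negatives objective that this loss family optimizes: for each anchor, the paired positive should score higher than every other positive in the batch, using cosine similarity multiplied by `scale`. It is an illustration only, not the library's implementation; the cached variant described in the cited Gao et al. paper targets the same objective while processing the batch in smaller chunks to save memory, which this sketch does not attempt. The function names and toy vectors are hypothetical.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the wrong positive
print(multiple_negatives_ranking_loss(anchors, aligned))
print(multiple_negatives_ranking_loss(anchors, swapped))
```

A larger `scale` sharpens the softmax over candidates, so well-separated pairs drive the loss close to zero while mismatched pairs are penalized heavily.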
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| heater hose inlet from cylinder head to water valve 1997 bmw 318i base sedan heater system page 3 produced by genuine bmw identifiers is 64211394295boe category of automotive | heater hose inlet from cylinder head to water valve 1996 bmw 318i base convertible heater system page 3 produced by genuine bmw identifiers is 64211394295boe category of automotive |
| harris harris group inc group 1 full quote netdaniacom produced by source nasdaq identifiers is isinus4138331040 category of toolsandhomeimprovement | harris harris group inc 1 statistics netdaniacom group produced by source nasdaq identifiers is isinus4138331040 category of toolsandhomeimprovement |
| swiffer dusters with extendable handledusters plastic handle extends to 3 ft 1 per kit handledusters ft kitpag82074 buy online at janeice products identifiers is pag82074 category of toolsandhomeimprovement key specifications are weight per case std pkg quantity package one handle and three dusters description includes item cube 008276 upc code 037000447504 pack 00037000820741 length 092 width 022 height 042 0476 | 6 pack value bundle pag82074 dusters plastic handle extends to 3 ft 1 dusters per kitus feather page 5 the janitorial marketus now its easier than ever to get those hardtoreach places pivoting head can be adjusted and locked into place for cleaning angled surfaces such as ceiling fans cabinet corners baseboards refill dusters sold separately one handle three per box bristle material fiber color white plastic greenus produced by pag82074us identifiers is pag82074 category of toolsandhomeimprovement |
CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
Non-default hyperparameters:

- eval_strategy: steps
- learning_rate: 1e-05
- num_train_epochs: 2
- warmup_ratio: 0.1
- fp16: True
- auto_find_batch_size: True
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: True
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | loss |
|---|---|---|---|
| 0.1990 | 7000 | 0.0083 | 0.0029 |
| 0.3981 | 14000 | 0.0026 | 0.0019 |
| 0.5971 | 21000 | 0.0015 | 0.0014 |
| 0.7962 | 28000 | 0.0013 | 0.0011 |
| 0.9952 | 35000 | 0.0013 | 0.0010 |
| 1.1943 | 42000 | 0.0008 | 0.0010 |
| 1.3933 | 49000 | 0.0005 | 0.0009 |
| 1.5924 | 56000 | 0.0003 | 0.0009 |
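The hyperparameters report a linear scheduler with `learning_rate: 1e-05` and `warmup_ratio: 0.1`. The sketch below shows roughly what learning rate such a schedule would give at a few of the logged steps. It makes several assumptions: the total step count is only estimated from the log (step 35000 at epoch 0.9952, two epochs), warmup steps are taken as the ceiling of `warmup_ratio` times the total, and the function `linear_schedule_with_warmup` is a hypothetical stand-in, not the trainer's own scheduler.

```python
import math

def linear_schedule_with_warmup(step, total_steps, base_lr=1e-05, warmup_ratio=0.1):
    """Linear ramp from 0 to base_lr during warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Rough estimate only: the log shows step 35000 at epoch 0.9952.
steps_per_epoch = round(35000 / 0.9952)
total_steps = 2 * steps_per_epoch  # num_train_epochs: 2

for step in (7000, 35000, 56000):
    lr = linear_schedule_with_warmup(step, total_steps)
    print(f"step {step}: lr ~ {lr:.2e}")
```

Under these assumptions the first logged step falls near the end of warmup, and all later rows fall in the decay phase.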
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}