Paper: Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup (arXiv:2101.06983)
This is a sentence-transformers model finetuned from PaDaS-Lab/xlm-roberta-base-msmarco. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
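The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the 768-dimensional sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions whose mask value is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

Padding tokens are excluded so that the same sentence yields the same vector regardless of how much it is padded inside a batch.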
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'Wo gibt es die besten Casino Gewinnquoten?',
'Wer die durchschnittlich besten Chancen im Spielcasino sucht, der sollte sich an Spielautomaten wagen. Hier schwanken die Auszahlungsquoten zwar auch, aber fangen bei besseren Spielen bei 90 Prozent an und können auch bis fast 99 Prozent gehen. Beim Roulette hingegen können die Quoten nie über 49 Prozent gehen.',
'Im Großen und Ganzen gibt es viele Möglichkeiten als Casino-Spiele mit den besten Gewinnchancen. Bei den Spielautomaten ist es am besten, ein Spiel zu wählen, das eine Rendite von über 97 % bietet. Vegas Plus stellt eine große Anzahl an solchen Spielen an. So kann man sagen, dass es an Spielautomaten echtes Geld gewinnen möglich ist. Darüber wurde es sich detailliert im Kapitel “Spiele und Software” erwähnt.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6503, 0.5791],
# [0.6503, 1.0000, 0.7740],
# [0.5791, 0.7740, 1.0000]])
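The similarity matrix above has ones on the diagonal, which is consistent with cosine similarity (the library default, assumed here). As a rough illustration of what each entry means, the sketch below computes pairwise cosine similarities in plain Python; the helper names and the toy vectors are hypothetical and unrelated to the model's real embeddings.

```python
import math


def cos_sim(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cos_sim(u, v) for v in vectors] for u in vectors]


# Toy 2-dimensional "embeddings" standing in for the 768-dimensional ones.
toy = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
for row in similarity_matrix(toy):
    print([round(score, 4) for score in row])
```

The matrix is symmetric, and orthogonal vectors score 0, so for semantic search it is enough to compare one query row against the document columns and sort by score.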
Columns: sentence_0, sentence_1, sentence_2, sentence_3, sentence_4, and sentence_5
| | sentence_0 | sentence_1 | sentence_2 | sentence_3 | sentence_4 | sentence_5 |
|---|---|---|---|---|---|---|
| type | string | string | string | string | string | string |
| details | | | | | | |
Samples:
| sentence_0 | sentence_1 | sentence_2 | sentence_3 | sentence_4 | sentence_5 |
|---|---|---|---|---|---|
| Czy mus nie pozostawia tłustej warstwy na skórze? | Nasz mus do ciała - Len-Konopie - skomponowany jest w oparciu o oleje i masła roślinne - otula on skórę natłuszczającą warstwą ochronną, która potrzebuje czasu, aby się wchłonąć. Aplikowanie musu na nieco wilgotną (np. po kąpieli/prysznicu) skórę sprawi, że mus wchłonie się szybciej, pozostawiając skórę nawilżoną i miękką w dotyku. | Nie, ma niską zawartość lipidów i dlatego nie pozostawia tłustej warstwy. | Tak, oczywiście! Musy wchłaniają się bardzo szybko, pozostawiając skórę nawilżoną i odżywioną. W celu łatwiejszej aplikacji, należy nakładać niewielką (są bardzo wydajne!), uprzednio rozprowadzoną w palcach ilość produktu na wilgotne ciało - najlepiej tuż po kąpieli czy prysznicu :) | Tak! Dzięki zawartości oleju z konopi, który zawiera ok. 75% niezbędnych nienasyconych kwasów tłuszczowych mus wyróżnia się właściwościami kojącymi i łagodzącymi podrażnienia dla skór suchych, szorstkich czy atopowych właśnie :) Z uwagi na zawartość olejków eterycznych w składzie, tym z Państwa, którzy borykają się z atopią, zalecałybyśmy wcześniejsze skonsultowanie składu z lekarzem dermatologiem. | Produkt ma lekką kremową strukturę, natychmiast bez osadów wchłania się w głębokie warstwy skóry właściwej, nie pozostawia tłustego połysku i lepkiego filmu. |
| É precisa de se cadastrar e-mail em Eletro Angeloni? | Sim, quando fazam comprars em Eletro Angeloni pode se registrar na página de venda.Eletro Angeloni queria oferecer aos clientes uma melhor experiência de compra e serviços, lançou benefícios de associação especialmente. Para obter benefícios específicos para membros, você pode se registrar como um membro Eletro Angeloni através do seguinte endereço de e-mail. | Sim, é preciso se cadastrar por e-mail. Torne-se um membro de Britania, você não perderá a chance de obter Cupom de Desconto Britania. Se você quiser economizar dinheiro ou aprender as últimas notícias da marca, basta clicar na página Britania para se registrar. | Sim, é preciso se cadastrar por e-mail. Depois de se registrar em loja.colormaq.com.br, você pode obter as informações mais recentes da marca em tempo hábil. E o Colormaq para membros de e-mail registrados ocasionalmente emitirá benefícios, permitindo que você desfrute de ótimos descontos. | Sim, é preciso se cadastrar por e-mail. Torne-se um membro de Avon, você não perderá a chance de obter Código de Desconto Avon. Se você quiser economizar dinheiro ou aprender as últimas notícias da marca, basta clicar na página Avon para se registrar. | Preciso, é essencial se cadastrar por e-mail quando fazam comprars em Prego E Martelo. Os membros Prego E Martelo podem desfrutar de serviços melhores e mais abrangentes e o atendimento de alta qualidade. Você pode se registrar como membro em pregoemartelo.com.br, permitindo que você aproveite o máximo de benefícios. |
| Does anyone at Squlpt speak Spanish? | Yes, we have Spanish-speaking team members who will be more than happy to communicate with you in Spanish or English. | No. Attendants at the most famous tourist attractions, such as Sugarloaf Mountain and Christ the Redeemer, generally speak at least basic English and Spanish (due to the influx of tourists from other Latin American countries). However, most Brazilians don’t speak English, although you may be able to get by with Spanish, as the two languages are similar. | Yes, we have opticians and eye doctors who speak Spanish. Please contact a store for an appointment so we can make sure a Spanish-speaking staff member is available. | No, you don’t need to bring anyone along to speak with your attorney in Spanish. Most of our attorneys and staff are at least bilingual. We actually have people who speak 6+ different languages in our office. | We don't currently have anyone here speaking Spanish. The window can be placed anywhere on any of the sides as it is simply part of a 30 inch panel. This shed cannot be ordered with no skylight. The skylight simply comes with it packaged as sold. |
CachedMultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 32,
    "gather_across_devices": false
}
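To make these parameters concrete, here is a hedged pure-Python sketch of the multiple-negatives ranking objective: each anchor is scored against every candidate with cosine similarity, the scores are multiplied by `scale` (20.0 here), and cross-entropy rewards the anchor's own positive. It is an illustration under assumptions, not the library code: the name `mnr_loss` is hypothetical, and the gradient caching that gives the "Cached" variant its name (embedding the batch in chunks of `mini_batch_size`, following the paper cited in this card) is omitted because it changes memory use rather than the loss value.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, candidates, scale=20.0):
    """Mean cross-entropy where candidates[i] is the positive for anchors[i].

    Any candidates beyond len(anchors) act as additional (hard) negatives.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])  # positives match their anchors
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # positives point the wrong way
print(aligned < swapped)  # True
```

With six columns per sample, sentence_0 would plausibly play the anchor role, sentence_1 the positive, and sentence_2 through sentence_5 the extra negatives appended to the candidate list; that mapping is an assumption based on the column order.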
Non-default hyperparameters:
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
num_train_epochs: 1
fp16: True
multi_dataset_batch_sampler: round_robin

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0502 | 500 | 1.8861 |
| 0.1003 | 1000 | 0.895 |
| 0.1505 | 1500 | 0.8331 |
| 0.2007 | 2000 | 0.7999 |
| 0.2508 | 2500 | 0.7721 |
| 0.3010 | 3000 | 0.7555 |
| 0.3512 | 3500 | 0.7459 |
| 0.4013 | 4000 | 0.7334 |
| 0.4515 | 4500 | 0.7175 |
| 0.5017 | 5000 | 0.7186 |
| 0.5518 | 5500 | 0.7109 |
| 0.6020 | 6000 | 0.7 |
| 0.6522 | 6500 | 0.6953 |
| 0.7023 | 7000 | 0.6951 |
| 0.7525 | 7500 | 0.6878 |
| 0.8026 | 8000 | 0.6847 |
| 0.8528 | 8500 | 0.6793 |
| 0.9030 | 9000 | 0.684 |
| 0.9531 | 9500 | 0.6803 |
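The log can be cross-checked against the hyperparameters with a little arithmetic. The sketch below estimates the number of optimizer steps in the single epoch from the last logged row and evaluates a linear learning-rate decay with no warmup (`lr_scheduler_type: linear`, `warmup_steps: 0`, `learning_rate: 5e-05`). The function names are hypothetical, the step count is only approximate because the logged epoch fraction is rounded, and the sample estimate assumes a single device with `gradient_accumulation_steps: 1`.

```python
def estimate_steps_per_epoch(step, epoch_fraction):
    """Infer total steps in one epoch from a logged (step, epoch) pair."""
    return round(step / epoch_fraction)


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = estimate_steps_per_epoch(9500, 0.9531)  # last row of the table
approx_samples = total_steps * 128                    # per_device_train_batch_size
print(total_steps, approx_samples)
print(linear_lr(5000, total_steps))  # roughly half of the initial rate
```

Under these assumptions the run covers just under ten thousand steps, so the flattening of the training loss near the end of the table coincides with the learning rate approaching zero.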
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Base model: FacebookAI/xlm-roberta-base