This is a sentence-transformers model finetuned from google/embeddinggemma-300m on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(4): Normalize()
)
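The module stack above reads as a pipeline: mean-pool the token embeddings, pass them through two bias-free linear layers (768 to 3072 to 768) with identity activation, then L2-normalize. The following is a minimal pure-Python sketch of the pooling and normalization steps only, using toy 4-dimensional vectors rather than the real 768-dimensional ones. The helper names `mean_pool` and `l2_normalize` are illustrative and not part of the library, and the two dense layers are omitted.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 4.0],
    [9.0, 9.0, 9.0, 9.0],  # padding, excluded by the mask
]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
print(pooled)
print(embedding)
```

Because the last module normalizes every embedding to unit length, a dot product between two embeddings equals their cosine similarity.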
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("kangsyahrul/xls-maya-retrieval-embeddinggemma-v1")
# Run inference
queries = [
"Berapa kuota untuk negara transit pada semua paket Umroh Plus Gift?",
]
documents = [
'Paket Tambahan: Umrah Plus Booster dan Gift\nPaket tambahan dan fitur gift untuk Paket Umrah Plus XL\n### Paket Umrah Plus Booster\n\n| Package | Kuota Arab Saudi | Kuota Arab Saudi + Negara Transit | Price | Masa aktif |\n|---------|------------------|-----------------------------------|-------|------------|\n| Internet Umroh Plus + Booster 1GB (di Arab Saudi) | 1GB | - | Rp 35.000 | Mengikuti masa aktif paket utama (Paket Umroh Plus) |\n| Internet Umroh Plus + Booster 1GB (di Arab Saudi & Negara Transit) | - | 1GB | Rp60.000 | Mengikuti masa aktif paket utama (Paket Umroh Plus) |\n\n### Umroh Plus Gift\n\n| Paket | Quota Arab Saudi | Quota Arab Saudi + Negara Transit | Price | Masa Aktif (Hari) |\n|-------|------------------|-----------------------------------|-------|--------------------| \n| Internet Umrah Plus 6GB | 5GB | 1GB | Rp245.000 | 10 |\n| Internet Umrah Plus 10GB | 9GB | 1GB | Rp325.000 | 15 |\n| Internet Umrah Plus 10GB | 9GB | 1GB | Rp350.000 | 20 |',
'Syarat dan Ketentuan Paket Edukasi\nKetentuan lengkap untuk paket Edukasi XL termasuk daftar lengkap website dan aplikasi yang didukung\n### Syarat & Ketentuan Paket Edukasi\n\n1. Berlaku untuk pelanggan prabayar XL.\n2. Kuota paket dapat digunakan di semua jaringan.\n3. Jika kuota paket sudah habis, maka pelanggan akan dikenakan tarif dasar internet.\n4. Tidak berlaku perpanjangan otomatis jika masa berlaku paket sudah selesai.\n5. Daftar aplikasi yang bisa diakses dengan kuota Paket Edukasi:\n • **Website Pemerintah:** Rumah Belajar Kemendikbud, Spada Indonesia Kemendikbud\n • **Aplikasi belajar online:** Ruangguru, Zenius, Sekolahmu, Udemy, Skill Academy\n • **Portal Universitas:** UI, ITB, UNPAD, UNDIP, Universitas Atma Jaya, Universitas Hasannudin, UGM, Universitas Negeri Semarang, ITS, Universitas Udayana, Universitas Brawijaya, Universitas Airlangga, Universitas Tanjung Pura, Universitas Sebelas Maret, Universitas Nusa Cendana Kupang, Institut Pertanian Bogor, Universitas Muhamadyah Surakarta, Universitas Islam Negeri Sunan Gunung Djati Bandung, UNIKOM, UNIKA Soegijapranata Semarang, LSPR Communication & Business Institute, Universitas Islam Bandung, Universitas Islam Sultan Agung Semarang, Universitas Andalas, Politeknik Negeri Bandung, Institut Seni Yogyakarta, Universitas Jenderal Soedirman, Universitas Teknologi Yogyakarta, IIK Bhakti Wiyata, STEKOM, Universitas Aisyiyah Yogyakarta, Universitas Politenik APP Jakarta, STMIK Indonesia Padang, Universitas Pamulang, Universitas Terbuka, Politeknik Negeri Batam.',
'Viu Premium - Syarat Ketentuan dan Detail Paket Lengkap\nInformasi lengkap paket Viu Premium XL dengan tiga pilihan durasi berlangganan dan kuota data yang disertakan\n# Viu Premium\n\n## Syarat dan Ketentuan\n\n• Akses premium untuk menonton drama Korea, serial original VIU, Anime, Asian Series dalam format HD dan dilengkapi subtitles.\n\n## Detail Paket\n\n| Paket | Harga |\n|-------|-------|\n| VIU 30 Hari + 5GB | Rp. 54,000 |\n| VIU 7 Hari + 3GB | Rp. 7,500 |\n| VIU 1 Hari + 500MB | Rp. 1,500 |\n\n**Note:** Harga yang tertera merupakan harga normal dan dapat berubah sewaktu-waktu, untuk harga promo silakan cek pada aplikasi myXL.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.6162, 0.0363, 0.0146]])
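For retrieval, the similarity scores are typically turned into a ranking. The sketch below hard-codes the scores printed in the example above so that it runs without downloading the model; `rank_documents` is a hypothetical helper, not a library function, and the short labels are abbreviations of the three document titles.

```python
def rank_documents(scores):
    """Return document indices sorted from most to least similar."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Scores copied from the example output (one query, three documents).
scores = [0.6162, 0.0363, 0.0146]
labels = ["Umrah Plus Booster dan Gift", "Paket Edukasi", "Viu Premium"]

ranking = rank_documents(scores)
for rank, idx in enumerate(ranking, start=1):
    print(f"{rank}. {labels[idx]} (score={scores[idx]:.4f})")
```

In this example the Umroh Plus Gift document ranks first by a wide margin, which matches the query about transit-country quota.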
Columns: query and passage

| | query | passage |
|---|---|---|
| type | string | string |

| query | passage |
|---|---|
| Kuota Utama bisa dipakai di jaringan apa saja? | Syarat dan Ketentuan Lengkap Paket Xtra Kuota Utama |
| Bisakah nonton drama Korea dengan Vidio Platinum? | Vidio Platinum - Syarat Ketentuan dan Akses Premium |
| Di mana saya bisa membeli kartu perdana XL? | Cara Pembelian Kartu Perdana SIM Fisik dan eSIM |

CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"mini_batch_size": 8,
"gather_across_devices": false
}
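The parameters above describe an in-batch-negatives objective: cosine similarities between every query and every passage in the batch are multiplied by `scale` (20.0), and a cross-entropy loss treats the passage paired with each query as the correct class. The sketch below illustrates only that objective in pure Python with toy 2-dimensional vectors. It does not reproduce the gradient caching that gives the cached variant its name (Gao et al., 2021), which, as I understand it, evaluates the same loss in mini-batches (here of size 8) to reduce memory use. The function names are illustrative.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_negatives_loss(queries, passages, scale=20.0):
    """Mean cross-entropy where passage i is the positive for query i."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * cos_sim(q, p) for p in passages]
        # Numerically stable log-sum-exp over all passages in the batch.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)

queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]  # each query matches its own passage
swapped = [[0.0, 1.0], [1.0, 0.0]]  # each query matches the other passage

loss_aligned = in_batch_negatives_loss(queries, aligned)
loss_swapped = in_batch_negatives_loss(queries, swapped)
print(loss_aligned, loss_swapped)
```

The loss is close to zero when each query is most similar to its own passage and close to `scale` when it is most similar to another passage in the batch, which is why larger batches supply more (and harder) negatives.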
Columns: query and passage

| | query | passage |
|---|---|---|
| type | string | string |

| query | passage |
|---|---|
| Berapa lama masa aktif Bonus Kuota Aplikasi? | Ketentuan Bonus Kuota Aplikasi dan Kuota Lokal |
| Setelah pilih menu Beli Paket, ke mana saya harus pergi selanjutnya? | Cara Membeli Voucher Garena Shell di Aplikasi myXL |
| Berapa persen biaya pinjam yang dikenakan untuk Pulsa Darurat? | Syarat dan Ketentuan Lengkap Pulsa Darurat AXIS |

CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"mini_batch_size": 8,
"gather_across_devices": false
}
Non-default hyperparameters:
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- warmup_ratio: 0.1
- fp16: True
- prompts: {'query': 'task: search result | query: ', 'passage': 'title: none | text: '}
- batch_sampler: no_duplicates

All hyperparameters:
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: {'query': 'task: search result | query: ', 'passage': 'title: none | text: '}
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.2632 | 20 | 0.6088 | - |
| 0.5263 | 40 | 0.4443 | - |
| 0.6579 | 50 | - | 0.4595 |
| 0.7895 | 60 | 0.4892 | - |
| 1.0526 | 80 | 0.3892 | - |
| 1.3158 | 100 | 0.3094 | 0.4127 |
| 1.5789 | 120 | 0.2771 | - |
| 1.8421 | 140 | 0.2471 | - |
| 1.9737 | 150 | - | 0.3795 |
| 2.1053 | 160 | 0.2153 | - |
| 2.3684 | 180 | 0.172 | - |
| 2.6316 | 200 | 0.1712 | 0.3683 |
| 2.8947 | 220 | 0.1693 | - |
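
The hyperparameters specify a linear learning-rate schedule with `learning_rate` 2e-05 and `warmup_ratio` 0.1. The sketch below shows one plausible reading of that schedule: a linear ramp from zero during warmup, then a linear decay to zero. The total of 228 steps is an assumption inferred from the log (step 20 at epoch 0.2632 suggests about 76 steps per epoch, times 3 epochs), and `linear_warmup_lr` is an illustrative function; the trainer's exact rounding may differ.

```python
import math

def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 228  # assumption: about 76 steps per epoch for 3 epochs

for step in (0, 20, 23, 100, 228):
    print(step, linear_warmup_lr(step, TOTAL_STEPS))
```

Under these assumptions the peak learning rate is reached around step 23, so the first logged training loss (step 20) was still recorded during warmup.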
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}