This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
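The Pooling and Normalize stages listed above can be illustrated with a small pure-Python sketch. It is not the library's implementation: `mean_pool` and `l2_normalize` are hypothetical helper names, and the toy vectors have 4 dimensions instead of 768. It only shows the idea of averaging the non-padding token vectors and then scaling the result to unit length.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy 4-dimensional token vectors (hypothetical values; the real model uses 768).
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # the last token is padding and must not affect the mean

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final stage normalizes every embedding, a plain dot product between two outputs already equals their cosine similarity.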
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'dear sir mam i am trying to register udyam with my pan but error showing udyam registration has already done through this pan and i have not registered earlier so please guide me aadhaar <uam_no> pan <pan_no> mobile <phone_no> issue clarification on existing udyam registration context the user is requesting clarification as the udyam registration portal indicates that registration has already been done through the pan although the user states that no registration was made. details - aadhar no <NUM> pan no gnips2021g mobile no <NUM>',
'UAM/Udyam Registration/Certificate related issues. Existing / Unauthorized UDYAM Registration Against PAN. this category refers to grievances where an entrepreneur discovers that a udyam registration already exists against their pan either due to duplicate registration or because someone else created the registration without their authorization. since pan is used as a key identifier for enterprise registration the presence of an existing registration can prevent the legitimate owner from creating a new one or managing the enterprise details. grievances under this category usually include complaints about duplicate registrations created for the same enterprise or multiple registrations linked to the same pan. some business owners report that when they attempt to register their enterprise the system indicates that a registration already exists even though they are unaware of creating one earlier. in other cases entrepreneurs may find that an employee consultant former partner or third party registered the enterprise using the business pan without informing the owner. there may also be situations where an earlier registration contains incorrect enterprise information leading to confusion about the valid record. such grievances are generally raised by business proprietors partners of partnership firms directors of companies or authorized representatives responsible for registering the enterprise under msme. these complaints may also be submitted by compliance managers accountants or consultants who are attempting to complete the msme registration process for the business but encounter an existing record linked to the pan. the purpose of raising this grievance is to identify the existing registration verify its legitimacy and resolve conflicts arising from duplicate or unauthorized registrations associated with the enterprise s pan.',
'Marketing and Skilling. National SC ST HUB. national sc-st hub nssh is a central sector scheme launched in <NUM> by the ministry of micro small and medium enterprises and implemented by the national small industries corporation to empower scheduled caste and scheduled tribe entrepreneurs and strengthen their participation in the msme ecosystem. the scheme focuses on capacity building market access financial facilitation and handholding support while also operationalizing the mandatory <NUM> procurement target for sc st owned mses under the public procurement policy for mses <NUM> . through a network of national sc-st hub offices across the country the hub assists eligible sc st entrepreneurs holding at least <NUM> ownership and control in activities such as udyam and gem registration participation in government tenders access to credit and skill upgradation. financial support is provided in the form of reimbursements for testing and certification charges from recognized laboratories bank loan processing and bank guarantee fees membership fees of export promotion councils onboarding costs for e-commerce and government procurement platforms and fees for short-term skill and management training programs at reputed institutions. by reducing entry barriers and providing structured handholding nssh aims to enhance competitiveness ensure inclusive growth and enable sc st entrepreneurs to scale up operations and integrate with formal supply chains. '
'examples of grievances reported under the scheme include rejection of reimbursement claims where testing or certification expenses exceed the prescribed financial ceiling despite compliance with quality standards blockage of financial assistance due to delays or discrepancies in caste certificate verification even when enterprises are otherwise registered as sc st-owned instances where sc st msmes fail to secure tenders despite the mandated procurement quota because of non-compliance by procuring cpses partial reimbursement of approved training or capacity-building expenses owing to scheme-specific limits leading to out-of-pocket costs for entrepreneurs and gaps in timely support from local nssh offices particularly in remote or north-eastern regions affecting onboarding to procurement portals and access to scheme benefits.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7751, 0.1988],
#         [0.7751, 1.0000, 0.2777],
#         [0.1988, 0.2777, 1.0000]])
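In the example above, sentence 0 is a grievance and sentences 1 and 2 are category descriptions, so the similarity matrix can be read as a ranking of categories for that grievance. The following is a minimal sketch of that reading; `best_match` is a hypothetical helper, not part of sentence-transformers, and the matrix is hard-coded from the printed output so the snippet runs without downloading a model.

```python
def best_match(similarity_row, query_index):
    """Return (index, score) of the highest-scoring entry other than the query itself."""
    candidates = [
        (score, idx)
        for idx, score in enumerate(similarity_row)
        if idx != query_index
    ]
    score, idx = max(candidates)
    return idx, score

# Similarity matrix copied from the example output above.
similarities = [
    [1.0000, 0.7751, 0.1988],
    [0.7751, 1.0000, 0.2777],
    [0.1988, 0.2777, 1.0000],
]

# Sentence 0 is the grievance; sentences 1 and 2 are category descriptions.
idx, score = best_match(similarities[0], query_index=0)
print(idx, score)  # 1 0.7751
```

The grievance about an existing registration against a PAN scores highest against the matching Udyam category (index 1) and much lower against the unrelated National SC ST HUB description (index 2).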
EmbeddingSimilarityEvaluator
| Metric | Value |
|---|---|
| pearson_cosine | nan |
| spearman_cosine | nan |
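The card does not say why both correlations are nan. One common cause, offered here only as an assumption about this run, is that the gold similarity scores passed to the evaluator were constant: a correlation coefficient is undefined when either input has zero variance. The sketch below uses a hand-written `pearson` function and hypothetical score lists to show that effect.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns nan when either input has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined without variation
    return cov / math.sqrt(var_x * var_y)

predicted = [0.78, 0.20, 0.28]   # hypothetical cosine similarities
constant_gold = [1.0, 1.0, 1.0]  # hypothetical labels: every pair marked similar
varied_gold = [1.0, 0.0, 0.0]    # hypothetical labels with both outcomes present

print(pearson(predicted, constant_gold))  # nan
print(pearson(predicted, varied_gold))
```

If that is what happened here, an evaluation set that mixes matching and non-matching pairs would be needed before these metrics carry any information.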
Columns: sentence_0 and sentence_1
| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |
| sentence_0 | sentence_1 |
|---|---|
| sub - request for clarification on msme dev act . dear sir . your august office is kindly requested to define the specific word "tender" as referred in the public procurement policy for micro and small enterprises mse's order gazetted notification no. d.l.- dtd. . . sub sec heading as price quotation in tenders and further word "rate contract" as referred in sub sec developing micro and small enterprises vendors before substitution dt. . . as extracted. "7. developing micro and small enterprise vendors. - the central ministries or departments or public sector undertakings shall take necessary steps to develop appropriate vendors by organizing vendor development programmes or buyer-seller meets and entering into rate contract with micro and small enterprises for a specified period in respect of periodic requirements also. ... | Policy and Schemes. Related to Public Procurement by PSUs. this category pertains to grievances related to public sector undertakings psus violating or diluting mandatory msme procurement norms under the public procurement policy for msmes and related guidelines including gem . the scope encompasses cases where psus fail to meet prescribed msme procurement quotas deny msmes their l1 price-matching rights bypass eligible msme vendors despite valid registration or design tenders with disproportionate eligibility conditions that effectively exclude msmes. key issues and scenarios within this category include failure to meet msme procurement quotas denial of l1 price-matching rights to msmes bypassing eligible msme vendors despite valid registration designing tenders with disproportionate eligibility conditions such as excessive turnover requirements prior psu experience requirements high emd pbg requirements unnecessary technical specifications post-award payment delays including wi... |
| banks approved clcs-tu loan for new machines but subsidy claim is rejected over minor tech list mismatch despite empanelled vendor. this ties up my finance without aid. release subsidy and simplify verification for tech upgrades.special clcs-tu for sc st promises subsidy but nodal agency delays processing my plant machinery finance claim for months with extra document demands. please fast-track special aid and approve higher subsidy for sc st beginners. issue delayed subsidy claim and non-approval under clcs-tu and special clcs for sc st context the user is reporting delayed subsidy claim and non-approval under clcs-tu and special clcs for sc st schemes citing minor technical list mismatch and extra document demands and requesting simplification of verification and fast-tracking of special aid. details - issue with clcs-tu loan minor tech list mismatch issue with special clcs for sc st delayed processing and extra document demands requested action simplify verification and ... | Starter, Credit and Finance. Credit Linked Capital Subsidy for Technology Upgradation (CLCS- TU) & Special CLCS for SC&ST. credit linked capital subsidy scheme for technology upgradation clcss tu and the special clcs for sc st entrepreneurs is a flagship technology modernisation program of the ministry of micro small and medium enterprises designed to help micro and small manufacturing enterprises upgrade to proven state-of-the-art technologies. under the standard clcss tu eligible mses receive an upfront capital subsidy of on institutional term loans used for purchasing approved plant and machinery subject to a maximum subsidy of lakh on an eligible investment ceiling of crore across notified sub-sectors. the scheme is implemented through nodal agencies such as small industries development bank of india national bank for agriculture and rural development and national institute for entrepreneurship and small business development with technical vetting by expert bodies... |
| i am unable to change enterprise name or trade name in my udayam certificate pls give proper solution issue update of enterprise trade name in udyam certificate context the user is requesting an update of the enterprise trade name in the udyam certificate. details - enterprise trade name update required | UAM/Udyam Registration/Certificate related issues. Update Company/Owner Name Details. this category includes grievances related to corrections or updates to the name of the enterprise or the name of the owner associated with a udyam registration. accurate naming details are important for maintaining correct enterprise records and ensuring that the information recorded in the registration reflects the official business identity. grievances under this category typically arise when the name of the enterprise or the owner s name recorded during registration contains an error or needs to be updated due to changes in the business structure. for example the enterprise name may have been entered incorrectly during registration or the owner s name may not match official identification documents. in some cases the enterprise name may change due to business rebranding conversion of the business structure or correction of typographical errors made during the registration process. users may also re... |
CachedMultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 32,
    "gather_across_devices": false
}
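To make the `scale` and `similarity_fct` parameters concrete, here is a minimal pure-Python sketch of the objective behind a multiple-negatives ranking loss: cosine similarities are multiplied by the scale and fed to a cross-entropy in which each anchor's own pair is the correct class and every other candidate in the batch acts as a negative. The function name and toy matrices are illustrative assumptions. The sketch also leaves out the gradient caching that the cached variant adds to process a batch in chunks of `mini_batch_size`.

```python
import math

def multiple_negatives_ranking_loss(cos_sim, scale=20.0):
    """Mean cross-entropy over rows of a scaled similarity matrix.

    cos_sim[i][j] is the cosine similarity between anchor i and candidate j.
    Candidate i is the positive for anchor i; the rest are in-batch negatives.
    """
    losses = []
    for i, row in enumerate(cos_sim):
        logits = [scale * s for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Hypothetical 2x2 similarity matrices for a batch of two pairs.
well_separated = [[0.9, 0.1], [0.2, 0.8]]
confused = [[0.5, 0.5], [0.5, 0.5]]

print(multiple_negatives_ranking_loss(well_separated))
print(multiple_negatives_ranking_loss(confused))
```

The loss is close to zero when each anchor is clearly most similar to its own positive, and equals log(2) when a two-pair batch is indistinguishable. With a per-device batch size of 32 and a `mini_batch_size` of 32, as configured in this card, each batch fits in a single chunk.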
Non-default hyperparameters:
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 5
- fp16: True
- multi_dataset_batch_sampler: round_robin
All hyperparameters:
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: True
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
| Epoch | Step | spearman_cosine |
|---|---|---|
| 1.0 | 3 | nan |
| 2.0 | 6 | nan |
| 3.0 | 9 | nan |
| 4.0 | 12 | nan |
| 5.0 | 15 | nan |
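The training log shows 3 optimizer steps per epoch and 15 steps in total. As a rough worked example, the sketch below derives what that implies about the training set size and the learning rate over time. It assumes a single device, which the card does not state explicitly, and uses the listed values of a batch size of 32, no gradient accumulation, `dataloader_drop_last: False`, a linear scheduler, and zero warmup steps. All function names are hypothetical.

```python
import math

def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches produced from num_samples examples."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

def sample_count_range(steps, batch_size):
    """Smallest and largest dataset sizes that give `steps` batches (drop_last=False)."""
    return (steps - 1) * batch_size + 1, steps * batch_size

def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining

low, high = sample_count_range(steps=3, batch_size=32)
print(low, high)  # 65 96

# Learning rate at the start of each epoch (steps 0, 3, 6, 9, 12 of 15).
for step in range(0, 15, 3):
    print(step, linear_lr(step, total_steps=15))
```

Under these assumptions the model was fine-tuned on between 65 and 96 training pairs, which is worth keeping in mind when judging how well it will generalize.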
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}