This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 64, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
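The Pooling and Normalize modules listed above can be illustrated with a small, dependency-free sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, which is what `pooling_mode_mean_tokens: True` and `Normalize()` indicate. The function names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors have 2 dimensions rather than the model's 384.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]


# Toy input: 3 token positions of dimension 2; the last position is padding.
tokens = [[1.0, 2.0], [5.0, 2.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)          # padding row is ignored
sentence_vector = l2_normalize(pooled)    # unit-length sentence embedding
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.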
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'dear sir mam trying register udyam pan error showing udyam registration already done pan registered earlier please guide aadhaar uam pan pan mobile phone clarification existing udyam registration user requesting clarification udyam registration portal indicates registration already done pan although user states registration made',
'UAM/Udyam Registration/Certificate related issues. category encompasses grievances related identity creation verification eligibility validation micro small medium enterprises msmes udyam registration system udyam registration system serves foundational gateway msmes access central state government schemes bank loans subsidies credit guarantees public procurement benefits statutory advantages scope purpose category covers issues directly impact msme ecosystem including registration related issues udyam portal errors issued udyam registration certificate migration related grievances legacy udyog aadhaar memorandum uam system udyam portal registration related issues registration remains pending failure generate registration number system validation errors despite correct data submission causes backend verification delays pan aadhaar validation errors system downtime incomplete synchronization tax identity databases errors issued udyam registration certificate incorrect inconsistent details including enterprise classification micro small medium gst number address business activity ownership information impact rejection loan applications denial scheme benefits disqualification government tenders migration related grievances failed migration attempts duplicate already registered system errors loss enterprise data inability link historical uam records impact disruption',
'UAM/Udyam Registration/Certificate related issues. category encompasses grievances related identity creation verification eligibility validation micro small medium enterprises msmes udyam registration system udyam registration system serves foundational gateway msmes access central state government schemes bank loans subsidies credit guarantees public procurement benefits statutory advantages scope purpose category covers issues directly impact msme ecosystem including registration related issues udyam portal errors issued udyam registration certificate migration related grievances legacy udyog aadhaar memorandum uam system udyam portal registration related issues registration remains pending failure generate registration number system validation errors despite correct data submission causes backend verification delays pan aadhaar validation errors system downtime incomplete synchronization tax identity databases errors issued udyam registration certificate incorrect inconsistent details including enterprise classification micro small medium gst number address business activity ownership information impact rejection loan applications denial scheme benefits disqualification government tenders migration related grievances failed migration attempts duplicate already registered system errors loss enterprise data inability link historical uam records impact disruption',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7274, 0.7274],
#         [0.7274, 1.0000, 1.0000],
#         [0.7274, 1.0000, 1.0000]])
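The example pairs a grievance text with category descriptions, so one plausible use is routing a grievance to its closest category. The following is a minimal sketch under that assumption. It uses toy 2-dimensional unit vectors in place of real `model.encode(...)` output, the helper `rank_categories` is hypothetical, and the labels are shortened from the category names that appear in this card.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def rank_categories(query_vector, category_vectors, labels):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [
        (label, dot(query_vector, vector))
        for label, vector in zip(labels, category_vectors)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy unit-length vectors standing in for model.encode(...) output.
labels = ["Udyam registration", "MSME-DFO", "DCMSME scheme"]
category_vectors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
query_vector = [0.8, 0.6]

ranking = rank_categories(query_vector, category_vectors, labels)
best_label, best_score = ranking[0]
print(best_label, round(best_score, 2))
```

With real embeddings, the category vectors can be computed once and reused for every incoming query.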
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |
| sentence_0 | sentence_1 |
|---|---|
| availed pmegp subsidy paper carry bag manufacturing unit 2020 iob bank kottayam three years paid loan amount correctly time without failure inspection needed sanction subsidy done yet since keen paying loan bringing burden financially please needful arrange inspection udyam reg kl 07 0005226 thanking non conduct inspection pmegp subsidy msme dfo user reporting despite timely loan repayment inspection required sanctioning pmegp subsidy paper carry bag manufacturing unit conducted causing financial burden requesting assistance earliest possible inspection | Related to MSME-DFO. category encompasses grievances related field level execution failures msme development facilitation offices dfos responsible facilitating msme schemes loans subsidies services scope category includes field level execution failures non responsive dfo officers failure provide guidance documentation procedures inaction queries submitted champions physical visits inspection delays inconsistencies postponed repeatedly rescheduled site visits delayed inspection reports unnecessary multiple inspections stall loan disbursement subsidy release local facilitation coordination failures misrouting applications offices lack facilitation land utilities approvals unavailability promised local support services poor coordination dfos banks psus state nodal officers resulting projects remaining stuck despite eligibility prior approvals example issues dfo officials responding phone calls emails regarding subsidy applications guidance provided required documents site inspection msme ... |
| unable edit district udhyam certificate please help editing district udyam certificate user requesting assistance edit district udyam certificate | UAM/Udyam Registration/Certificate related issues. category encompasses grievances related identity creation verification eligibility validation micro small medium enterprises msmes udyam registration system udyam registration system serves foundational gateway msmes access central state government schemes bank loans subsidies credit guarantees public procurement benefits statutory advantages scope purpose category covers issues directly impact msme ecosystem including registration related issues udyam portal errors issued udyam registration certificate migration related grievances legacy udyog aadhaar memorandum uam system udyam portal registration related issues registration remains pending failure generate registration number system validation errors despite correct data submission causes backend verification delays pan aadhaar validation errors system downtime incomplete synchronization tax identity databases errors issued udyam registration certificate incorrect inconsistent detai... |
| loanagreement isbl00910729978dated26 09 2024 loan payment pending since 29 sep 2024 hdfc bank returned cheque stating alteration rbi guidelines pli nodal agencies contact numbers found service unable connect attached loan agreement pdf reference please support get resolution pending since 29 sep 2024 non receipt loan payment dcmsme scheme user reporting non receipt loan payment since 29 09 2024 citing hdfc bank return cheque alteration rbi guidelines requesting assistance resolving issue | Related to DCMSME Scheme. category related grievances dcmsme scheme specifically focusing issues related access credit banks micro small medium enterprises msmes category applies commercial banks regional rural banks rrbs cooperative banks covers cases bottleneck lies entirely bank level excludes issues related rbi policy government scheme design credit guarantee mechanisms buyer default rather addresses bank side processing conditions conduct extending credit msmes category includes cases msmes applied loans submitted required documents followed branches digital portals loan application remains pending without formal sanction rejection decision captures administrative stalling prolonged process pending verification status absence deficiency letters timelines repeated demands already submitted documents failure branch offices forward eligible applications regional head offices approval additionally category covers situations loans formally sanctioned disbursement delayed withheld bank ... |
MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim",
  "gather_across_devices": false
}
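To make these parameters concrete, here is a minimal pure-Python sketch of how a multiple-negatives ranking loss is commonly computed: for each anchor, the cosine similarities to every candidate in the batch are multiplied by `scale` and fed to a cross-entropy whose correct class is the anchor's own pair. It illustrates the idea under those assumptions and is not the library's implementation. The name `mnr_loss` and the toy 2-dimensional vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Every candidate in the batch is scored; all but index i act as negatives.
        logits = [scale * cosine(anchor, candidate) for candidate in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# A batch of 2 pairs, so each anchor has exactly one in-batch negative.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])  # pairs match: loss near 0
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # pairs mismatched: loss near scale
```

With a batch size of 2 and `gather_across_devices` set to false, each anchor is contrasted against a single in-batch negative, so the ranking signal per step is comparatively weak; larger batches generally give this kind of loss more negatives to learn from.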
Non-default hyperparameters:
per_device_train_batch_size: 2
per_device_eval_batch_size: 2
num_train_epochs: 2
fp16: True
multi_dataset_batch_sampler: round_robin

All hyperparameters:
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 2
per_device_eval_batch_size: 2
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: True
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}