SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 64 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 64, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
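
The Pooling and Normalize modules listed above can be illustrated with a small NumPy sketch: mean pooling averages the token vectors of real (non-padding) tokens, and normalization rescales each sentence vector to unit length. The function name `mean_pool_and_normalize` and the random inputs are illustrative assumptions, not part of the library, and the random array only stands in for the Transformer module's token outputs.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Hypothetical helper mirroring mean-token pooling followed by L2 normalization.

    token_embeddings: (batch, seq_len, dim) token vectors
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)      # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)      # avoid division by zero
    pooled = summed / counts                            # mean over real tokens
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)         # unit-length vectors

# Random stand-in for transformer outputs: 2 sentences, 64 tokens, 384 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 64, 384))
mask = np.zeros((2, 64), dtype=np.int64)
mask[0, :20] = 1   # first sentence has 20 real tokens, the rest is padding
mask[1, :] = 1     # second sentence fills all 64 positions

sentence_embeddings = mean_pool_and_normalize(tokens, mask)
print(sentence_embeddings.shape)  # (2, 384)
```

Because the output vectors have unit length, their dot product equals their cosine similarity, which matches the similarity function listed in the model description.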

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Ambika14/sbert-grievance-classifier")
# Run inference
sentences = [
    'dear sir mam trying register udyam pan error showing udyam registration already done pan registered earlier please guide aadhaar uam pan pan mobile phone clarification existing udyam registration user requesting clarification udyam registration portal indicates registration already done pan although user states registration made',
    'UAM/Udyam Registration/Certificate related issues. category encompasses grievances related identity creation verification eligibility validation micro small medium enterprises msmes udyam registration system udyam registration system serves foundational gateway msmes access central state government schemes bank loans subsidies credit guarantees public procurement benefits statutory advantages scope purpose category covers issues directly impact msme ecosystem including registration related issues udyam portal errors issued udyam registration certificate migration related grievances legacy udyog aadhaar memorandum uam system udyam portal registration related issues registration remains pending failure generate registration number system validation errors despite correct data submission causes backend verification delays pan aadhaar validation errors system downtime incomplete synchronization tax identity databases errors issued udyam registration certificate incorrect inconsistent details including enterprise classification micro small medium gst number address business activity ownership information impact rejection loan applications denial scheme benefits disqualification government tenders migration related grievances failed migration attempts duplicate already registered system errors loss enterprise data inability link historical uam records impact disruption',
    'UAM/Udyam Registration/Certificate related issues. category encompasses grievances related identity creation verification eligibility validation micro small medium enterprises msmes udyam registration system udyam registration system serves foundational gateway msmes access central state government schemes bank loans subsidies credit guarantees public procurement benefits statutory advantages scope purpose category covers issues directly impact msme ecosystem including registration related issues udyam portal errors issued udyam registration certificate migration related grievances legacy udyog aadhaar memorandum uam system udyam portal registration related issues registration remains pending failure generate registration number system validation errors despite correct data submission causes backend verification delays pan aadhaar validation errors system downtime incomplete synchronization tax identity databases errors issued udyam registration certificate incorrect inconsistent details including enterprise classification micro small medium gst number address business activity ownership information impact rejection loan applications denial scheme benefits disqualification government tenders migration related grievances failed migration attempts duplicate already registered system errors loss enterprise data inability link historical uam records impact disruption',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7274, 0.7274],
#         [0.7274, 1.0000, 1.0000],
#         [0.7274, 1.0000, 1.0000]])
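
Since the training pairs match a grievance text with a category description, one plausible way to use the embeddings is to rank category descriptions by similarity to a grievance. The sketch below assumes the embeddings are already computed and unit-length; the helper `rank_categories` is hypothetical, and the toy 3-dimensional vectors only stand in for real 384-dimensional outputs of `model.encode`.

```python
import numpy as np

def rank_categories(grievance_embedding, category_embeddings, category_names):
    """Hypothetical helper: rank categories by cosine similarity to one grievance.

    Assumes all embeddings are L2-normalized, so the dot product is the cosine.
    """
    scores = category_embeddings @ grievance_embedding
    order = np.argsort(-scores)  # highest similarity first
    return [(category_names[i], float(scores[i])) for i in order]

# Category labels taken from the training samples in this card.
category_names = [
    "UAM/Udyam Registration/Certificate related issues",
    "Related to MSME-DFO",
    "Related to DCMSME Scheme",
]

# Toy stand-ins for embeddings (assumption: real ones come from model.encode).
category_embeddings = np.eye(3)
grievance_embedding = np.array([0.9, 0.3, 0.1])
grievance_embedding /= np.linalg.norm(grievance_embedding)

ranking = rank_categories(grievance_embedding, category_embeddings, category_names)
print(ranking[0][0])  # UAM/Udyam Registration/Certificate related issues
```

With the real model, `category_embeddings` would be the encoded category descriptions and `grievance_embedding` the encoded grievance text.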

Training Details

Training Dataset

Unnamed Dataset

  • Size: 98 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 98 samples:
              sentence_0      sentence_1
    type      string          string
    min       20 tokens       64 tokens
    mean      59.08 tokens    64.0 tokens
    max       64 tokens       64 tokens
  • Samples:
    Sample 1
      sentence_0: availed pmegp subsidy paper carry bag manufacturing unit 2020 iob bank kottayam three years paid loan amount correctly time without failure inspection needed sanction subsidy done yet since keen paying loan bringing burden financially please needful arrange inspection udyam reg kl 07 0005226 thanking non conduct inspection pmegp subsidy msme dfo user reporting despite timely loan repayment inspection required sanctioning pmegp subsidy paper carry bag manufacturing unit conducted causing financial burden requesting assistance earliest possible inspection
      sentence_1: Related to MSME-DFO. category encompasses grievances related field level execution failures msme development facilitation offices dfos responsible facilitating msme schemes loans subsidies services scope category includes field level execution failures non responsive dfo officers failure provide guidance documentation procedures inaction queries submitted champions physical visits inspection delays inconsistencies postponed repeatedly rescheduled site visits delayed inspection reports unnecessary multiple inspections stall loan disbursement subsidy release local facilitation coordination failures misrouting applications offices lack facilitation land utilities approvals unavailability promised local support services poor coordination dfos banks psus state nodal officers resulting projects remaining stuck despite eligibility prior approvals example issues dfo officials responding phone calls emails regarding subsidy applications guidance provided required documents site inspection msme ...
    Sample 2
      sentence_0: unable edit district udhyam certificate please help editing district udyam certificate user requesting assistance edit district udyam certificate
      sentence_1: UAM/Udyam Registration/Certificate related issues. category encompasses grievances related identity creation verification eligibility validation micro small medium enterprises msmes udyam registration system udyam registration system serves foundational gateway msmes access central state government schemes bank loans subsidies credit guarantees public procurement benefits statutory advantages scope purpose category covers issues directly impact msme ecosystem including registration related issues udyam portal errors issued udyam registration certificate migration related grievances legacy udyog aadhaar memorandum uam system udyam portal registration related issues registration remains pending failure generate registration number system validation errors despite correct data submission causes backend verification delays pan aadhaar validation errors system downtime incomplete synchronization tax identity databases errors issued udyam registration certificate incorrect inconsistent detai...
    Sample 3
      sentence_0: loanagreement isbl00910729978dated26 09 2024 loan payment pending since 29 sep 2024 hdfc bank returned cheque stating alteration rbi guidelines pli nodal agencies contact numbers found service unable connect attached loan agreement pdf reference please support get resolution pending since 29 sep 2024 non receipt loan payment dcmsme scheme user reporting non receipt loan payment since 29 09 2024 citing hdfc bank return cheque alteration rbi guidelines requesting assistance resolving issue
      sentence_1: Related to DCMSME Scheme. category related grievances dcmsme scheme specifically focusing issues related access credit banks micro small medium enterprises msmes category applies commercial banks regional rural banks rrbs cooperative banks covers cases bottleneck lies entirely bank level excludes issues related rbi policy government scheme design credit guarantee mechanisms buyer default rather addresses bank side processing conditions conduct extending credit msmes category includes cases msmes applied loans submitted required documents followed branches digital portals loan application remains pending without formal sanction rejection decision captures administrative stalling prolonged process pending verification status absence deficiency letters timelines repeated demands already submitted documents failure branch offices forward eligible applications regional head offices approval additionally category covers situations loans formally sanctioned disbursement delayed withheld bank ...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
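
The loss parameters above (scale 20.0, cosine similarity) can be made concrete with a NumPy sketch of the in-batch negatives idea: each anchor is scored against every positive in the batch, and the matching positive is treated as the correct class in a cross-entropy. This is a simplified illustration under those assumptions, not the library's PyTorch implementation, and the function name is chosen here for readability.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Sketch of an in-batch-negatives ranking loss with scaled cosine similarity.

    anchors, positives: (batch, dim); row i of positives is the match for row i
    of anchors, and every other row acts as a negative for that anchor.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                               # (batch, batch) cosine scores
    logits = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))               # cross-entropy on the diagonal

# Toy batch of two pairs (batch size 2, as in this training run).
anchors = np.eye(2)
aligned_loss = multiple_negatives_ranking_loss(anchors, np.eye(2))
swapped_loss = multiple_negatives_ranking_loss(anchors, np.eye(2)[::-1])
print(aligned_loss < swapped_loss)  # True
```

The scale factor sharpens the softmax: with cosine scores bounded in [-1, 1], multiplying by 20 lets a well-separated pair drive the loss close to zero.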
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • num_train_epochs: 2
  • fp16: True
  • multi_dataset_batch_sampler: round_robin
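
As a rough check on what these settings imply, the arithmetic below derives the number of optimizer steps from the dataset size and batch size reported in this card. It assumes a single training device, which the card does not state, and uses the gradient accumulation value of 1 from the full hyperparameter list.

```python
import math

# Values reported in this card.
num_samples = 98
per_device_train_batch_size = 2
num_train_epochs = 2
gradient_accumulation_steps = 1

# Assumption: training ran on one device (not stated in the card).
num_devices = 1

effective_batch_size = per_device_train_batch_size * num_devices * gradient_accumulation_steps
steps_per_epoch = math.ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

# With in-batch negatives, each anchor sees (batch size - 1) negatives per step.
in_batch_negatives = per_device_train_batch_size - 1

print(steps_per_epoch, total_steps, in_batch_negatives)  # 49 98 1
```

A batch size of 2 therefore gives each anchor only one in-batch negative per step, which is worth keeping in mind when interpreting the quality of the fine-tuned embeddings.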

All Hyperparameters

  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: True
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
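
The learning-rate settings in this list (learning_rate 5e-05, lr_scheduler_type linear, warmup_steps 0) describe a linear decay from the base rate to zero. A minimal sketch of that schedule follows; the function `linear_lr` is illustrative, and the total of 98 steps is an assumption derived from 98 samples, batch size 2, 2 epochs, and a single device.

```python
def linear_lr(step, base_lr=5e-5, total_steps=98, warmup_steps=0):
    """Illustrative linear schedule: optional linear warmup, then linear decay to zero.

    total_steps=98 is an assumption (98 samples / batch size 2 * 2 epochs, one device).
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of training.
for step in (0, 49, 98):
    print(step, linear_lr(step))
```

With no warmup, the very first update uses the full base rate, and the rate reaches zero at the final step.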

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.3
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 22.7M params (Safetensors, tensor type F32)