SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
  (2): Normalize({})
)
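
The three modules above amount to: produce per-token embeddings, average them over the non-padding tokens, then scale the result to unit length. The last two steps can be sketched in NumPy as follows. This is only an illustration: the function name `mean_pool_and_normalize` and the random toy inputs are assumptions, not part of the library.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mask-aware mean pooling followed by L2 normalization.

    token_embeddings: (batch, seq_len, dim) array of per-token vectors.
    attention_mask:   (batch, seq_len) array, 1 for real tokens, 0 for padding.
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)      # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)      # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input: 2 sequences, 4 token positions, 384 dimensions (hypothetical values).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 384))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
pooled = mean_pool_and_normalize(tokens, mask)
print(pooled.shape)
```

Padding positions do not influence the result, and every output row has unit length, which is why cosine similarity is the natural similarity function for this model.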

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("justOneMoreTestCase/insurance-rag-embeddings2")
# Run inference
sentences = [
    'Okay, I need to create two high-quality, diverse questions based on the given insurance policy context. Let me start by understanding the context thoroughly.',
    'Hospitalizaon\nDaily Hospital Cash Benefit would also be paid for first 24 hours (day one) of\nhospitalizaon, regardless of whether the Insured was admi ed in a general or\nspecialwardorinanintensivecareunit.\nB) Major\nBenefit:\nSurgical\nIn the event of an Insured under this plan, due to medical necessity, undergoing\none of the surgeries defined in Major Surgical Benefit Annexure, within the cover\nperiod in a hospital due to Accidental Bodily Injury or Sickness, the respecve\nbenefit percentage of the Major Surgical Benefit Sum Assured, as specified against\neach of the eligible surgeries menoned in Major Surgical Benefit Annexure, shall\nbe paid subject to benefit limits and condions menoned in Para 11B) and\nexclusionsmenonedinPara15below.',
    'Benefitshallincreaseasabove.\nIfanyofthememberinsuredisrequiredtostayinanIntensiveCareUnitofahospital,\nt\nsubject\nbenefit limits and\nwo mes the\nDaily\nwill be payable\nto\nApplicable\nBenefit\ncondionsmenonedinPara11A)andexclusionsmenonedinPara15below.\nDuring one period of 24 connuous hours (i.e. one day) of Hospitalisaon (aer\nhaving completed the 24 hours as above), if the said Hospitalisaon included stay\ninanIntensiveCareUnitaswellasinanyotherin-paent(non-IntensiveCareUnit)\nward of the Hospital, the Corporaon shall pay benefits as if the admission was to\nthe Intensive Care Unit provided that the period of Hospitalisaon in the Intensive\nCareUnitwasatleast4connuoushours.\npayable\nor\nNo benefit will be\nfor the first 24 hours of hospitalisaon. However, f\nevery\nthat extends for a connuous period of 7 days or more, the\nHospitalizaon\nDaily Hospital Cash Benefit would also be paid for first 24 hours (day one) of\nhospitalizaon, regardless of whether the Insured was admi ed in a general or',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.3726, 0.2615],
#         [0.3726, 1.0000, 0.7728],
#         [0.2615, 0.7728, 1.0000]])
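
Because the final module normalizes embeddings to unit length, cosine similarity reduces to a dot product, so a retrieval step can be written as a matrix-vector product followed by a sort. The sketch below uses small hand-written unit vectors in place of real `model.encode(...)` output; the helper name `top_k` is hypothetical.

```python
import numpy as np

def top_k(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k highest-scoring rows of doc_matrix."""
    scores = doc_matrix @ query_vec      # dot product equals cosine for unit vectors
    order = np.argsort(-scores)[:k]      # indices sorted by decreasing score
    return order, scores[order]

# Hypothetical unit-length vectors standing in for encoded passages and a query.
docs = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query = np.array([0.8, 0.6])
idx, scores = top_k(query, docs, k=2)
print(idx, scores)
```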

Evaluation

Metrics

Information Retrieval

Metric                 Value
cosine_accuracy@1      0.0338
cosine_accuracy@3      0.0473
cosine_accuracy@5      0.0676
cosine_accuracy@10     0.1419
cosine_precision@1     0.0338
cosine_precision@3     0.0158
cosine_precision@5     0.0135
cosine_precision@10    0.0142
cosine_recall@1        0.0338
cosine_recall@3        0.0473
cosine_recall@5        0.0676
cosine_recall@10       0.1419
cosine_ndcg@10         0.0743
cosine_mrr@10          0.0545
cosine_map@100         0.0816
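
In the values above, accuracy@k equals recall@k and precision@k is approximately accuracy@k divided by k (for example, 0.0473 / 3 ≈ 0.0158). That pattern is what one would expect if each query has exactly one relevant passage, although the card does not state this. Under that assumption, the per-query metrics can be sketched as follows; the function names and the toy rankings are hypothetical.

```python
import math

def ir_metrics(ranked_ids, relevant_id, k):
    """Metrics at cutoff k for one query with a single relevant document.

    ranked_ids: document ids sorted by decreasing similarity to the query.
    """
    top = ranked_ids[:k]
    hit = relevant_id in top
    rank = top.index(relevant_id) + 1 if hit else None
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": (1.0 / k) if hit else 0.0,
        "recall": 1.0 if hit else 0.0,
        "mrr": (1.0 / rank) if hit else 0.0,
        # With one relevant document the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": (1.0 / math.log2(rank + 1)) if hit else 0.0,
    }

def average(metric_dicts):
    """Average each metric over all queries."""
    keys = metric_dicts[0].keys()
    return {key: sum(d[key] for d in metric_dicts) / len(metric_dicts) for key in keys}

# Hypothetical rankings for two queries.
per_query = [
    ir_metrics(["d3", "d1", "d7"], relevant_id="d1", k=3),  # hit at rank 2
    ir_metrics(["d2", "d5", "d9"], relevant_id="d4", k=3),  # miss
]
print(average(per_query))
```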

Training Details

Training Dataset

Unnamed Dataset

  • Size: 20 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 20 samples:
              sentence_0            sentence_1
    type      string                string
    details   • min: 26 tokens      • min: 44 tokens
              • mean: 56.7 tokens   • mean: 214.1 tokens
              • max: 98 tokens      • max: 256 tokens
  • Samples:
    sentence_0: What happens if a policyholder chooses a lower Initial Daily Benefit (e.g., ₹1,000) but later requires a major surgery costing significantly more than the 100x multiplier of their selected daily benefit? How does the policy’s lump sum benefit structure affect their coverage in this scenario?
    sentence_1: IncreasingHealthcovereveryyear

    Lumpsumbenefitirrespecveofactualmedicalcosts

    Noclaimbenefit

    Flexiblebenefitlimittochoosefrom

    Flexiblepremiumpaymentopons

    Veryeasytochooseyourplan
    Step 1
    2
    Step
    Choose the level of Health cover you need
    Work out the premium payable along with our Representave
    Step 1: Choose the level of Health cover you need:
    You can choose the amount of Inial Daily Benefit (i.e. the daily Hospital Cash Benefit
    applicableinthefirstyearofthepolicy)asperyourneedfromoutofthefollowingchoices:
    1000 per day<br> 2000 per day
    3000 per day<br> 4000 per day
    This is the amount that will be payable to you in the event of hospitalisaon in the first
    year on a per day basis. The Major Surgical Benefit that you will be covered for will be
    100 mes the Inial Daily Benefit you have chosen. Thus the inial Major Surgical
    Benefit Sum Assured will be
    1 lakh, 2 lakh, 3 lakh, 4 lakh respecvely. Other benefits
    `
    such as Day Care Procedure Benefit, Other Surgical Benefit and Premium waiver
    sentence_0: Okay, let's tackle this. The user wants me to generate two high-quality, diverse questions based on the context provided about LIC's Jeevan Arogya. The first question needs to be a direct factual one, and the second a complex scenario-based one. They should not overlap and be challenging.
    sentence_1: LIC's JEEVAN AROGYA (UIN: 512N266V02)
    (A Non-linked, Non-Parcipang,
    Individual, Health Insurance Plan)
    LIC's Jeevan Arogya is a unique non-parcipang non-linked plan which provides
    health insurance cover against certain specified health risks and provides you with
    mely support in case of medical emergencies and helps you and your family remain
    financiallyindependentindifficultmes.
    Health has been a major concern on everybody's mind, including yours. In these days
    ofskyrockengmedicalexpenses,whenafamilymemberisill,itisatraumacmefor
    the rest of the family. As a caring person, you do not want to let any unfortunate
    incident to affect your plans for you and your family. So why let any medical
    emergenciessha eryourpeaceofmind.
    LIC'sJeevanArogyagivesyou:

    Valuablefinancialproteconincaseofhospitalisaon,surgeryetc

    IncreasingHealthcovereveryyear

    Lumpsumbenefitirrespecveofactualmedicalcosts

    Noclaimbenefit

    Flexiblebenefitlimittochoosefrom

    Flexiblepremiumpaymentopons
    sentence_0: Okay, let me tackle this. The user wants two high-quality, diverse questions based on the given insurance policy context. First, I need to understand the context thoroughly.
    sentence_1: Each of the insured are covered for
    risks up to age (80). Children are insured up
    Health
    toage25years.

    Hospitalcashbenefit(HCB)

    MajorSurgicalBenefit(MSB)

    DayCareProcedureBenefit

    OtherSurgicalBenefit

    AmbulanceBenefit

    PremiumwaiverBenefit(PWB)
    A) HospitalCashBenefit:
    due to
    If you or any of the insured lives covered under the policy is hospitalised
    Accidental Body Injury or Sickness and the stay in hospital exceeds a connuous
    periodof24hours,thenforanyconnuousperiodof24hoursorpartthereof,
    1. Benefits offered under the plan are
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            384,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
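
The loss configuration above applies the base ranking loss to the first 384, 256, 128 and 64 embedding dimensions and adds the results with equal weights. A minimal NumPy sketch of that idea follows. It is not the library implementation: the function names, the toy batch, and the similarity scale of 20 are assumptions, since the card does not list the scale.

```python
import numpy as np

def mnrl(anchors, positives, scale=20.0):
    """In-batch-negatives ranking loss: cross-entropy over scaled cosine scores.

    Row i of `positives` is the correct match for row i of `anchors`;
    every other row in the batch acts as a negative. `scale` is assumed.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives, dims=(384, 256, 128, 64), weights=(1, 1, 1, 1)):
    """Weighted sum of the base loss computed on the first d dimensions, for each d."""
    return sum(w * mnrl(anchors[:, :d], positives[:, :d]) for d, w in zip(dims, weights))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(10, 384))                      # batch of 10, as in this card
positives = anchors + 0.1 * rng.normal(size=(10, 384))    # hypothetical near-matches
print(matryoshka_loss(anchors, positives))
```

Training this way is meant to keep the leading dimensions useful on their own, so embeddings can be truncated to a shorter prefix and re-normalized when storage or speed matters.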
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 5
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
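
With `lr_scheduler_type: linear`, `warmup_steps: 0` and `learning_rate: 5e-05`, the learning rate decays linearly from its initial value to zero over the run. The sketch below reproduces that shape; the function name `linear_lr` is hypothetical, and the total of 10 steps is taken from the final step number in the training logs.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=0, total_steps=10):
    """Learning rate after `step` optimizer updates under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup (unused here: warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in range(0, 11, 2):
    print(step, linear_lr(step))
```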

Training Logs

Epoch    Step    cosine_ndcg@10
1.0      2       0.0742
2.0      4       0.0742
3.0      6       0.0742
4.0      8       0.0742
5.0      10      0.0743
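
The step counts in the log follow from the dataset size and batch size reported elsewhere in this card. A short worked calculation, assuming training ran on a single device:

```python
import math

train_samples = 20   # "Size: 20 training samples"
batch_size = 10      # per_device_train_batch_size
epochs = 5           # num_train_epochs
grad_accum = 1       # gradient_accumulation_steps

# Optimizer updates per epoch, then over the whole run (single device assumed).
steps_per_epoch = math.ceil(train_samples / batch_size / grad_accum)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 2 10
```

Two updates per epoch over five epochs gives the 10 steps shown in the last row.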

Training Time

  • Training: 4.9 seconds

Framework Versions

  • Python: 3.12.13
  • Sentence Transformers: 5.4.1
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.8.5
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}

Safetensors

  • Model size: 22.7M params
  • Tensor type: F32

Model tree for justOneMoreTestCase/insurance-rag-embeddings2