How to use justOneMoreTestCase/insurance-rag-embeddings2 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("justOneMoreTestCase/insurance-rag-embeddings2")
sentences = [
"The policy covers insured individuals up to age 80, and children up to 25. The benefits include Hospital Cash, Major Surgical, Day Care, Other Surgical, Ambulance, and Premium Waiver. For the Hospital Cash Benefit, it's activated if hospitalized for more than 24 hours due to injury or sickness.",
"Each of the insured are covered for\nrisks up to age (80). Children are insured up\nHealth\ntoage25years.\n•\nHospitalcashbenefit(HCB)\n•\nMajorSurgicalBenefit(MSB)\n•\nDayCareProcedureBenefit\n•\nOtherSurgicalBenefit\n•\nAmbulanceBenefit\n•\nPremiumwaiverBenefit(PWB)\nA) HospitalCashBenefit:\ndue to\nIf you or any of the insured lives covered under the policy is hospitalised\nAccidental Body Injury or Sickness and the stay in hospital exceeds a connuous\nperiodof24hours,thenforanyconnuousperiodof24hoursorpartthereof,\n1. Benefits offered under the plan are",
"the Applicable Daily Benefit shall be effected on each policy anniversary during the\nCover Period and shall connue unl it a ains a maximum amount of 1.5 mes the\nInial Daily Benefit. Thereaer, this amount in each Policy Year in future shall\nremainatthatmaximumlevela ained.\n\nFurther arithmec addion of an amount equal to “No Claim Benefit” (as\ndescribed in Para 1.G) below) provided the policy a racts and is eligible for it.\nThereshallbeno maximum limitfor such increase which meansthat ifthis policyis\neligible for “No Claim Benefit”, the same shall be granted throughout the Cover\nPeriodwithoutanymaximumlimit.\nFor members\nsubsequently under the policy, the benefit in the first year\nincluded\nshall be equal to Inial Daily Benefit amount and thereaer the Applicable Daily\nBenefitshallincreaseasabove.\nIfanyofthememberinsuredisrequiredtostayinanIntensiveCareUnitofahospital,\nt\nsubject\nbenefit limits and\nwo mes the\nDaily\nwill be payable\nto\nApplicable\nBenefit",
"Benefitshallincreaseasabove.\nIfanyofthememberinsuredisrequiredtostayinanIntensiveCareUnitofahospital,\nt\nsubject\nbenefit limits and\nwo mes the\nDaily\nwill be payable\nto\nApplicable\nBenefit\ncondionsmenonedinPara11A)andexclusionsmenonedinPara15below.\nDuring one period of 24 connuous hours (i.e. one day) of Hospitalisaon (aer\nhaving completed the 24 hours as above), if the said Hospitalisaon included stay\ninanIntensiveCareUnitaswellasinanyotherin-paent(non-IntensiveCareUnit)\nward of the Hospital, the Corporaon shall pay benefits as if the admission was to\nthe Intensive Care Unit provided that the period of Hospitalisaon in the Intensive\nCareUnitwasatleast4connuoushours.\npayable\nor\nNo benefit will be\nfor the first 24 hours of hospitalisaon. However, f\nevery\nthat extends for a connuous period of 7 days or more, the\nHospitalizaon\nDaily Hospital Cash Benefit would also be paid for first 24 hours (day one) of\nhospitalizaon, regardless of whether the Insured was admi ed in a general or"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
(1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
(2): Normalize({})
)
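The module stack above reads as: token embeddings from the transformer, mean pooling over tokens, then L2 normalization. The following is a minimal pure-Python sketch of the last two steps, assuming toy 4-dimensional token vectors and an attention mask. The names `mean_pool` and `l2_normalize` are illustrative and are not part of the library API.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring the role of the Normalize module."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input (hypothetical values): three tokens, the last one is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 2.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.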
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("justOneMoreTestCase/insurance-rag-embeddings2")
# Run inference
sentences = [
'Okay, I need to create two high-quality, diverse questions based on the given insurance policy context. Let me start by understanding the context thoroughly.',
'Hospitalizaon\nDaily Hospital Cash Benefit would also be paid for first 24 hours (day one) of\nhospitalizaon, regardless of whether the Insured was admi ed in a general or\nspecialwardorinanintensivecareunit.\nB) Major\nBenefit:\nSurgical\nIn the event of an Insured under this plan, due to medical necessity, undergoing\none of the surgeries defined in Major Surgical Benefit Annexure, within the cover\nperiod in a hospital due to Accidental Bodily Injury or Sickness, the respecve\nbenefit percentage of the Major Surgical Benefit Sum Assured, as specified against\neach of the eligible surgeries menoned in Major Surgical Benefit Annexure, shall\nbe paid subject to benefit limits and condions menoned in Para 11B) and\nexclusionsmenonedinPara15below.',
'Benefitshallincreaseasabove.\nIfanyofthememberinsuredisrequiredtostayinanIntensiveCareUnitofahospital,\nt\nsubject\nbenefit limits and\nwo mes the\nDaily\nwill be payable\nto\nApplicable\nBenefit\ncondionsmenonedinPara11A)andexclusionsmenonedinPara15below.\nDuring one period of 24 connuous hours (i.e. one day) of Hospitalisaon (aer\nhaving completed the 24 hours as above), if the said Hospitalisaon included stay\ninanIntensiveCareUnitaswellasinanyotherin-paent(non-IntensiveCareUnit)\nward of the Hospital, the Corporaon shall pay benefits as if the admission was to\nthe Intensive Care Unit provided that the period of Hospitalisaon in the Intensive\nCareUnitwasatleast4connuoushours.\npayable\nor\nNo benefit will be\nfor the first 24 hours of hospitalisaon. However, f\nevery\nthat extends for a connuous period of 7 days or more, the\nHospitalizaon\nDaily Hospital Cash Benefit would also be paid for first 24 hours (day one) of\nhospitalizaon, regardless of whether the Insured was admi ed in a general or',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.3726, 0.2615],
# [0.3726, 1.0000, 0.7728],
# [0.2615, 0.7728, 1.0000]])
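The matrix above is symmetric with ones on the diagonal, which is what a pairwise cosine similarity produces. A small pure-Python sketch of that computation follows, assuming cosine is the similarity function in use. The helper names and the toy 2-dimensional vectors are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_matrix(vectors):
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]

# Hypothetical unit-length vectors standing in for sentence embeddings.
toy = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
matrix = similarity_matrix(toy)
for row in matrix:
    print([round(value, 4) for value in row])
```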
InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.0338 |
| cosine_accuracy@3 | 0.0473 |
| cosine_accuracy@5 | 0.0676 |
| cosine_accuracy@10 | 0.1419 |
| cosine_precision@1 | 0.0338 |
| cosine_precision@3 | 0.0158 |
| cosine_precision@5 | 0.0135 |
| cosine_precision@10 | 0.0142 |
| cosine_recall@1 | 0.0338 |
| cosine_recall@3 | 0.0473 |
| cosine_recall@5 | 0.0676 |
| cosine_recall@10 | 0.1419 |
| cosine_ndcg@10 | 0.0743 |
| cosine_mrr@10 | 0.0545 |
| cosine_map@100 | 0.0816 |
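To make the metric names concrete, here is a rough sketch of how the per-query values behind accuracy@k, precision@k, recall@k and MRR@k can be computed from a ranked result list. The function names and the document IDs are hypothetical. The evaluator averages such per-query values over all queries.

```python
def accuracy_at_k(ranked_ids, relevant_ids, k):
    """1.0 if any relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant_ids for doc in ranked_ids[:k]) else 0.0

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant."""
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents found in the top k."""
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / len(relevant_ids)

def reciprocal_rank_at_k(ranked_ids, relevant_ids, k):
    """1 / rank of the first relevant document within the top k, else 0.0."""
    for rank, doc in enumerate(ranked_ids[:k], start=1):
        if doc in relevant_ids:
            return 1.0 / rank
    return 0.0

# Hypothetical query: one relevant passage, retrieved at rank 3.
ranked = ["d7", "d2", "d9", "d4"]
relevant = {"d9"}

acc1 = accuracy_at_k(ranked, relevant, 1)
acc3 = accuracy_at_k(ranked, relevant, 3)
prec3 = precision_at_k(ranked, relevant, 3)
rec3 = recall_at_k(ranked, relevant, 3)
rr10 = reciprocal_rank_at_k(ranked, relevant, 10)
```

With a single relevant passage per query, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. The table is consistent with that pattern (for example 0.0473 / 3 ≈ 0.0158), though the source does not state how many relevant passages each query has.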
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |

| sentence_0 | sentence_1 |
|---|---|
| What happens if a policyholder chooses a lower Initial Daily Benefit (e.g., ₹1,000) but later requires a major surgery costing significantly more than the 100x multiplier of their selected daily benefit? How does the policy’s lump sum benefit structure affect their coverage in this scenario? | • |
| Okay, let's tackle this. The user wants me to generate two high-quality, diverse questions based on the context provided about LIC's Jeevan Arogya. The first question needs to be a direct factual one, and the second a complex scenario-based one. They should not overlap and be challenging. | LIC's JEEVAN AROGYA (UIN: 512N266V02) |
| Okay, let me tackle this. The user wants two high-quality, diverse questions based on the given insurance policy context. First, I need to understand the context thoroughly. | Each of the insured are covered for |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
384,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1
],
"n_dims_per_step": -1
}
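The parameters above describe a base ranking loss applied to several prefixes of the embedding, with the per-prefix losses combined by weight. The pure-Python sketch below illustrates that idea under stated assumptions: the toy dimensions `[4, 2]` stand in for `[384, 256, 128, 64]`, the similarity scale of 20 is an assumed value, and the function names are illustrative rather than the library's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def ranking_loss(anchors, positives, scale=20.0):
    """In-batch negatives: for anchor i, positives[i] is the target and
    every other positive in the batch serves as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with target index i
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated embedding prefixes."""
    return sum(
        w * ranking_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )

# Hypothetical batch of two (question, passage) embedding pairs.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
positives = [[0.9, 0.1, 0.0, 0.0], [0.1, 0.9, 0.0, 0.0]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

The loss is near zero when each anchor is closest to its own positive at every prefix length, and large when the pairs are swapped. Training against all prefixes is what allows embeddings to be truncated to a shorter prefix later.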
Non-default hyperparameters:

- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- do_predict: False
- prediction_loss_only: True
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | cosine_ndcg@10 |
|---|---|---|
| 1.0 | 2 | 0.0742 |
| 2.0 | 4 | 0.0742 |
| 3.0 | 6 | 0.0742 |
| 4.0 | 8 | 0.0742 |
| 5.0 | 10 | 0.0743 |
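The log shows 10 optimizer steps over 5 epochs, or 2 steps per epoch. As a back-of-the-envelope sketch, the snippet below lists the training-set sizes that would yield 2 steps per epoch at batch size 10. It assumes a single device and a single training dataset, together with the logged `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`. The helper `steps_per_epoch` is illustrative.

```python
import math

def steps_per_epoch(num_samples, batch_size, grad_accum=1):
    """Optimizer steps in one epoch when the last partial batch is kept."""
    batches = math.ceil(num_samples / batch_size)
    return math.ceil(batches / grad_accum)

batch_size = 10  # per_device_train_batch_size from the hyperparameters
candidates = [n for n in range(1, 101) if steps_per_epoch(n, batch_size) == 2]
print(candidates[0], candidates[-1])
```

Under those assumptions the training set would hold between 11 and 20 pairs, which may help explain why cosine_ndcg@10 barely moves across epochs.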
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}
Base model
nreimers/MiniLM-L6-H384-uncased