How to use justOneMoreTestCase/insurance-rag-embeddings with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("justOneMoreTestCase/insurance-rag-embeddings", trust_remote_code=True)
sentences = [
"How can I contact my LIC agent or nearest branch according to the provided instructions?",
"Contact your LIC agent or nearest branch or\nvisit our website\nor\nwww.licindia.in\nSMS\nto\n, (e.g. Mumbai.’)\n‘YOUR CITY NAME’\n566773",
"LIC's JEEVAN AROGYA (UIN: 512N266V02)\n(A Non-linked, Non-Parcipang,\nIndividual, Health Insurance Plan)\nLIC's Jeevan Arogya is a unique non-parcipang non-linked plan which provides\nhealth insurance cover against certain specified health risks and provides you with\nmely support in case of medical emergencies and helps you and your family remain\nfinanciallyindependentindifficultmes.\nHealth has been a major concern on everybody's mind, including yours. In these days\nofskyrockengmedicalexpenses,whenafamilymemberisill,itisatraumacmefor\nthe rest of the family. As a caring person, you do not want to let any unfortunate\nincident to affect your plans for you and your family. So why let any medical\nemergenciessha eryourpeaceofmind.",
"Contact your LIC agent or nearest branch or\nvisit our website\nor\nwww.licindia.in\nSMS\nto\n, (e.g. Mumbai.’)\n‘YOUR CITY NAME’\n566773"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model fine-tuned from nomic-ai/nomic-embed-text-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'NomicBertModel'})
  (1): Pooling({'embedding_dimension': 768, 'pooling_mode': 'mean', 'include_prompt': True})
)
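The Pooling module above uses `pooling_mode: mean`, which turns per-token embeddings into one vector per input by averaging over the real (non-padding) tokens. The following is a minimal NumPy sketch of that idea, not the library's own implementation; the function name `mean_pool` and the toy shapes are assumptions made for illustration.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over real (non-padding) tokens.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts

# Toy input: 2 sequences, 3 token positions, 4 dims (the real model uses 768).
tokens = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])  # first sequence has one padding token
pooled = mean_pool(tokens, mask)
print(pooled.shape)  # (2, 4)
```

Masking before averaging matters because padded positions would otherwise pull short inputs toward whatever the padding embeddings happen to be.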
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("justOneMoreTestCase/insurance-rag-embeddings", trust_remote_code=True)
# Run inference
sentences = [
'How is the Initial Daily Benefit (the Applicable Daily Benefit for the first policy year) determined and stated in the policy schedule?',
'provided any such part\nexceeds a connuous period of 4 hours (aer having\nstay\ncompleted the 24 hours as above) in a non-ICU ward/room of a hospital, an\namount equal to the Applicable Daily Benefit (ADB) available under the policy\nduring that policy year shall be payable subject to benefit limits and condions\nmenonedinPara11A)andexclusionsmenonedinPara15below.\nDuring the first\nof cover commencement in respect of each insured, the\nyear\nApplicableDailyBenefitshallbetheInialDailyBenefitamountchosenbyyouand\nmenonedinthepolicySchedule.\nTheamountof DBforeachpolicyyear,aerthefirstpolicyyear,shallconsistof2parts:\nA\n\nAn arithmec addion of an amount equal to 5% (five percent) of the Inial Daily',
'Periodwithoutanymaximumlimit.\nFor members\nsubsequently under the policy, the benefit in the first year\nincluded\nshall be equal to Inial Daily Benefit amount and thereaer the Applicable Daily\nBenefitshallincreaseasabove.\nIfanyofthememberinsuredisrequiredtostayinanIntensiveCareUnitofahospital,\nt\nsubject\nbenefit limits and\nwo mes the\nDaily\nwill be payable\nto\nApplicable\nBenefit\ncondionsmenonedinPara11A)andexclusionsmenonedinPara15below.\nDuring one period of 24 connuous hours (i.e. one day) of Hospitalisaon (aer\nhaving completed the 24 hours as above), if the said Hospitalisaon included stay\ninanIntensiveCareUnitaswellasinanyotherin-paent(non-IntensiveCareUnit)',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6203, 0.6283],
#         [0.6203, 1.0000, 0.8679],
#         [0.6283, 0.8679, 1.0000]])
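Because this model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64, the leading components of each embedding should remain usable on their own. The sketch below shows one way to truncate embeddings and recompute cosine similarities with NumPy. The random array is only a stand-in for the output of `model.encode(sentences)`, and the helper name `truncate_and_normalize` is hypothetical.

```python
import numpy as np

def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and rescale each row to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Stand-in for model.encode(sentences); real embeddings would come from the model.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 768))

for dim in (768, 256, 64):
    small = truncate_and_normalize(embeddings, dim)
    sims = small @ small.T  # cosine similarity, since rows are unit length
    print(dim, small.shape, sims.shape)
```

Smaller dimensions trade some retrieval quality for less storage and faster search. The sentence-transformers library also accepts a `truncate_dim` argument when loading a model, which may be the more convenient route in practice.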
Evaluated with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.5455 |
| cosine_accuracy@3 | 0.7727 |
| cosine_accuracy@5 | 0.9091 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.5455 |
| cosine_precision@3 | 0.2576 |
| cosine_precision@5 | 0.1818 |
| cosine_precision@10 | 0.1 |
| cosine_recall@1 | 0.5455 |
| cosine_recall@3 | 0.7727 |
| cosine_recall@5 | 0.9091 |
| cosine_recall@10 | 1.0 |
| cosine_ndcg@10 | 0.7731 |
| cosine_mrr@10 | 0.7011 |
| cosine_map@100 | 0.7011 |
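To make the metric names above concrete, here is a small pure-Python sketch of accuracy@k, precision@k, recall@k and MRR@k on toy data. It is an illustration of the usual definitions, not the evaluator's code; the query and document IDs are invented.

```python
def accuracy_at_k(ranked, relevant, k):
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(1 for q, docs in ranked.items() if set(docs[:k]) & relevant[q])
    return hits / len(ranked)

def precision_at_k(ranked, relevant, k):
    """Mean share of the top k results that are relevant."""
    return sum(len(set(docs[:k]) & relevant[q]) / k for q, docs in ranked.items()) / len(ranked)

def recall_at_k(ranked, relevant, k):
    """Mean share of each query's relevant documents found in the top k."""
    return sum(len(set(docs[:k]) & relevant[q]) / len(relevant[q])
               for q, docs in ranked.items()) / len(ranked)

def mrr_at_k(ranked, relevant, k):
    """Mean reciprocal rank of the first relevant document within the top k."""
    total = 0.0
    for q, docs in ranked.items():
        for rank, doc in enumerate(docs[:k], start=1):
            if doc in relevant[q]:
                total += 1.0 / rank
                break
    return total / len(ranked)

# Toy data (hypothetical IDs): ranked results per query, one relevant document each.
ranked = {"q1": ["d1", "d2", "d3"], "q2": ["d3", "d2", "d1"], "q3": ["d1", "d3", "d2"]}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d2"}}

for k in (1, 3):
    print(k, accuracy_at_k(ranked, relevant, k), precision_at_k(ranked, relevant, k),
          recall_at_k(ranked, relevant, k), mrr_at_k(ranked, relevant, k))
```

When each query has exactly one relevant document, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. The reported values are consistent with that pattern (for example 0.7727 / 3 ≈ 0.2576), though the table alone does not confirm how the evaluation set was built.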
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |

Samples:

| sentence_0 | sentence_1 |
|---|---|
| Which specific benefits (e.g., Hospital Cash Benefit, Major Surgical Benefit, Day Care Procedure Benefit, etc.) are available to the insured if they are hospitalized for a continuous period of 24 hours or more? | 65 years (last birthday) |
| What are the four daily Hospital Cash Benefit options available when choosing the initial Daily Benefit for the LIC Jeevan Arogya policy? | emergenciessha eryourpeaceofmind. |
| If a policyholder selects a daily Hospital Cash Benefit of 3000 per day, what will be the Initial Major Surgical Benefit sum assured? | |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
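The configuration above wraps MultipleNegativesRankingLoss so that it is evaluated at each listed dimension and the results are combined using the given weights. The NumPy sketch below illustrates that composition under stated assumptions: the in-batch ranking loss is written as cross-entropy over cosine similarities scaled by 20.0 (the value assumed here for the scale), and the inputs are random stand-ins rather than real embeddings. It is not the library's implementation.

```python
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives ranking loss: positives[i] is the match for anchors[i],
    and every other row of `positives` serves as a negative."""
    logits = scale * normalize(anchors) @ normalize(positives).T  # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64), weights=(1, 1, 1, 1, 1)) -> float:
    """Weighted sum of the base loss computed on the first d components for each d."""
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d]) for d, w in zip(dims, weights))

# Random stand-ins: batch of 10 pairs, matching per_device_train_batch_size.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(10, 768))
positives = anchors + 0.1 * rng.normal(size=(10, 768))  # noisy copies act as positives
print(matryoshka_loss(anchors, positives))
```

Training every prefix with the same objective is what lets shorter, truncated embeddings remain useful for retrieval.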
Non-default hyperparameters:

- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- num_train_epochs: 5
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- do_predict: False
- prediction_loss_only: True
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}

Training logs:

| Epoch | Step | cosine_ndcg@10 |
|---|---|---|
| 1.0 | 2 | 0.7731 |
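The hyperparameters specify `learning_rate: 5e-05`, `lr_scheduler_type: linear` and `warmup_steps: 0`. The sketch below shows how a linear schedule of that kind decays the learning rate over training. The total of 10 optimizer steps is an assumption inferred from the log (step 2 at epoch 1.0, with 5 epochs configured), and `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step: int, base_lr: float = 5e-05,
              warmup_steps: int = 0, total_steps: int = 10) -> float:
    """Linear warmup (if any) followed by linear decay to zero.

    total_steps=10 is an assumption: 2 steps per epoch times 5 epochs.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Print the learning rate at the end of each (assumed) epoch.
for step in range(0, 11, 2):
    print(step, linear_lr(step))
```

With no warmup, training starts at the full learning rate and reaches zero at the final step, so later epochs make progressively smaller updates.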
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}
Base model
nomic-ai/nomic-embed-text-v1.5