How to use ernestobs7/caregiver-ft-v1 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("ernestobs7/caregiver-ft-v1")
sentences = [
"What are some common attitudes and beliefs that can create personal barriers to self-care for family caregivers?",
"Support for nutrition, breathing, and feeding\nPeople with ALS may have trouble chewing and swallowing their food, and getting the nutrients they need. Nutritionists and registered dieticians can help plan small, nutritious meals throughout the day and identify foods to avoid. When the person can no longer eat with help, a feeding tube can reduce the person’s risk of choking and pneumonia.",
"Amyotrophic Lateral Sclerosis (ALS) | National Institute of Neurological Disorders and Stroke\n\n\n\n\n\n\n\n\n Skip to main content\n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\nAn official website of the United States government\n\n Here’s how you know\n\n\n\n\n\n\n\n\n\n\n\nOfficial websites use .gov \n A\n .gov\n website belongs to an official government organization in the United States.\n \n\n\n\n\n\n\n\n\nSecure .gov websites use HTTPS\n\n A lock\n (\n\n)\n or\n https://\n means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.\n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nSearch\n\n\nMenu\n\n\n\n\n\n\n\n\n\nSearch NINDS\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nSearch NINDS\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nMain navigation",
"Identifying Personal Barriers \nMany times, attitudes and beliefs form personal barriers that stand in the \nway of caring for yourself. Not taking care of yourself may be a lifelong \npattern, with taking care of others an easier option. However, as a family \ncaregiver you must ask yourself, \"What good will I be to the person I care \nfor if I become ill? If I die?\" Breaking old patterns and overcoming \nobstacles is not an easy proposition, but it can be done – regardless of \nyour age or situation. The first task in removing personal barriers to self-\ncare is to identify what is in your way. For example, \n• Do you feel you have to prove that you are worthy of the care recipient's \naffection? \n• Do you think you are being selfish if you put your needs first? \n• Is it frightening to think of your own needs? What is the fear about?"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from Snowflake/snowflake-arctic-embed-l. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
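
The last two modules above (CLS-token pooling followed by `Normalize()`) can be illustrated with a small NumPy sketch. This is not the library's implementation: the random array is a stand-in for the `BertModel` token outputs, and the function name `cls_pool_and_normalize` is hypothetical. It shows why, for unit-length outputs, the dot product and cosine similarity coincide.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token embeddings to (batch, hidden) unit vectors."""
    # CLS pooling: keep only the first token's vector for each sequence.
    cls = token_embeddings[:, 0, :]
    # L2 normalization, guarding against division by zero.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)

rng = np.random.default_rng(0)
# Stand-in for transformer output: 2 sequences, 5 tokens, 1024 hidden dims.
fake_tokens = rng.normal(size=(2, 5, 1024))
emb = cls_pool_and_normalize(fake_tokens)

# For unit vectors, the dot product and the cosine similarity are the same number.
dot = float(emb[0] @ emb[1])
cos = float(emb[0] @ emb[1] / (np.linalg.norm(emb[0]) * np.linalg.norm(emb[1])))
print(emb.shape, abs(dot - cos) < 1e-9)
```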
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ernestobs7/caregiver-ft-v1")
# Run inference
sentences = [
'Does having a risk factor guarantee that a person will develop a disorder?',
"A risk factor is a condition or behavior that occurs more frequently in those who have a disease, or who are at greater risk of getting a disease, than in those who don't have the risk factor. Having a risk factor doesn't mean a person will develop a disorder, and not having a risk factor doesn't mean you won’t. Risk factors for ALS include:",
'possible decline in quality of life. \n \nBut despite these risks, family caregivers of any age are less likely than \nnon-caregivers to practice preventive healthcare and self-care behavior. \nRegardless of age, sex, and race and ethnicity, caregivers report problems \nattending to their own health and well-being while managing caregiving \nresponsibilities. They report: \n• sleep deprivation \n• poor eating habits \n• failure to exercise \n• failure to stay in bed when ill \n• postponement of or failure to make medical appointments .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
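
Because this card lists MatryoshkaLoss training at 768, 512, 256, 128 and 64 dimensions, the leading dimensions of an embedding should remain usable on their own. The following is a minimal sketch of truncating and re-normalizing embeddings with NumPy; the random array is a stand-in for the output of `model.encode(sentences)`, and `truncate_and_renormalize` is a hypothetical helper, not part of sentence-transformers.

```python
import numpy as np

def truncate_and_renormalize(embeddings, dim):
    """Keep the first `dim` dimensions and rescale each row to unit length."""
    truncated = np.asarray(embeddings)[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

rng = np.random.default_rng(42)
full = rng.normal(size=(3, 1024))        # stand-in for model.encode(sentences)
small = truncate_and_renormalize(full, 256)

# With unit-length rows, the matrix product gives cosine similarities.
similarities = small @ small.T
print(small.shape, similarities.shape)
```

Recent sentence-transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`; it is worth checking whether the installed version supports it before relying on it.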
InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.9167 |
| cosine_accuracy@3 | 1.0 |
| cosine_accuracy@5 | 1.0 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.9167 |
| cosine_precision@3 | 0.3333 |
| cosine_precision@5 | 0.2 |
| cosine_precision@10 | 0.1 |
| cosine_recall@1 | 0.9167 |
| cosine_recall@3 | 1.0 |
| cosine_recall@5 | 1.0 |
| cosine_recall@10 | 1.0 |
| cosine_ndcg@10 | 0.9638 |
| cosine_mrr@10 | 0.9514 |
| cosine_map@100 | 0.9514 |
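
The relationships between the metrics above can be checked with a short sketch. It assumes every query has exactly one relevant passage; the card does not state this, but it would explain why recall@k equals accuracy@k and precision@k equals accuracy@k / k. The rank list is hypothetical: 24 queries with the relevant passage at rank 1 for 22 of them, rank 2 for one and rank 3 for one is one distribution consistent with the reported values, not data taken from the evaluation.

```python
import math

def ir_metrics(ranks, ks=(1, 3, 5, 10)):
    """Retrieval metrics when each query has a single relevant document.

    `ranks` holds the 1-based rank of that document for each query.
    """
    n = len(ranks)
    out = {}
    for k in ks:
        hits = sum(1 for r in ranks if r <= k)
        out[f"accuracy@{k}"] = hits / n
        out[f"precision@{k}"] = hits / (n * k)   # at most one hit among k results
        out[f"recall@{k}"] = hits / n            # one relevant document per query
    out["mrr@10"] = sum(1 / r for r in ranks if r <= 10) / n
    # With one relevant document the ideal DCG is 1, so NDCG = 1 / log2(rank + 1).
    out["ndcg@10"] = sum(1 / math.log2(r + 1) for r in ranks if r <= 10) / n
    return out

# Hypothetical rank distribution (an assumption, see the note above).
ranks = [1] * 22 + [2, 3]
metrics = ir_metrics(ranks)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```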
sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | | |

| sentence_0 | sentence_1 |
|---|---|
| What are some common symptoms experienced by individuals with ALS related to muscle function? | Muscle twitches in the arm, leg, shoulder, or tongue |
| How does ALS affect a person's ability to chew and swallow food? | Muscle twitches in the arm, leg, shoulder, or tongue |
| What percentage of ALS cases are classified as familial? | About 10% of all ALS cases are familial (also called inherited or genetic). Changes in more than a dozen genes have been found to cause familial ALS. |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
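
To make the configuration above concrete, here is a rough NumPy sketch of how a Matryoshka-style objective could wrap an in-batch-negatives ranking loss: the inner loss is evaluated on each truncated prefix of the embeddings and the results are combined using the listed weights. This is an illustration under assumptions, not the sentence-transformers implementation. The similarity scale of 20 is assumed (the card does not state it), the random arrays stand in for encoded (sentence_0, sentence_1) pairs, and the function names are hypothetical.

```python
import numpy as np

MATRYOSHKA_DIMS = [768, 512, 256, 128, 64]   # from the parameters above
MATRYOSHKA_WEIGHTS = [1, 1, 1, 1, 1]         # from the parameters above

def l2_normalize(x):
    return x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: row i should score highest against column i.

    `scale` is an assumed value, not taken from this card.
    """
    scores = scale * l2_normalize(anchors) @ l2_normalize(positives).T
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives):
    """Weighted sum of the inner loss over each truncated embedding prefix."""
    total = 0.0
    for dim, weight in zip(MATRYOSHKA_DIMS, MATRYOSHKA_WEIGHTS):
        total += weight * mnr_loss(anchors[:, :dim], positives[:, :dim])
    return total

rng = np.random.default_rng(0)
anchors = rng.normal(size=(10, 1024))                     # stand-in question embeddings
positives = anchors + 0.1 * rng.normal(size=(10, 1024))   # well-matched passages
random_pos = rng.normal(size=(10, 1024))                  # unrelated passages

loss_matched = matryoshka_loss(anchors, positives)
loss_random = matryoshka_loss(anchors, random_pos)
print(loss_matched < loss_random)
```

Well-matched pairs give a much lower loss than unrelated ones at every truncation size, which is the property that lets shorter prefixes of the embedding remain useful for retrieval.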
Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 10
- `per_device_eval_batch_size`: 10
- `num_train_epochs`: 10
- `multi_dataset_batch_sampler`: round_robin

All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 10
- `per_device_eval_batch_size`: 10
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 10
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

| Epoch | Step | cosine_ndcg@10 |
|---|---|---|
| 1.0 | 10 | 0.9382 |
| 2.0 | 20 | 0.9539 |
| 3.0 | 30 | 0.9484 |
| 4.0 | 40 | 0.9484 |
| 5.0 | 50 | 0.9638 |
| 6.0 | 60 | 0.9638 |
| 7.0 | 70 | 0.9638 |
| 8.0 | 80 | 0.9638 |
| 9.0 | 90 | 0.9638 |
| 10.0 | 100 | 0.9638 |
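
The training log and the listed hyperparameters allow a small consistency check. The sketch below assumes single-device training (not stated in the card) together with the listed `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`; under those assumptions, 10 steps per epoch at batch size 10 implies a training set of roughly 91 to 100 pairs. It also evaluates a linear decay schedule with no warmup, matching `lr_scheduler_type: linear` and `warmup_steps: 0`. The helper names are hypothetical.

```python
import math

BATCH_SIZE = 10        # per_device_train_batch_size
STEPS_PER_EPOCH = 10   # step 10 is logged at epoch 1.0
TOTAL_STEPS = 100      # step 100 is logged at epoch 10.0
BASE_LR = 5e-5         # learning_rate

def steps_per_epoch(num_examples, batch_size=BATCH_SIZE):
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)

# Training-set sizes compatible with the logged steps (single device assumed).
compatible_sizes = [n for n in range(1, 201) if steps_per_epoch(n) == STEPS_PER_EPOCH]
print(compatible_sizes[0], compatible_sizes[-1])

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linear decay from base_lr to zero with no warmup."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(linear_lr(0), linear_lr(50), linear_lr(100))
```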
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
Snowflake/snowflake-arctic-embed-l