This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-m. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
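The three modules above form a pipeline: the transformer produces one 768-dimensional vector per token, the pooling layer keeps only the first ([CLS]) token's vector because `pooling_mode_cls_token` is the only mode set to True, and `Normalize()` rescales the result to unit length. Below is a minimal NumPy sketch of the last two steps, using random numbers in place of real transformer output. The function name `cls_pool_and_normalize` is made up for illustration and is not part of the library.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Illustrative only: CLS pooling followed by L2 normalization.

    token_embeddings has shape (batch, seq_len, hidden);
    the result has shape (batch, hidden).
    """
    # CLS pooling: keep the vector of the first token in every sequence.
    cls_vectors = token_embeddings[:, 0, :]
    # L2 normalization; the clip guards against division by zero.
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)

# Random stand-in for transformer output: 3 sentences, 512 tokens, 768 dims.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(3, 512, 768))
sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, the cosine similarity between two embeddings equals their dot product.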
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("bsmith3715/legal-ft-demo_final")
# Run inference
sentences = [
'What muscle groups are primarily engaged during the lunge exercise described in the context?',
"hips and we're gonna lunge it down the\nweight is going to feel a little bit\nlight we're working more stabilizers to\nstart here\nstabilizers through the hips ankles\nknees\nall the things okay lunge it down I like\nto put my foot right up against that\nedge\nto help me have a nice grip toes are off\nof the carriage\nand then coming up to squeeze up on a\nstraight standing leg squeeze up through\nthe glute\ndown\nand\nsqueeze and lift good\nthe slower you move here the more work\nyou're going to feel through those quads\nand glutes as well\nslow back\nslow up\nI know sometimes we feel like we want to\nget that heart rate going\nbut sometimes we need this slow movement\nis going to give us even more benefits\nto support us for those fast movements\nlater",
"one\nfind that lengthened position you're\ngoing to lift up slide those shoulder\nblades down the back lift up towards the\nsky big inhale\nand exhale back and away\nleft hand to Center\nturn back towards me and lift I\napologize if you're not on the same side\nwhen you're facing me\nother arm to Center bottom arm all right\nwe're gonna flip around\nto do the same thing on the other side\nso lifting up tall\nplace that hand in front of your\nshoulder we're going up and over long\nspine and then lift shoulders down again\nup and over\nand then you use that oblique to lift up\nexhale lift\nand lengthen\nopen the spine Flex\noblique to come up\nand three\ntwo\nand one\ngood full mermaid now up and over turn\nto face the ground separate those arms",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
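For semantic search, the first sentence in the example acts as a query and the remaining ones as candidate passages. The sketch below ranks passages by cosine similarity, given an array shaped like the one `model.encode` returns. The helper `rank_passages` and the three-dimensional toy vectors are illustrative assumptions, not library API or real model output.

```python
import numpy as np

def rank_passages(query_embedding, passage_embeddings):
    """Return (passage_index, cosine_score) pairs, best match first."""
    q = query_embedding / np.linalg.norm(query_embedding)
    p = passage_embeddings / np.linalg.norm(passage_embeddings, axis=1, keepdims=True)
    scores = p @ q                      # cosine similarity per passage
    order = np.argsort(-scores)         # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins for embeddings: row 0 is the query, rows 1-2 are passages.
toy_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])
ranking = rank_passages(toy_embeddings[0], toy_embeddings[1:])
for index, score in ranking:
    print(index, round(score, 4))
```

With the real model, `embeddings[0]` and `embeddings[1:]` from the snippet above would take the place of the toy vectors.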
Metrics from InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.45 |
| cosine_accuracy@3 | 0.69 |
| cosine_accuracy@5 | 0.73 |
| cosine_accuracy@10 | 0.86 |
| cosine_precision@1 | 0.45 |
| cosine_precision@3 | 0.23 |
| cosine_precision@5 | 0.146 |
| cosine_precision@10 | 0.086 |
| cosine_recall@1 | 0.45 |
| cosine_recall@3 | 0.69 |
| cosine_recall@5 | 0.73 |
| cosine_recall@10 | 0.86 |
| cosine_ndcg@10 | 0.6469 |
| cosine_mrr@10 | 0.5798 |
| cosine_map@100 | 0.5877 |
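In the table above, each precision value equals the matching accuracy divided by k (0.69 / 3 = 0.23, 0.73 / 5 = 0.146, 0.86 / 10 = 0.086), and recall@k equals accuracy@k. That pattern is what one would expect if every evaluation query has exactly one relevant document; this is an inference from the numbers, not something the card states. Under that assumption, the metrics could be computed from the rank of the relevant document as sketched below. The function `single_relevant_metrics` and the example ranks are hypothetical.

```python
import math

def single_relevant_metrics(ranks, k):
    """Metrics at cutoff k when each query has exactly one relevant document.

    ranks holds the 1-based rank of the relevant document per query,
    or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    # One relevant document per query: recall equals accuracy, and
    # precision is accuracy spread over the k returned slots.
    mrr = sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n
    # The ideal DCG is 1 (relevant document at rank 1), so NDCG = DCG.
    ndcg = sum(1.0 / math.log2(r + 1) for r, hit in zip(ranks, hits) if hit) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,
        "recall": accuracy,
        "mrr": mrr,
        "ndcg": ndcg,
    }

# Hypothetical ranks for four queries; the last query missed entirely.
metrics = single_relevant_metrics([1, 2, 4, None], k=3)
print(metrics)
```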
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| sentence_0 | sentence_1 |
|---|---|
| What type of spring is the instructor using for the workout? | hi guys thanks for joining me today we |
| What should participants do if the red spring is too heavy for them? | hi guys thanks for joining me today we |
| What is the initial position described for starting the workout on the reformers? | are going to start first by straddling |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
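The parameters above describe a weighted sum: the inner loss is evaluated on the embeddings truncated to each of the five sizes, and the results are added with weight 1 each. Below is a rough NumPy sketch of that idea, not the library's implementation. The helper names are made up, the inputs are random stand-ins for embeddings, and the similarity scale of 20 is an assumption.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives ranking loss: row i of positives is the only
    correct match for row i of anchors; the other rows act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                           # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # cross-entropy on the diagonal

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss over truncated embedding prefixes."""
    return sum(
        w * mnr_loss(anchors[:, :d], positives[:, :d])
        for d, w in zip(dims, weights)
    )

# Random stand-ins: a batch of 10 anchors and slightly perturbed positives.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(10, 768))
positives = anchors + 0.1 * rng.normal(size=(10, 768))
dims = [768, 512, 256, 128, 64]
weights = [1, 1, 1, 1, 1]

total = matryoshka_loss(anchors, positives, dims, weights)
mismatched = matryoshka_loss(anchors, positives[::-1], dims, weights)
print(total, mismatched)
```

Training every prefix in this way is what makes it reasonable to keep only the first 64, 128, 256, or 512 dimensions of an embedding at inference time and re-normalize. Recent versions of Sentence Transformers accept a `truncate_dim` argument on `SentenceTransformer` for this purpose.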
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- num_train_epochs: 10
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | cosine_ndcg@10 |
|---|---|---|
| 1.0 | 20 | 0.6130 |
| 2.0 | 40 | 0.6454 |
| 2.5 | 50 | 0.6445 |
| 3.0 | 60 | 0.6498 |
| 4.0 | 80 | 0.6507 |
| 5.0 | 100 | 0.6463 |
| 6.0 | 120 | 0.6433 |
| 7.0 | 140 | 0.6461 |
| 7.5 | 150 | 0.6409 |
| 8.0 | 160 | 0.6417 |
| 9.0 | 180 | 0.6425 |
| 10.0 | 200 | 0.6469 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}