This is a sentence-transformers model finetuned from nomic-ai/nomic-embed-text-v2-moe on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: NomicBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
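The Pooling and Normalize modules listed above can be illustrated with a minimal sketch in plain Python. It assumes mean pooling averages only the token vectors whose attention-mask entry is 1, and that Normalize rescales the result to unit L2 length. The helper names `mean_pool` and `l2_normalize` and the toy 2-dimensional token vectors are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding sequence
    return [t / count for t in total]

def l2_normalize(vector):
    """Rescale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0  # avoid division by zero
    return [v / norm for v in vector]

# Toy input: two real tokens and one padding token (mask = 0).
tokens = [[1.0, 2.0], [3.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_vector = l2_normalize(pooled)
print(pooled, sentence_vector)
```

Because the output is unit length, the dot product of two such vectors equals their cosine similarity.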
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("tsss1/expressvpn_embeddingmodel")
# Run inference
sentences = [
    # The first entry is one long string, written as adjacent literals that Python concatenates.
    'Last updated: October 21, 2024\n\nThis guide is for users who are having issues streaming Max (formerly HBO Max) while connected to the VPN.\n\nTo comply with the Max Terms of Use and ExpressVPN Terms of Service, you should connect to a server location that matches the country where you are currently located.\n\nJump to…\n\n1. Change to a different VPN server location\n2. Sign out of the Max app, then sign in again\n3. Watch Max using your browser\n4. Contact ExpressVPN Support\n\n1. Change to a different VPN server location\n\nIf you are a U.S. user having issues streaming Max, try changing to these VPN server locations in the following order:\n\nUSA – San Francisco\nUSA – Washington DC\nUSA – New York\nUSA – Los Angeles – 1\n\nBelow are instructions for changing your VPN server location on:\n\nWindows\nMac\niOS\nAndroid\nAndroid TV\nApple TV\nLinux\nRouters\nIf you are streaming via the Max app, you should force-close it and reopen it each time you change location. Below are instructions for force-closing an app on:iOS: Swipe up from the bottom of the homescreen, keeping your finger pressed until app previews appear at left. Swipe to find the Max app preview, then swipe up to close the app.\n\nAndroid: On your Android device, open your multitasking interface. The way to do this varies depending on your device:\n\nIf your device has three icons at the bottom of the screen, tap either the three vertical lines icon or the square icon.\nIf your device features a single horizontal line at the bottom of the screen, swipe up from the bottom to the middle of the screen, hold for a second, then release.\n\nNext, swipe to find the Max app preview, then swipe to force-close the app. The direction you need to swipe will vary depending on your device.\n\nAndroid TV: Go to Settings, select Apps, and scroll to find the Max app. Select the app, then select Force Stop.\n\nFire TV/Fire Stick: Go to Settings, select Applications, select Manage Installed Applications. '
    'Scroll to find the Max app. Select the app, then select Force Stop.\n\nApple TV: Double-click the TV icon on your remote to see the apps currently running. Swipe to find the Max app preview, then swipe up to close the app.\n\nIf you are a non-U.S. user having issues streaming Max, proceed to the next step.\n\nNeed help?\xa0Contact the ExpressVPN Support Team for immediate assistance.\n\nBack to top\n\n2. Sign out of the Max app, then sign in again\n\nIf you are using the Max app, sign out of it, restart your device, and then sign back in.\n\nNeed help?\xa0Contact the ExpressVPN Support Team for immediate assistance.\n\nBack to top\n\n3. Watch Max on your browser\n\nTry streaming Max via your browser by going to https://www.max.com/login and signing in with your Max account details.\n\nIf you are having issues streaming Max from your browser while connected to the VPN:\n\nGet the ExpressVPN browser extension (available for Windows, Mac, and Linux). To use the browser extension, you must also have the ExpressVPN app installed on your computer.\nU.S. users should try connecting to these server locations in the following order:\nUSA – San Francisco\nUSA – Washington DC\nUSA – New York\nUSA – Los Angeles – 1\n\nNon-U.S. users should proceed to the next step.\n\nTry using a different browser. The ExpressVPN browser extension is available on Windows, Mac, and Linux, and it works with Chrome, Firefox, Vivaldi, Chromium, Brave, and Microsoft Edge. The ExpressVPN app must also be installed.\n\nNeed help?\xa0Contact the ExpressVPN Support Team for immediate assistance.\n\nBack to top\n\n4. Contact Support\n\nIf you are still unable to stream Max while connected to the VPN, contact the ExpressVPN Support Team.\n\nBack to top\n\nExpressVPN is optimized to work with Max so you can enjoy online privacy and security all the time, without the VPN interfering. It should never be used as a means of copyright circumvention, which is strictly against our Terms of Service. '
    'As we cannot see or control what you do while connected to our VPN, you are responsible at all times for complying with our terms, the Max Terms of Use, and any applicable laws. Compliance requires you to be located in the U.S. while streaming Max with ExpressVPN.\nWas this article helpful?\nYes No',
    'Troubleshooting steps for streaming Max',
    'I can help you with various questions and issues related to ExpressVPN. What do you need assistance with: \n\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
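Because the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64, its embeddings are meant to remain useful when only the leading components are kept. The sketch below shows that truncation step in plain Python on toy 8-dimensional vectors, which stand in for real model outputs. The helper names are hypothetical. Recent sentence-transformers releases also accept a `truncate_dim` argument when loading a model, so check the documentation of your installed version before relying on it.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components, then rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0  # avoid division by zero
    return [v / norm for v in head]

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))

# Toy 8-dimensional vectors standing in for 768-dimensional model outputs.
query = [0.6, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2]
document = [0.5, 0.6, 0.3, 0.4, 0.1, 0.3, 0.2, 0.1]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(document, dim)
    print(dim, round(dot(q, d), 4))
```

Re-normalizing after truncation matters because the leading components of a unit vector are, in general, no longer unit length on their own.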
Information retrieval metrics for the datasets dim_768, dim_512, dim_256, dim_128 and dim_64, evaluated with InformationRetrievalEvaluator:

| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.6 | 0.6 | 0.625 | 0.625 | 0.625 |
| cosine_accuracy@3 | 0.775 | 0.775 | 0.775 | 0.725 | 0.7 |
| cosine_accuracy@5 | 0.825 | 0.875 | 0.85 | 0.775 | 0.75 |
| cosine_accuracy@10 | 0.875 | 0.9 | 0.875 | 0.875 | 0.8 |
| cosine_precision@1 | 0.6 | 0.6 | 0.625 | 0.625 | 0.625 |
| cosine_precision@3 | 0.2667 | 0.2667 | 0.2667 | 0.25 | 0.2417 |
| cosine_precision@5 | 0.17 | 0.18 | 0.175 | 0.16 | 0.155 |
| cosine_precision@10 | 0.09 | 0.0925 | 0.09 | 0.09 | 0.0825 |
| cosine_recall@1 | 0.5875 | 0.5875 | 0.6125 | 0.6125 | 0.6125 |
| cosine_recall@3 | 0.775 | 0.775 | 0.775 | 0.725 | 0.7 |
| cosine_recall@5 | 0.825 | 0.875 | 0.85 | 0.775 | 0.75 |
| cosine_recall@10 | 0.875 | 0.9 | 0.875 | 0.875 | 0.8 |
| cosine_ndcg@10 | 0.7465 | 0.7552 | 0.751 | 0.7407 | 0.7078 |
| cosine_mrr@10 | 0.7042 | 0.7083 | 0.7108 | 0.6992 | 0.6793 |
| cosine_map@100 | 0.7103 | 0.7124 | 0.7161 | 0.7051 | 0.6907 |
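To make the metric names in the table concrete, here is a minimal sketch of how accuracy@k, precision@k, recall@k, MRR@k and NDCG@k can be computed for a single query with binary relevance. The function name `ir_metrics` and the document IDs are hypothetical, and this is not the evaluator's own code. The reported numbers are averages of such per-query values over all evaluation queries.

```python
import math

def ir_metrics(ranked_ids, relevant_ids, k):
    """Per-query retrieval metrics at cutoff k, assuming binary relevance."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    # Reciprocal rank of the first relevant document inside the top k.
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # Discounted cumulative gain, normalized by the best achievable ordering.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "accuracy": 1.0 if any(hits) else 0.0,  # at least one relevant hit
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "mrr": mrr,
        "ndcg": dcg / idcg if idcg else 0.0,
    }

# One query whose single relevant document is ranked second.
result = ir_metrics(["d3", "d1", "d7"], {"d1"}, k=3)
print(result)
```

With a single relevant document per query, precision@k cannot exceed 1/k, which is why the precision columns shrink as k grows. Accuracy@1 (0.6) being slightly above recall@1 (0.5875) for dim_768 suggests that some evaluation queries have more than one relevant document.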
Dataset columns: positive and anchor

| | positive | anchor |
|---|---|---|
| type | string | string |
| details | | |
Samples:

| positive | anchor |
|---|---|
| I'd like to discuss common issues that users face when using ExpressVPN. | I'd be happy to help with any questions or concerns you have about ExpressVPN. What would you like to know or discuss? |
| I'd like to provide information about ExpressVPN, but I think it would be more helpful to get some assistance from you. | I can help you with any question you have about ExpressVPN. What is it that you need help with? |
| Last updated: January 11, 2023 | ExpressVPN iOS free trial or subscription expiring |

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
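The configuration above can be read as "apply MultipleNegativesRankingLoss to the embeddings truncated at each listed dimension and add up the weighted results". The sketch below illustrates that idea in plain Python under stated assumptions. Toy dimensions replace 768 to 64. The similarity scale of 20 is assumed to be the library default. `n_dims_per_step: -1` is taken to mean that every dimension is used at every step. The function names are hypothetical, and this is not the library's implementation.

```python
import math

TOY_DIMS = [8, 4, 2]      # stand-ins for [768, 512, 256, 128, 64]
TOY_WEIGHTS = [1, 1, 1]   # mirrors the all-ones matryoshka_weights
SCALE = 20.0              # assumption: default scale of the ranking loss

def cosine(a, b):
    """Cosine similarity with a guard for zero vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def ranking_loss(anchors, positives, scale=SCALE):
    """Cross-entropy in which positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims=TOY_DIMS, weights=TOY_WEIGHTS):
    """Weighted sum of the ranking loss over each truncated dimension."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        loss += weight * ranking_loss(truncated_anchors, truncated_positives)
    return loss

anchors = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
positives = [[0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
             [0.1, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

aligned = matryoshka_loss(anchors, positives)
mismatched = matryoshka_loss(anchors, positives[::-1])
print(aligned, mismatched)
```

Correctly paired batches give a loss near zero, while swapping the positives makes it large. Summing over the truncated views is what pushes the leading components to carry most of the signal.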
Non-default hyperparameters:

- eval_strategy: epoch
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- gradient_accumulation_steps: 4
- learning_rate: 2e-05
- num_train_epochs: 7
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: False
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 7
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: False
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

Training logs:

| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|---|---|---|---|---|---|---|---|
| 0.2235 | 10 | 2.9921 | - | - | - | - | - |
| 0.4469 | 20 | 0.9824 | - | - | - | - | - |
| 0.6704 | 30 | 0.6762 | - | - | - | - | - |
| 0.8939 | 40 | 0.0133 | - | - | - | - | - |
| 0.9832 | 44 | - | 0.7669 | 0.7701 | - | - | - |
| 0.2235 | 10 | 0.0179 | - | - | - | - | - |
| 0.4469 | 20 | 0.2714 | - | - | - | - | - |
| 0.6704 | 30 | 0.0104 | - | - | - | - | - |
| 0.8939 | 40 | 0.0015 | - | - | - | - | - |
| 0.9832 | 44 | - | 0.7442 | 0.7594 | 0.7465 | 0.7149 | 0.7046 |
| 1.1341 | 50 | 0.2207 | - | - | - | - | - |
| 1.3575 | 60 | 0.48 | - | - | - | - | - |
| 1.5810 | 70 | 0.003 | - | - | - | - | - |
| 1.8045 | 80 | 0.2985 | - | - | - | - | - |
| 1.9832 | 88 | - | 0.7751 | 0.774 | 0.7821 | 0.7746 | 0.7365 |
| 2.0447 | 90 | 0.0168 | - | - | - | - | - |
| 2.2682 | 100 | 0.0698 | - | - | - | - | - |
| 2.4916 | 110 | 0.0054 | - | - | - | - | - |
| 2.7151 | 120 | 0.0112 | - | - | - | - | - |
| 2.9385 | 130 | 0.0031 | - | - | - | - | - |
| 2.9832 | 132 | - | 0.7569 | 0.7537 | 0.7565 | 0.7588 | 0.7251 |
| 3.1788 | 140 | 0.1794 | - | - | - | - | - |
| 3.4022 | 150 | 0.3266 | - | - | - | - | - |
| 3.6257 | 160 | 0.0006 | - | - | - | - | - |
| 3.8492 | 170 | 0.0003 | - | - | - | - | - |
| 3.9832 | 176 | - | 0.7491 | 0.7613 | 0.7526 | 0.7513 | 0.7206 |
| 4.0894 | 180 | 0.2622 | - | - | - | - | - |
| 4.3128 | 190 | 0.0004 | - | - | - | - | - |
| 4.5363 | 200 | 0.0392 | - | - | - | - | - |
| 4.7598 | 210 | 0.3312 | - | - | - | - | - |
| 4.9832 | 220 | 0.0021 | 0.7548 | 0.7527 | 0.7466 | 0.7568 | 0.7101 |
| 5.2235 | 230 | 0.7593 | - | - | - | - | - |
| 5.4469 | 240 | 0.0004 | - | - | - | - | - |
| 5.6704 | 250 | 0.0003 | - | - | - | - | - |
| 5.8939 | 260 | 0.0154 | - | - | - | - | - |
| 5.9832 | 264 | - | 0.7498 | 0.7545 | 0.7510 | 0.7407 | 0.7147 |
| 6.1341 | 270 | 0.0162 | - | - | - | - | - |
| 6.3575 | 280 | 0.447 | - | - | - | - | - |
| 6.5810 | 290 | 0.001 | - | - | - | - | - |
| 6.8045 | 300 | 0.1628 | - | - | - | - | - |
| 6.9832 | 308 | - | 0.7465 | 0.7552 | 0.7510 | 0.7407 | 0.7078 |
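The hyperparameters and the step counts in the log can be tied together with a little arithmetic. The sketch below assumes a single training device, which the card does not state. It also assumes warmup steps are the ceiling of `warmup_ratio` times the total number of optimizer steps, and it uses the usual half-cosine decay after a linear warmup. It is an illustration of the schedule's shape, not a reproduction of the trainer's internals.

```python
import math

TOTAL_STEPS = 308      # final optimizer step in the training log
WARMUP_RATIO = 0.1     # warmup_ratio
PEAK_LR = 2e-05        # learning_rate
PER_DEVICE_BATCH = 2   # per_device_train_batch_size
GRAD_ACCUM = 4         # gradient_accumulation_steps

# Samples contributing to one optimizer step (assumption: one device).
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def lr_at(step):
    """Linear warmup to PEAK_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print("effective batch size:", effective_batch)
print("warmup steps:", warmup_steps)
for step in (0, 10, warmup_steps, 154, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions each optimizer step sees 8 training pairs, so every anchor is contrasted against a small set of in-batch negatives per forward pass. The learning rate peaks during the first epoch and decays to zero by step 308.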
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
FacebookAI/xlm-roberta-base