This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
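The Pooling and Normalize modules listed above can be sketched in plain Python. This is only a minimal illustration, not the library's implementation: the function names `mean_pool` and `l2_normalize`, the attention mask, and the toy 4-dimensional token vectors are all made up for the example (the real model works with 768 dimensions).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in summed]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input: three tokens, the last one is padding (hypothetical values).
tokens = [[1.0, 2.0, 0.0, 1.0], [3.0, 0.0, 0.0, 1.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the cosine similarity between two such embeddings reduces to their dot product.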
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'ma-a 12 GÍN i-dí-sú-in KI tù-ra-a i-lá-qé 5',
    'Indeed, Iddin-Suen will receive 12 shekels from Turaya.',
    'ina pa-ni-szu-nu i-za#-mu-ru ina _ugu_ szA ni-iq-bu-ni ma-a [x x x x x x] ki',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.6474, -0.1447],
#         [ 0.6474,  1.0000, -0.0218],
#         [-0.1447, -0.0218,  1.0000]])
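Since this model was trained with MatryoshkaLoss over the dimensions 768, 512, 384, 256, and 128, and the evaluation in this card uses `truncate_dim: 384`, shorter embeddings can in principle be obtained by keeping a prefix of each vector and renormalizing it. The following is a rough sketch of that idea on plain Python lists. The helper names `truncate_and_normalize` and `cosine_similarity` and the toy 8-dimensional vectors are assumptions for illustration, not part of the model or the library.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    prefix = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix] if norm > 0 else prefix

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 8-dimensional stand-ins for 768-dimensional embeddings (made-up values).
query = [0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.0, 0.0]
doc = [0.4, 0.5, 0.2, 0.3, 0.0, 0.1, 0.1, 0.0]

full_score = cosine_similarity(query, doc)

# Keep only the first 4 components, analogous to truncating 768 -> 384.
short_query = truncate_and_normalize(query, 4)
short_doc = truncate_and_normalize(doc, 4)
short_score = cosine_similarity(short_query, short_doc)

print(len(short_query), round(full_score, 4), round(short_score, 4))
```

Truncated vectors trade some retrieval quality for lower storage and faster search; the evaluation table in this card reports results at 384 dimensions.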
Dataset: akkadian-ir
Evaluated with InformationRetrievalEvaluator with these parameters:
{
  "truncate_dim": 384
}
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.5072 |
| cosine_accuracy@3 | 0.7332 |
| cosine_accuracy@5 | 0.7707 |
| cosine_accuracy@10 | 0.8068 |
| cosine_precision@1 | 0.5072 |
| cosine_precision@3 | 0.2444 |
| cosine_precision@5 | 0.1541 |
| cosine_precision@10 | 0.0807 |
| cosine_recall@1 | 0.5072 |
| cosine_recall@3 | 0.7332 |
| cosine_recall@5 | 0.7707 |
| cosine_recall@10 | 0.8068 |
| cosine_ndcg@10 | 0.6698 |
| cosine_mrr@10 | 0.6246 |
| cosine_map@100 | 0.6276 |
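The relationship between these metrics can be reproduced in miniature from per-query ranks. The sketch below assumes that each query has exactly one relevant document. The card does not state this, but it is consistent with the table: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.7332 / 3 ≈ 0.2444). The function name `ir_metrics` and the toy ranks are hypothetical.

```python
def ir_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    `ranks` holds the 1-based rank of the single relevant document for each
    query, or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    # With one relevant document per query, a hit contributes 1/k to
    # precision and 1 to recall.
    precision = accuracy / k
    recall = accuracy
    # Reciprocal rank counts only when the document appears in the top k.
    mrr = sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}

# Hypothetical ranks for five queries (not taken from the evaluation set).
toy_ranks = [1, 3, 2, None, 7]
metrics_at_3 = ir_metrics(toy_ranks, k=3)
print(metrics_at_3)
```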
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

| anchor | positive | negative |
|---|---|---|
| kà-sí-im (kāsu) | m. & f.; pl. f. "cup, bowl" [GAL; MB on (DUG.)GÚ.ZI] freq. of metal; for oil, wine; MB, NA as measure of capacity | If in Nisannu Month I, for his house .... |
| SIG5 ša-bu-ra-am i-ṣé-er i-dí-a-šur DUMU dan-a-šur PUZUR4.IŠTAR a-hi-šu ù i-ku-pí-a DUMU a-šur-i-mì-tí iš-ma-a-šur i-šu | Idī-Aššur s. Dān-Aššur, brother of Puzur-Ištar, and Ikuppiya s. Aššur-imitt owe Išme-Aššur 14 talents broken refined copper. | DUMU {1}-ba-da-a.a |
| mì-šu ṣú-ha-ru-ša ša-lim-a-šur ù a-li-ku a-dí šé-ni-šu i-li-ku-ni-ma té-er-ta-ak-nu-ma lá i-li-kà-ni | Why is it that Šalim-Aššur's servants and other travellers have come here twice, but no message from you has arrived? | servant Ina-šar-Bel-allak |

Loss: MatryoshkaLoss with these parameters:
{
  "loss": "MultipleNegativesRankingLoss",
  "matryoshka_dims": [
    768,
    512,
    384,
    256,
    128
  ],
  "matryoshka_weights": [
    1,
    1,
    1,
    1,
    1
  ],
  "n_dims_per_step": -1
}
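The configuration above applies MultipleNegativesRankingLoss to each of the five embedding prefixes and adds the results with equal weights. The sketch below illustrates that structure in plain Python under several assumptions that the card does not confirm: only in-batch negatives are used (the explicit `negative` column is left out for brevity), similarities are cosine scores multiplied by 20 (believed to be the library default for this loss), and the function names, toy vectors, and toy dimensions are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy in which positives[i] is the correct match for
    anchors[i] and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        short_anchors = [v[:dim] for v in anchors]
        short_positives = [v[:dim] for v in positives]
        total += weight * ranking_loss(short_anchors, short_positives)
    return total

# Toy batch of two pairs with 4-dimensional vectors (made-up values);
# dims (4, 2) stand in for (768, 512, 384, 256, 128).
anchors = [[1.0, 0.1, 0.0, 0.2], [0.1, 1.0, 0.3, 0.0]]
positives = [[0.9, 0.2, 0.1, 0.1], [0.2, 0.8, 0.2, 0.1]]
dims = [4, 2]
weights = [1, 1]

loss = matryoshka_loss(anchors, positives, dims, weights)
print(loss)
```

Because the per-dimension losses are summed, the logged training loss for this setup is the total over five terms rather than an average.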
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 32
- learning_rate: 2e-05
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: None
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | akkadian-ir_cosine_ndcg@10 |
|---|---|---|---|
| 0.0043 | 50 | 24.2917 | - |
| 0.0086 | 100 | 22.3808 | - |
| 0.0129 | 150 | 19.4952 | - |
| 0.0172 | 200 | 16.7314 | - |
| 0.0215 | 250 | 14.4493 | - |
| 0.0258 | 300 | 12.8579 | - |
| 0.0302 | 350 | 11.6765 | - |
| 0.0345 | 400 | 11.056 | - |
| 0.0388 | 450 | 10.3627 | - |
| 0.0431 | 500 | 9.5568 | - |
| 0.0474 | 550 | 9.1752 | - |
| 0.0517 | 600 | 8.7544 | - |
| 0.0560 | 650 | 8.7637 | - |
| 0.0603 | 700 | 8.3496 | - |
| 0.0646 | 750 | 8.0293 | - |
| 0.0689 | 800 | 7.5629 | - |
| 0.0732 | 850 | 7.682 | - |
| 0.0775 | 900 | 7.2793 | - |
| 0.0819 | 950 | 7.2354 | - |
| 0.0862 | 1000 | 7.0245 | - |
| 0.0905 | 1050 | 6.7 | - |
| 0.0948 | 1100 | 6.8599 | - |
| 0.0991 | 1150 | 6.1292 | - |
| 0.0999 | 1160 | - | 0.1951 |
| 0.1034 | 1200 | 6.0634 | - |
| 0.1077 | 1250 | 5.9424 | - |
| 0.1120 | 1300 | 6.3258 | - |
| 0.1163 | 1350 | 5.8804 | - |
| 0.1206 | 1400 | 5.9022 | - |
| 0.1249 | 1450 | 5.7101 | - |
| 0.1292 | 1500 | 5.6781 | - |
| 0.1336 | 1550 | 5.603 | - |
| 0.1379 | 1600 | 5.4788 | - |
| 0.1422 | 1650 | 5.5066 | - |
| 0.1465 | 1700 | 5.6286 | - |
| 0.1508 | 1750 | 5.2864 | - |
| 0.1551 | 1800 | 5.399 | - |
| 0.1594 | 1850 | 5.1604 | - |
| 0.1637 | 1900 | 5.264 | - |
| 0.1680 | 1950 | 5.3405 | - |
| 0.1723 | 2000 | 5.0218 | - |
| 0.1766 | 2050 | 5.2333 | - |
| 0.1809 | 2100 | 4.9349 | - |
| 0.1852 | 2150 | 4.6845 | - |
| 0.1896 | 2200 | 5.1475 | - |
| 0.1939 | 2250 | 4.5305 | - |
| 0.1982 | 2300 | 4.6658 | - |
| 0.1999 | 2320 | - | 0.2463 |
| 0.2025 | 2350 | 4.618 | - |
| 0.2068 | 2400 | 4.6903 | - |
| 0.2111 | 2450 | 4.4551 | - |
| 0.2154 | 2500 | 4.7722 | - |
| 0.2197 | 2550 | 4.3916 | - |
| 0.2240 | 2600 | 4.1282 | - |
| 0.2283 | 2650 | 4.3277 | - |
| 0.2326 | 2700 | 4.6267 | - |
| 0.2369 | 2750 | 4.3596 | - |
| 0.2413 | 2800 | 4.3899 | - |
| 0.2456 | 2850 | 4.2 | - |
| 0.2499 | 2900 | 4.1903 | - |
| 0.2542 | 2950 | 4.1434 | - |
| 0.2585 | 3000 | 4.2724 | - |
| 0.2628 | 3050 | 4.244 | - |
| 0.2671 | 3100 | 4.1991 | - |
| 0.2714 | 3150 | 4.0842 | - |
| 0.2757 | 3200 | 3.9193 | - |
| 0.2800 | 3250 | 3.8654 | - |
| 0.2843 | 3300 | 3.9076 | - |
| 0.2886 | 3350 | 3.4862 | - |
| 0.2930 | 3400 | 3.7306 | - |
| 0.2973 | 3450 | 3.8205 | - |
| 0.2998 | 3480 | - | 0.2908 |
| 0.3016 | 3500 | 4.0037 | - |
| 0.3059 | 3550 | 3.5835 | - |
| 0.3102 | 3600 | 3.7554 | - |
| 0.3145 | 3650 | 3.4443 | - |
| 0.3188 | 3700 | 3.8453 | - |
| 0.3231 | 3750 | 3.5481 | - |
| 0.3274 | 3800 | 3.6546 | - |
| 0.3317 | 3850 | 3.4082 | - |
| 0.3360 | 3900 | 3.2601 | - |
| 0.3403 | 3950 | 3.5107 | - |
| 0.3446 | 4000 | 3.1638 | - |
| 0.3490 | 4050 | 3.3906 | - |
| 0.3533 | 4100 | 3.5139 | - |
| 0.3576 | 4150 | 3.2548 | - |
| 0.3619 | 4200 | 3.392 | - |
| 0.3662 | 4250 | 3.292 | - |
| 0.3705 | 4300 | 3.0331 | - |
| 0.3748 | 4350 | 2.8747 | - |
| 0.3791 | 4400 | 3.193 | - |
| 0.3834 | 4450 | 3.1662 | - |
| 0.3877 | 4500 | 2.9548 | - |
| 0.3920 | 4550 | 3.1211 | - |
| 0.3963 | 4600 | 2.9486 | - |
| 0.3998 | 4640 | - | 0.3222 |
| 0.4007 | 4650 | 3.0281 | - |
| 0.4050 | 4700 | 2.9552 | - |
| 0.4093 | 4750 | 2.6024 | - |
| 0.4136 | 4800 | 2.8493 | - |
| 0.4179 | 4850 | 2.7818 | - |
| 0.4222 | 4900 | 2.8218 | - |
| 0.4265 | 4950 | 2.5303 | - |
| 0.4308 | 5000 | 2.5312 | - |
| 0.4351 | 5050 | 2.8386 | - |
| 0.4394 | 5100 | 2.6784 | - |
| 0.4437 | 5150 | 2.7933 | - |
| 0.4480 | 5200 | 2.6402 | - |
| 0.4524 | 5250 | 2.7994 | - |
| 0.4567 | 5300 | 2.8292 | - |
| 0.4610 | 5350 | 2.6279 | - |
| 0.4653 | 5400 | 2.4097 | - |
| 0.4696 | 5450 | 2.7501 | - |
| 0.4739 | 5500 | 2.3796 | - |
| 0.4782 | 5550 | 2.6051 | - |
| 0.4825 | 5600 | 2.6986 | - |
| 0.4868 | 5650 | 2.4088 | - |
| 0.4911 | 5700 | 2.5498 | - |
| 0.4954 | 5750 | 2.3827 | - |
| 0.4997 | 5800 | 2.5159 | 0.3500 |
| 0.5040 | 5850 | 2.4432 | - |
| 0.5084 | 5900 | 2.1923 | - |
| 0.5127 | 5950 | 2.4678 | - |
| 0.5170 | 6000 | 2.228 | - |
| 0.5213 | 6050 | 2.2555 | - |
| 0.5256 | 6100 | 2.4221 | - |
| 0.5299 | 6150 | 2.3692 | - |
| 0.5342 | 6200 | 2.5304 | - |
| 0.5385 | 6250 | 2.2569 | - |
| 0.5428 | 6300 | 2.0883 | - |
| 0.5471 | 6350 | 2.2691 | - |
| 0.5514 | 6400 | 2.2558 | - |
| 0.5557 | 6450 | 2.2126 | - |
| 0.5601 | 6500 | 2.1121 | - |
| 0.5644 | 6550 | 2.12 | - |
| 0.5687 | 6600 | 2.2115 | - |
| 0.5730 | 6650 | 1.9303 | - |
| 0.5773 | 6700 | 1.9711 | - |
| 0.5816 | 6750 | 2.1382 | - |
| 0.5859 | 6800 | 1.9612 | - |
| 0.5902 | 6850 | 1.9234 | - |
| 0.5945 | 6900 | 2.1105 | - |
| 0.5988 | 6950 | 1.9214 | - |
| 0.5997 | 6960 | - | 0.3794 |
| 0.6031 | 7000 | 1.8454 | - |
| 0.6074 | 7050 | 2.127 | - |
| 0.6118 | 7100 | 2.0367 | - |
| 0.6161 | 7150 | 2.0193 | - |
| 0.6204 | 7200 | 1.8004 | - |
| 0.6247 | 7250 | 2.0138 | - |
| 0.6290 | 7300 | 1.789 | - |
| 0.6333 | 7350 | 1.9486 | - |
| 0.6376 | 7400 | 1.9889 | - |
| 0.6419 | 7450 | 2.0563 | - |
| 0.6462 | 7500 | 1.9492 | - |
| 0.6505 | 7550 | 1.8981 | - |
| 0.6548 | 7600 | 1.8442 | - |
| 0.6591 | 7650 | 1.852 | - |
| 0.6634 | 7700 | 1.7902 | - |
| 0.6678 | 7750 | 1.6871 | - |
| 0.6721 | 7800 | 1.698 | - |
| 0.6764 | 7850 | 1.5765 | - |
| 0.6807 | 7900 | 1.8773 | - |
| 0.6850 | 7950 | 1.7695 | - |
| 0.6893 | 8000 | 1.621 | - |
| 0.6936 | 8050 | 1.492 | - |
| 0.6979 | 8100 | 1.6412 | - |
| 0.6996 | 8120 | - | 0.4046 |
| 0.7022 | 8150 | 1.7606 | - |
| 0.7065 | 8200 | 1.5547 | - |
| 0.7108 | 8250 | 1.7866 | - |
| 0.7151 | 8300 | 1.531 | - |
| 0.7195 | 8350 | 1.7266 | - |
| 0.7238 | 8400 | 1.4949 | - |
| 0.7281 | 8450 | 1.9541 | - |
| 0.7324 | 8500 | 1.6818 | - |
| 0.7367 | 8550 | 1.4678 | - |
| 0.7410 | 8600 | 1.8328 | - |
| 0.7453 | 8650 | 1.5184 | - |
| 0.7496 | 8700 | 1.6247 | - |
| 0.7539 | 8750 | 1.5787 | - |
| 0.7582 | 8800 | 1.6704 | - |
| 0.7625 | 8850 | 1.5755 | - |
| 0.7668 | 8900 | 1.6273 | - |
| 0.7712 | 8950 | 1.614 | - |
| 0.7755 | 9000 | 1.5335 | - |
| 0.7798 | 9050 | 1.461 | - |
| 0.7841 | 9100 | 1.5011 | - |
| 0.7884 | 9150 | 1.6853 | - |
| 0.7927 | 9200 | 1.4713 | - |
| 0.7970 | 9250 | 1.504 | - |
| 0.7996 | 9280 | - | 0.4436 |
| 0.8013 | 9300 | 1.5662 | - |
| 0.8056 | 9350 | 1.3562 | - |
| 0.8099 | 9400 | 1.4698 | - |
| 0.8142 | 9450 | 1.5387 | - |
| 0.8185 | 9500 | 1.3739 | - |
| 0.8229 | 9550 | 1.4344 | - |
| 0.8272 | 9600 | 1.5813 | - |
| 0.8315 | 9650 | 1.5476 | - |
| 0.8358 | 9700 | 1.4192 | - |
| 0.8401 | 9750 | 1.5959 | - |
| 0.8444 | 9800 | 1.463 | - |
| 0.8487 | 9850 | 1.5049 | - |
| 0.8530 | 9900 | 1.5464 | - |
| 0.8573 | 9950 | 1.5782 | - |
| 0.8616 | 10000 | 1.4452 | - |
| 0.8659 | 10050 | 1.3905 | - |
| 0.8702 | 10100 | 1.5898 | - |
| 0.8745 | 10150 | 1.3744 | - |
| 0.8789 | 10200 | 1.2622 | - |
| 0.8832 | 10250 | 1.1547 | - |
| 0.8875 | 10300 | 1.3283 | - |
| 0.8918 | 10350 | 1.4365 | - |
| 0.8961 | 10400 | 1.5452 | - |
| 0.8995 | 10440 | - | 0.4659 |
| 0.9004 | 10450 | 1.3644 | - |
| 0.9047 | 10500 | 1.4959 | - |
| 0.9090 | 10550 | 1.4951 | - |
| 0.9133 | 10600 | 1.3366 | - |
| 0.9176 | 10650 | 1.5537 | - |
| 0.9219 | 10700 | 1.2168 | - |
| 0.9262 | 10750 | 1.2671 | - |
| 0.9306 | 10800 | 1.2388 | - |
| 0.9349 | 10850 | 1.4667 | - |
| 0.9392 | 10900 | 1.2911 | - |
| 0.9435 | 10950 | 1.2547 | - |
| 0.9478 | 11000 | 1.4643 | - |
| 0.9521 | 11050 | 1.4337 | - |
| 0.9564 | 11100 | 1.2031 | - |
| 0.9607 | 11150 | 1.3594 | - |
| 0.9650 | 11200 | 1.3133 | - |
| 0.9693 | 11250 | 1.2628 | - |
| 0.9736 | 11300 | 1.116 | - |
| 0.9779 | 11350 | 1.2652 | - |
| 0.9823 | 11400 | 1.2119 | - |
| 0.9866 | 11450 | 1.1888 | - |
| 0.9909 | 11500 | 1.2845 | - |
| 0.9952 | 11550 | 1.399 | - |
| 0.9995 | 11600 | 1.0896 | 0.4985 |
| 1.0038 | 11650 | 1.1697 | - |
| 1.0081 | 11700 | 1.189 | - |
| 1.0124 | 11750 | 1.2988 | - |
| 1.0167 | 11800 | 1.178 | - |
| 1.0210 | 11850 | 1.4166 | - |
| 1.0253 | 11900 | 1.1385 | - |
| 1.0296 | 11950 | 1.1459 | - |
| 1.0339 | 12000 | 1.2123 | - |
| 1.0383 | 12050 | 1.0782 | - |
| 1.0426 | 12100 | 1.2136 | - |
| 1.0469 | 12150 | 1.2298 | - |
| 1.0512 | 12200 | 1.2266 | - |
| 1.0555 | 12250 | 1.1184 | - |
| 1.0598 | 12300 | 1.1255 | - |
| 1.0641 | 12350 | 1.2786 | - |
| 1.0684 | 12400 | 1.2258 | - |
| 1.0727 | 12450 | 1.2677 | - |
| 1.0770 | 12500 | 1.1009 | - |
| 1.0813 | 12550 | 1.3069 | - |
| 1.0856 | 12600 | 1.1574 | - |
| 1.0900 | 12650 | 1.232 | - |
| 1.0943 | 12700 | 1.3349 | - |
| 1.0986 | 12750 | 1.0868 | - |
| 1.0994 | 12760 | - | 0.5223 |
| 1.1029 | 12800 | 1.1968 | - |
| 1.1072 | 12850 | 1.1317 | - |
| 1.1115 | 12900 | 1.0791 | - |
| 1.1158 | 12950 | 1.1399 | - |
| 1.1201 | 13000 | 1.1907 | - |
| 1.1244 | 13050 | 1.322 | - |
| 1.1287 | 13100 | 1.2167 | - |
| 1.1330 | 13150 | 1.1696 | - |
| 1.1373 | 13200 | 1.2748 | - |
| 1.1417 | 13250 | 1.2751 | - |
| 1.1460 | 13300 | 1.2965 | - |
| 1.1503 | 13350 | 1.1097 | - |
| 1.1546 | 13400 | 1.3141 | - |
| 1.1589 | 13450 | 1.2249 | - |
| 1.1632 | 13500 | 1.4477 | - |
| 1.1675 | 13550 | 1.1688 | - |
| 1.1718 | 13600 | 1.2521 | - |
| 1.1761 | 13650 | 1.0834 | - |
| 1.1804 | 13700 | 1.2089 | - |
| 1.1847 | 13750 | 1.0982 | - |
| 1.1890 | 13800 | 1.2871 | - |
| 1.1933 | 13850 | 1.053 | - |
| 1.1977 | 13900 | 1.1601 | - |
| 1.1994 | 13920 | - | 0.5383 |
| 1.2020 | 13950 | 1.2559 | - |
| 1.2063 | 14000 | 1.076 | - |
| 1.2106 | 14050 | 1.2375 | - |
| 1.2149 | 14100 | 1.1363 | - |
| 1.2192 | 14150 | 1.1253 | - |
| 1.2235 | 14200 | 1.0961 | - |
| 1.2278 | 14250 | 1.1226 | - |
| 1.2321 | 14300 | 1.0251 | - |
| 1.2364 | 14350 | 1.087 | - |
| 1.2407 | 14400 | 1.1262 | - |
| 1.2450 | 14450 | 1.2847 | - |
| 1.2494 | 14500 | 1.1392 | - |
| 1.2537 | 14550 | 1.2119 | - |
| 1.2580 | 14600 | 1.0831 | - |
| 1.2623 | 14650 | 1.1392 | - |
| 1.2666 | 14700 | 1.2348 | - |
| 1.2709 | 14750 | 1.1431 | - |
| 1.2752 | 14800 | 1.1248 | - |
| 1.2795 | 14850 | 1.1533 | - |
| 1.2838 | 14900 | 1.134 | - |
| 1.2881 | 14950 | 1.1922 | - |
| 1.2924 | 15000 | 1.2331 | - |
| 1.2967 | 15050 | 1.1185 | - |
| 1.2993 | 15080 | - | 0.5594 |
| 1.3011 | 15100 | 1.3496 | - |
| 1.3054 | 15150 | 1.0629 | - |
| 1.3097 | 15200 | 1.2785 | - |
| 1.3140 | 15250 | 1.2427 | - |
| 1.3183 | 15300 | 1.2051 | - |
| 1.3226 | 15350 | 0.9325 | - |
| 1.3269 | 15400 | 1.0465 | - |
| 1.3312 | 15450 | 1.1105 | - |
| 1.3355 | 15500 | 1.1853 | - |
| 1.3398 | 15550 | 1.1192 | - |
| 1.3441 | 15600 | 1.0018 | - |
| 1.3484 | 15650 | 1.1357 | - |
| 1.3527 | 15700 | 1.2298 | - |
| 1.3571 | 15750 | 1.0783 | - |
| 1.3614 | 15800 | 1.271 | - |
| 1.3657 | 15850 | 1.1724 | - |
| 1.3700 | 15900 | 1.273 | - |
| 1.3743 | 15950 | 1.2049 | - |
| 1.3786 | 16000 | 0.9902 | - |
| 1.3829 | 16050 | 1.1044 | - |
| 1.3872 | 16100 | 1.1175 | - |
| 1.3915 | 16150 | 1.0599 | - |
| 1.3958 | 16200 | 1.1392 | - |
| 1.3993 | 16240 | - | 0.5806 |
| 1.4001 | 16250 | 1.1629 | - |
| 1.4044 | 16300 | 1.1323 | - |
| 1.4088 | 16350 | 1.2096 | - |
| 1.4131 | 16400 | 0.9091 | - |
| 1.4174 | 16450 | 1.1328 | - |
| 1.4217 | 16500 | 1.1584 | - |
| 1.4260 | 16550 | 1.2615 | - |
| 1.4303 | 16600 | 1.1547 | - |
| 1.4346 | 16650 | 1.0805 | - |
| 1.4389 | 16700 | 1.2107 | - |
| 1.4432 | 16750 | 1.1184 | - |
| 1.4475 | 16800 | 1.0953 | - |
| 1.4518 | 16850 | 1.2088 | - |
| 1.4561 | 16900 | 1.0663 | - |
| 1.4605 | 16950 | 1.0531 | - |
| 1.4648 | 17000 | 1.0374 | - |
| 1.4691 | 17050 | 1.1432 | - |
| 1.4734 | 17100 | 1.0345 | - |
| 1.4777 | 17150 | 1.0081 | - |
| 1.4820 | 17200 | 1.0979 | - |
| 1.4863 | 17250 | 1.0554 | - |
| 1.4906 | 17300 | 1.1095 | - |
| 1.4949 | 17350 | 1.1157 | - |
| 1.4992 | 17400 | 1.0901 | 0.5940 |
| 1.5035 | 17450 | 1.2183 | - |
| 1.5078 | 17500 | 1.1127 | - |
| 1.5121 | 17550 | 0.9928 | - |
| 1.5165 | 17600 | 1.0612 | - |
| 1.5208 | 17650 | 1.2894 | - |
| 1.5251 | 17700 | 1.0407 | - |
| 1.5294 | 17750 | 1.0467 | - |
| 1.5337 | 17800 | 1.1305 | - |
| 1.5380 | 17850 | 1.2103 | - |
| 1.5423 | 17900 | 1.0317 | - |
| 1.5466 | 17950 | 0.8727 | - |
| 1.5509 | 18000 | 1.0039 | - |
| 1.5552 | 18050 | 1.1078 | - |
| 1.5595 | 18100 | 0.8985 | - |
| 1.5638 | 18150 | 1.073 | - |
| 1.5682 | 18200 | 1.1185 | - |
| 1.5725 | 18250 | 1.1867 | - |
| 1.5768 | 18300 | 1.0053 | - |
| 1.5811 | 18350 | 1.0772 | - |
| 1.5854 | 18400 | 1.1199 | - |
| 1.5897 | 18450 | 1.1933 | - |
| 1.5940 | 18500 | 1.1376 | - |
| 1.5983 | 18550 | 1.0323 | - |
| 1.5992 | 18560 | - | 0.6092 |
| 1.6026 | 18600 | 1.1533 | - |
| 1.6069 | 18650 | 1.1542 | - |
| 1.6112 | 18700 | 0.8537 | - |
| 1.6155 | 18750 | 1.2019 | - |
| 1.6199 | 18800 | 0.9037 | - |
| 1.6242 | 18850 | 1.1072 | - |
| 1.6285 | 18900 | 0.9368 | - |
| 1.6328 | 18950 | 0.8755 | - |
| 1.6371 | 19000 | 1.0589 | - |
| 1.6414 | 19050 | 1.2077 | - |
| 1.6457 | 19100 | 1.0273 | - |
| 1.6500 | 19150 | 0.9574 | - |
| 1.6543 | 19200 | 0.9654 | - |
| 1.6586 | 19250 | 0.9936 | - |
| 1.6629 | 19300 | 0.936 | - |
| 1.6672 | 19350 | 1.1334 | - |
| 1.6715 | 19400 | 1.1132 | - |
| 1.6759 | 19450 | 0.9652 | - |
| 1.6802 | 19500 | 0.9999 | - |
| 1.6845 | 19550 | 1.0588 | - |
| 1.6888 | 19600 | 0.8735 | - |
| 1.6931 | 19650 | 1.0931 | - |
| 1.6974 | 19700 | 0.9329 | - |
| 1.6991 | 19720 | - | 0.6159 |
| 1.7017 | 19750 | 1.0249 | - |
| 1.7060 | 19800 | 0.9529 | - |
| 1.7103 | 19850 | 1.0974 | - |
| 1.7146 | 19900 | 1.156 | - |
| 1.7189 | 19950 | 1.2541 | - |
| 1.7232 | 20000 | 1.1157 | - |
| 1.7276 | 20050 | 0.9739 | - |
| 1.7319 | 20100 | 0.8053 | - |
| 1.7362 | 20150 | 0.9672 | - |
| 1.7405 | 20200 | 0.9638 | - |
| 1.7448 | 20250 | 1.0336 | - |
| 1.7491 | 20300 | 1.0707 | - |
| 1.7534 | 20350 | 1.1464 | - |
| 1.7577 | 20400 | 0.9545 | - |
| 1.7620 | 20450 | 1.0381 | - |
| 1.7663 | 20500 | 1.217 | - |
| 1.7706 | 20550 | 1.1779 | - |
| 1.7749 | 20600 | 0.8474 | - |
| 1.7793 | 20650 | 1.062 | - |
| 1.7836 | 20700 | 0.8884 | - |
| 1.7879 | 20750 | 1.1615 | - |
| 1.7922 | 20800 | 1.0987 | - |
| 1.7965 | 20850 | 1.126 | - |
| 1.7991 | 20880 | - | 0.6320 |
| 1.8008 | 20900 | 1.0833 | - |
| 1.8051 | 20950 | 1.049 | - |
| 1.8094 | 21000 | 1.0177 | - |
| 1.8137 | 21050 | 1.1588 | - |
| 1.8180 | 21100 | 0.9397 | - |
| 1.8223 | 21150 | 0.9947 | - |
| 1.8266 | 21200 | 1.0446 | - |
| 1.8309 | 21250 | 1.1826 | - |
| 1.8353 | 21300 | 0.9498 | - |
| 1.8396 | 21350 | 1.3614 | - |
| 1.8439 | 21400 | 1.1025 | - |
| 1.8482 | 21450 | 1.028 | - |
| 1.8525 | 21500 | 1.0175 | - |
| 1.8568 | 21550 | 0.8465 | - |
| 1.8611 | 21600 | 0.9803 | - |
| 1.8654 | 21650 | 0.8592 | - |
| 1.8697 | 21700 | 0.9792 | - |
| 1.8740 | 21750 | 1.0933 | - |
| 1.8783 | 21800 | 0.8312 | - |
| 1.8826 | 21850 | 1.0615 | - |
| 1.8870 | 21900 | 1.0027 | - |
| 1.8913 | 21950 | 1.1034 | - |
| 1.8956 | 22000 | 1.0831 | - |
| 1.8990 | 22040 | - | 0.6385 |
| 1.8999 | 22050 | 0.9895 | - |
| 1.9042 | 22100 | 1.1019 | - |
| 1.9085 | 22150 | 1.1036 | - |
| 1.9128 | 22200 | 0.9039 | - |
| 1.9171 | 22250 | 1.0744 | - |
| 1.9214 | 22300 | 1.1484 | - |
| 1.9257 | 22350 | 1.0977 | - |
| 1.9300 | 22400 | 1.091 | - |
| 1.9343 | 22450 | 0.8213 | - |
| 1.9387 | 22500 | 1.0402 | - |
| 1.9430 | 22550 | 1.1233 | - |
| 1.9473 | 22600 | 1.0408 | - |
| 1.9516 | 22650 | 1.1515 | - |
| 1.9559 | 22700 | 1.1289 | - |
| 1.9602 | 22750 | 0.8997 | - |
| 1.9645 | 22800 | 0.9587 | - |
| 1.9688 | 22850 | 0.9716 | - |
| 1.9731 | 22900 | 0.9622 | - |
| 1.9774 | 22950 | 1.0119 | - |
| 1.9817 | 23000 | 1.0433 | - |
| 1.9860 | 23050 | 1.1165 | - |
| 1.9903 | 23100 | 0.9443 | - |
| 1.9947 | 23150 | 1.0661 | - |
| 1.9990 | 23200 | 1.0166 | 0.6490 |
| 2.0033 | 23250 | 0.982 | - |
| 2.0076 | 23300 | 1.1731 | - |
| 2.0119 | 23350 | 1.0112 | - |
| 2.0162 | 23400 | 1.0373 | - |
| 2.0205 | 23450 | 0.8866 | - |
| 2.0248 | 23500 | 0.9581 | - |
| 2.0291 | 23550 | 1.2335 | - |
| 2.0334 | 23600 | 0.9536 | - |
| 2.0377 | 23650 | 0.9767 | - |
| 2.0420 | 23700 | 1.0382 | - |
| 2.0464 | 23750 | 1.1288 | - |
| 2.0507 | 23800 | 0.8292 | - |
| 2.0550 | 23850 | 1.1083 | - |
| 2.0593 | 23900 | 0.9252 | - |
| 2.0636 | 23950 | 1.1108 | - |
| 2.0679 | 24000 | 1.1602 | - |
| 2.0722 | 24050 | 0.9616 | - |
| 2.0765 | 24100 | 1.0108 | - |
| 2.0808 | 24150 | 1.0974 | - |
| 2.0851 | 24200 | 0.9542 | - |
| 2.0894 | 24250 | 0.9269 | - |
| 2.0937 | 24300 | 1.0494 | - |
| 2.0981 | 24350 | 1.074 | - |
| 2.0989 | 24360 | - | 0.6508 |
| 2.1024 | 24400 | 0.8881 | - |
| 2.1067 | 24450 | 1.0372 | - |
| 2.1110 | 24500 | 1.0833 | - |
| 2.1153 | 24550 | 1.1226 | - |
| 2.1196 | 24600 | 1.1199 | - |
| 2.1239 | 24650 | 0.9263 | - |
| 2.1282 | 24700 | 0.9799 | - |
| 2.1325 | 24750 | 0.9388 | - |
| 2.1368 | 24800 | 1.1606 | - |
| 2.1411 | 24850 | 0.998 | - |
| 2.1454 | 24900 | 1.1349 | - |
| 2.1498 | 24950 | 1.1257 | - |
| 2.1541 | 25000 | 0.9132 | - |
| 2.1584 | 25050 | 1.068 | - |
| 2.1627 | 25100 | 0.9177 | - |
| 2.1670 | 25150 | 1.0174 | - |
| 2.1713 | 25200 | 1.1028 | - |
| 2.1756 | 25250 | 0.9742 | - |
| 2.1799 | 25300 | 0.9095 | - |
| 2.1842 | 25350 | 0.9706 | - |
| 2.1885 | 25400 | 1.1514 | - |
| 2.1928 | 25450 | 1.1459 | - |
| 2.1971 | 25500 | 1.2085 | - |
| 2.1989 | 25520 | - | 0.6582 |
| 2.2014 | 25550 | 0.9208 | - |
| 2.2058 | 25600 | 1.1146 | - |
| 2.2101 | 25650 | 1.003 | - |
| 2.2144 | 25700 | 0.9197 | - |
| 2.2187 | 25750 | 1.029 | - |
| 2.2230 | 25800 | 1.1035 | - |
| 2.2273 | 25850 | 1.0754 | - |
| 2.2316 | 25900 | 1.1675 | - |
| 2.2359 | 25950 | 0.9714 | - |
| 2.2402 | 26000 | 1.116 | - |
| 2.2445 | 26050 | 1.0347 | - |
| 2.2488 | 26100 | 1.027 | - |
| 2.2531 | 26150 | 0.9373 | - |
| 2.2575 | 26200 | 1.1202 | - |
| 2.2618 | 26250 | 0.8809 | - |
| 2.2661 | 26300 | 1.0182 | - |
| 2.2704 | 26350 | 1.0033 | - |
| 2.2747 | 26400 | 0.9937 | - |
| 2.2790 | 26450 | 1.0004 | - |
| 2.2833 | 26500 | 1.0014 | - |
| 2.2876 | 26550 | 1.2473 | - |
| 2.2919 | 26600 | 1.0909 | - |
| 2.2962 | 26650 | 1.1588 | - |
| 2.2988 | 26680 | - | 0.6620 |
| 2.3005 | 26700 | 0.9797 | - |
| 2.3048 | 26750 | 1.1163 | - |
| 2.3092 | 26800 | 1.1619 | - |
| 2.3135 | 26850 | 0.9237 | - |
| 2.3178 | 26900 | 0.9336 | - |
| 2.3221 | 26950 | 1.0761 | - |
| 2.3264 | 27000 | 1.0624 | - |
| 2.3307 | 27050 | 0.9467 | - |
| 2.3350 | 27100 | 1.2416 | - |
| 2.3393 | 27150 | 0.8832 | - |
| 2.3436 | 27200 | 1.0419 | - |
| 2.3479 | 27250 | 0.8805 | - |
| 2.3522 | 27300 | 1.063 | - |
| 2.3565 | 27350 | 1.0 | - |
| 2.3608 | 27400 | 0.9411 | - |
| 2.3652 | 27450 | 1.2561 | - |
| 2.3695 | 27500 | 1.0111 | - |
| 2.3738 | 27550 | 0.9595 | - |
| 2.3781 | 27600 | 0.8381 | - |
| 2.3824 | 27650 | 1.0234 | - |
| 2.3867 | 27700 | 0.8935 | - |
| 2.3910 | 27750 | 0.8965 | - |
| 2.3953 | 27800 | 1.0653 | - |
| 2.3988 | 27840 | - | 0.6641 |
| 2.3996 | 27850 | 1.0907 | - |
| 2.4039 | 27900 | 1.0517 | - |
| 2.4082 | 27950 | 0.9392 | - |
| 2.4125 | 28000 | 0.9978 | - |
| 2.4169 | 28050 | 1.0318 | - |
| 2.4212 | 28100 | 0.9021 | - |
| 2.4255 | 28150 | 0.9216 | - |
| 2.4298 | 28200 | 1.0857 | - |
| 2.4341 | 28250 | 0.9689 | - |
| 2.4384 | 28300 | 1.0085 | - |
| 2.4427 | 28350 | 1.0434 | - |
| 2.4470 | 28400 | 1.1309 | - |
| 2.4513 | 28450 | 0.9319 | - |
| 2.4556 | 28500 | 0.9562 | - |
| 2.4599 | 28550 | 0.9197 | - |
| 2.4642 | 28600 | 1.2111 | - |
| 2.4686 | 28650 | 1.0983 | - |
| 2.4729 | 28700 | 0.9562 | - |
| 2.4772 | 28750 | 0.9327 | - |
| 2.4815 | 28800 | 0.9716 | - |
| 2.4858 | 28850 | 1.0202 | - |
| 2.4901 | 28900 | 1.1367 | - |
| 2.4944 | 28950 | 0.9014 | - |
| 2.4987 | 29000 | 1.1313 | 0.6672 |
| 2.5030 | 29050 | 1.148 | - |
| 2.5073 | 29100 | 0.799 | - |
| 2.5116 | 29150 | 1.0012 | - |
| 2.5159 | 29200 | 0.7844 | - |
| 2.5202 | 29250 | 1.1639 | - |
| 2.5246 | 29300 | 0.9905 | - |
| 2.5289 | 29350 | 1.0579 | - |
| 2.5332 | 29400 | 0.9329 | - |
| 2.5375 | 29450 | 0.9496 | - |
| 2.5418 | 29500 | 0.9521 | - |
| 2.5461 | 29550 | 0.8535 | - |
| 2.5504 | 29600 | 1.019 | - |
| 2.5547 | 29650 | 1.1031 | - |
| 2.5590 | 29700 | 1.0894 | - |
| 2.5633 | 29750 | 1.0078 | - |
| 2.5676 | 29800 | 0.8403 | - |
| 2.5719 | 29850 | 0.916 | - |
| 2.5763 | 29900 | 1.2096 | - |
| 2.5806 | 29950 | 0.9969 | - |
| 2.5849 | 30000 | 1.1598 | - |
| 2.5892 | 30050 | 0.8849 | - |
| 2.5935 | 30100 | 1.0619 | - |
| 2.5978 | 30150 | 0.9554 | - |
| 2.5987 | 30160 | - | 0.6693 |
| 2.6021 | 30200 | 1.1025 | - |
| 2.6064 | 30250 | 1.1252 | - |
| 2.6107 | 30300 | 0.8108 | - |
| 2.6150 | 30350 | 0.927 | - |
| 2.6193 | 30400 | 1.1574 | - |
| 2.6236 | 30450 | 1.0098 | - |
| 2.6280 | 30500 | 0.8702 | - |
| 2.6323 | 30550 | 0.9672 | - |
| 2.6366 | 30600 | 0.9361 | - |
| 2.6409 | 30650 | 0.9801 | - |
| 2.6452 | 30700 | 1.114 | - |
| 2.6495 | 30750 | 0.8666 | - |
| 2.6538 | 30800 | 0.9648 | - |
| 2.6581 | 30850 | 0.9423 | - |
| 2.6624 | 30900 | 1.059 | - |
| 2.6667 | 30950 | 0.9149 | - |
| 2.6710 | 31000 | 0.8954 | - |
| 2.6753 | 31050 | 0.8769 | - |
| 2.6796 | 31100 | 0.8124 | - |
| 2.6840 | 31150 | 1.151 | - |
| 2.6883 | 31200 | 1.0145 | - |
| 2.6926 | 31250 | 0.9653 | - |
| 2.6969 | 31300 | 1.136 | - |
| 2.6986 | 31320 | - | 0.6693 |
| 2.7012 | 31350 | 0.9122 | - |
| 2.7055 | 31400 | 1.0161 | - |
| 2.7098 | 31450 | 1.0152 | - |
| 2.7141 | 31500 | 1.1181 | - |
| 2.7184 | 31550 | 0.8969 | - |
| 2.7227 | 31600 | 1.2101 | - |
| 2.7270 | 31650 | 1.0958 | - |
| 2.7313 | 31700 | 0.9548 | - |
| 2.7357 | 31750 | 0.9755 | - |
| 2.7400 | 31800 | 0.9796 | - |
| 2.7443 | 31850 | 1.0564 | - |
| 2.7486 | 31900 | 0.9581 | - |
| 2.7529 | 31950 | 0.8607 | - |
| 2.7572 | 32000 | 0.8933 | - |
| 2.7615 | 32050 | 0.9828 | - |
| 2.7658 | 32100 | 1.1992 | - |
| 2.7701 | 32150 | 1.0162 | - |
| 2.7744 | 32200 | 0.8406 | - |
| 2.7787 | 32250 | 0.7896 | - |
| 2.7830 | 32300 | 1.0311 | - |
| 2.7874 | 32350 | 1.0507 | - |
| 2.7917 | 32400 | 1.136 | - |
| 2.7960 | 32450 | 1.0504 | - |
| 2.7986 | 32480 | - | 0.6697 |
| 2.8003 | 32500 | 0.9271 | - |
| 2.8046 | 32550 | 1.0412 | - |
| 2.8089 | 32600 | 0.8542 | - |
| 2.8132 | 32650 | 1.1015 | - |
| 2.8175 | 32700 | 0.9957 | - |
| 2.8218 | 32750 | 1.0845 | - |
| 2.8261 | 32800 | 1.1226 | - |
| 2.8304 | 32850 | 1.0235 | - |
| 2.8347 | 32900 | 0.996 | - |
| 2.8390 | 32950 | 1.0855 | - |
| 2.8434 | 33000 | 1.2322 | - |
| 2.8477 | 33050 | 0.999 | - |
| 2.8520 | 33100 | 1.04 | - |
| 2.8563 | 33150 | 1.1466 | - |
| 2.8606 | 33200 | 0.9061 | - |
| 2.8649 | 33250 | 1.0011 | - |
| 2.8692 | 33300 | 1.0205 | - |
| 2.8735 | 33350 | 1.0136 | - |
| 2.8778 | 33400 | 0.8956 | - |
| 2.8821 | 33450 | 0.9722 | - |
| 2.8864 | 33500 | 0.8962 | - |
| 2.8907 | 33550 | 0.9545 | - |
| 2.8951 | 33600 | 0.8474 | - |
| 2.8985 | 33640 | - | 0.6700 |
| 2.8994 | 33650 | 0.782 | - |
| 2.9037 | 33700 | 0.9551 | - |
| 2.9080 | 33750 | 1.0217 | - |
| 2.9123 | 33800 | 0.8188 | - |
| 2.9166 | 33850 | 1.0652 | - |
| 2.9209 | 33900 | 1.1314 | - |
| 2.9252 | 33950 | 0.9487 | - |
| 2.9295 | 34000 | 0.9906 | - |
| 2.9338 | 34050 | 1.1317 | - |
| 2.9381 | 34100 | 0.9139 | - |
| 2.9424 | 34150 | 0.9394 | - |
| 2.9468 | 34200 | 0.9904 | - |
| 2.9511 | 34250 | 1.0758 | - |
| 2.9554 | 34300 | 0.9388 | - |
| 2.9597 | 34350 | 0.9417 | - |
| 2.9640 | 34400 | 0.9871 | - |
| 2.9683 | 34450 | 1.0431 | - |
| 2.9726 | 34500 | 1.0538 | - |
| 2.9769 | 34550 | 1.078 | - |
| 2.9812 | 34600 | 1.0972 | - |
| 2.9855 | 34650 | 1.0294 | - |
| 2.9898 | 34700 | 1.0387 | - |
| 2.9941 | 34750 | 0.8923 | - |
| 2.9984 | 34800 | 1.0937 | 0.6698 |
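The hyperparameters for this run specify a cosine learning-rate schedule with a peak of 2e-05 and a warmup ratio of 0.1. The sketch below shows the shape such a schedule would take across the steps logged above. It assumes linear warmup followed by a half-cosine decay to zero, which is the usual behaviour of the cosine scheduler in Hugging Face Transformers, and it uses 34,800 as the total step count, an approximation taken from the last logged step since the exact total is not stated. The function name is hypothetical.

```python
import math

def cosine_lr_with_warmup(step, total_steps, peak_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Approximate total taken from the last logged step (assumption).
TOTAL_STEPS = 34_800

for step in (0, 1_740, 3_480, 17_400, 34_800):
    print(step, cosine_lr_with_warmup(step, TOTAL_STEPS))
```

Under these assumptions the learning rate peaks around step 3,480, roughly 0.3 epochs in, and is close to zero by the final steps. That would fit the flattening of both the training loss and the NDCG@10 column late in the third epoch, although the log alone does not establish the cause.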
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

@misc{oord2019representationlearningcontrastivepredictive,
    title={Representation Learning with Contrastive Predictive Coding},
    author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
    year={2019},
    eprint={1807.03748},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/1807.03748},
}

Base model: answerdotai/ModernBERT-base