How to use Akryl/modernbert-embed-base-akryl-matryoshka with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Akryl/modernbert-embed-base-akryl-matryoshka")
sentences = [
"<1-hop>\n\nOpinion\nI have audited the financial statements of the Ministry of Defence and Veteran Affairs (MoDVA), which comprise the Statement of Financial Position as at 30 th June 2023, the Statement of Financial Performance, Statement of Changes in Equity and Statement of Cash Flows, together with other accompanying statements for the year then ended, and notes to the financial statements, including a summary of significant accounting policies.\nIn my opinion, the accompanying financial statements of the Ministry of Defence and Veteran Affairs for the financial year ended 30 th June 2023 are prepared, in all material respects, in accordance with Section 51 of the Public Finance Management Act (PFMA), 2015 and the Financial Reporting Guide, 2018 (as amended).",
"How does the audit process for Kalungu District Local Government and Pader District Local Government follow the Constitution of the Republic of Uganda and what standards are used to ensure compliance with ethical and legal requirements?",
"What financial statements were audited for MoDVA and KCCA?",
"How were the water grant funds utilized in the rehabilitation of existing water sources and the drilling of boreholes, and what were the outcomes of these projects?"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
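The Pooling and Normalize modules listed above can be illustrated with a minimal sketch in plain Python. It assumes mean pooling over unmasked tokens followed by L2 normalization, as the configuration indicates (`pooling_mode_mean_tokens: True`); the helper names and toy values are hypothetical and are not taken from the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # avoid division by zero for an all-padding input
    return [s / count for s in summed]

def l2_normalize(vector):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]

# Toy input: three 2-dimensional token vectors, the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the output is unit length, cosine similarity between two embeddings reduces to a plain dot product.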
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Akryl/modernbert-embed-base-akryl-matryoshka")
# Run inference
queries = [
"\u003c1-hop\u003e\n\n4.2.6 PDM SACCO Operations\n\uf0b7 A loan applicant must be a member of a registered subsistence household on the PDMIS, be a member of a PDM Enterprise Group that is a member of the PDM SACCO.\n\uf0b7 All beneficiaries should be members of a registered subsistence household on the Parish Development Management Information System (applies before 5th June 2023).\n\uf0b7 Subsistence households applying to access PRF should be determined and selected at village level through a vetting meeting convened by the enterprise groups and attended by LC1 Chairpersons (applies after 5th June 2023).\n\uf0b7 For farming enterprises, the borrower must obtain an agriculture insurance policy under the Uganda Agriculture Insurance Scheme (UAIS).\nI made the following observations;\n1., Activity = Selection and Implementation of Prioritized/Flagship Projects. 1., Observations = \uf0b7 All the 10 parishes did not flagship contrary to guidelines. \uf0b7 All the 10 parishes selected projects that were inconsistent the LG priority commodities. \uf0b7 11 out of farmer enterprises/house holds implemented projects that are. 1., Management Response = select projects the flagship with selected 20 that sensitizations utilization of projects by various fora Beneficiaries advised to experiences Frequent beneficiaries encouraged operate.. 1., Management Response = The Accounting Officer explained on proper PRF on prioritized all stakeholders at is ongoing. of PRF have been conduct monthly meetings for members to share and challenges. visits among of PRF are also like the way VSL. 2., Activity = Insurance Policy for Farming Enterprises.. 2., Observations = Appendix 5 (g) I noted that all the 11 PRF beneficiaries who carried out farming enterprises in 8 PDM SACCOs did not obtain agricultural insurance policies from UAIS. Refer to Appendix. \n2., Management Response = The Accounting Officer explained that since the selected households have received enterprises will obtain agricultural policies from guidelines put in place.. 2., Management Response = PRF, farming be mobilised to insurance UAIS per the",
]
documents = [
'What are the requirements for subsistence households to access PRF, and how does the insurance policy requirement for farming enterprises relate to these conditions?',
'How do the financial figures for net assets and cash balances compare between the years ending 30 June 2017 and 30 June 2021, and what trends can be observed in the financial statements during this period?',
'What is the management responsibility and role of the Accounting Officer in preparing financial statements for Kalungu District Local Government?',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7440, 0.3670, 0.5151]])
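Since the model is trained with Matryoshka dimensions, embeddings can be shortened before comparison. The following is a minimal sketch of that idea in plain Python, assuming truncation means keeping the leading values and re-normalizing; the helper names and toy vectors are hypothetical and are not model output.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0
    return [v / norm for v in head]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (illustrative values only).
query = [0.6, 0.8, 0.1, -0.1]
document = [0.8, 0.6, -0.1, 0.1]

full = cosine_similarity(query, document)
short = cosine_similarity(
    truncate_and_normalize(query, 2),
    truncate_and_normalize(document, 2),
)
print(full, short)
```

With the real model, the same effect is usually obtained by passing `truncate_dim` (for example 256 or 512) when constructing `SentenceTransformer`, which matches the `truncate_dim` values reported in the evaluation tables.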
dim_768: evaluated with InformationRetrievalEvaluator with these parameters:
{
"truncate_dim": 768
}
| Metric | Value |
|---|---|
| cosine_accuracy@3 | 0.6585 |
| cosine_accuracy@5 | 0.7317 |
| cosine_accuracy@7 | 0.8537 |
| cosine_precision@3 | 0.2195 |
| cosine_precision@5 | 0.1463 |
| cosine_precision@7 | 0.122 |
| cosine_recall@3 | 0.6098 |
| cosine_recall@5 | 0.6829 |
| cosine_recall@7 | 0.8049 |
| cosine_ndcg@3 | 0.4949 |
| cosine_ndcg@5 | 0.5254 |
| cosine_ndcg@7 | 0.5671 |
| cosine_mrr@10 | 0.5102 |
| cosine_map@100 | 0.4906 |
dim_512: evaluated with InformationRetrievalEvaluator with these parameters:
{
"truncate_dim": 512
}
| Metric | Value |
|---|---|
| cosine_accuracy@3 | 0.6341 |
| cosine_accuracy@5 | 0.7805 |
| cosine_accuracy@7 | 0.8537 |
| cosine_precision@3 | 0.2114 |
| cosine_precision@5 | 0.1561 |
| cosine_precision@7 | 0.122 |
| cosine_recall@3 | 0.5854 |
| cosine_recall@5 | 0.7317 |
| cosine_recall@7 | 0.8049 |
| cosine_ndcg@3 | 0.5047 |
| cosine_ndcg@5 | 0.5645 |
| cosine_ndcg@7 | 0.5889 |
| cosine_mrr@10 | 0.5474 |
| cosine_map@100 | 0.5138 |
dim_256: evaluated with InformationRetrievalEvaluator with these parameters:
{
"truncate_dim": 256
}
| Metric | Value |
|---|---|
| cosine_accuracy@3 | 0.6829 |
| cosine_accuracy@5 | 0.7805 |
| cosine_accuracy@7 | 0.8537 |
| cosine_precision@3 | 0.2276 |
| cosine_precision@5 | 0.1561 |
| cosine_precision@7 | 0.122 |
| cosine_recall@3 | 0.6341 |
| cosine_recall@5 | 0.7317 |
| cosine_recall@7 | 0.8049 |
| cosine_ndcg@3 | 0.4859 |
| cosine_ndcg@5 | 0.5279 |
| cosine_ndcg@7 | 0.5529 |
| cosine_mrr@10 | 0.4897 |
| cosine_map@100 | 0.4687 |
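The metrics in the tables above can be read with the help of a small sketch. It assumes binary relevance and the usual definitions of accuracy@k, precision@k, recall@k, NDCG@k and reciprocal rank for a single query; the function name and document ids are hypothetical, and the reported values are averages of such per-query scores.

```python
import math

def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Per-query retrieval metrics at cutoff k, assuming binary relevance."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    # DCG with a log2 discount; the ideal ranking puts all relevant docs first.
    dcg = sum(hit / math.log2(rank + 2) for rank, hit in enumerate(hits))
    ideal_hits = min(k, len(relevant_ids))
    ideal_dcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))

    # Reciprocal rank of the first relevant document within the top k.
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits):
        if hit:
            reciprocal_rank = 1.0 / (rank + 1)
            break

    return {
        "accuracy": 1.0 if any(hits) else 0.0,
        "precision": sum(hits) / k,
        "recall": sum(hits) / len(relevant_ids),
        "ndcg": dcg / ideal_dcg,
        "mrr": reciprocal_rank,
    }

# One query with two relevant documents; only "d1" appears in the top 3.
metrics = retrieval_metrics(["d3", "d1", "d7", "d2"], {"d1", "d2"}, k=3)
print(metrics)
```

This also shows why recall@k can sit below accuracy@k in the tables: a query with several relevant documents counts as a hit for accuracy as soon as one of them is retrieved, while recall needs all of them.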
Columns: text and question

| | text | question |
|---|---|---|
| type | string | string |
| details | | |

| text | question |
|---|---|
| <2-hop> | How were fund management and budget approval handled in the Education Development grant projects? |
| Auditor's Responsibilities for the audit of the Financial Statements | What are the auditor's responsibilities regarding financial statements? |
| <1-hop> | What are the key responsibilities of an auditor in ensuring financial statements are free from material misstatement? |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256
],
"matryoshka_weights": [
1,
1,
1
],
"n_dims_per_step": -1
}
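The loss configuration above can be sketched in plain Python. This is not the library implementation: it assumes the inner loss is a cross-entropy over scaled cosine similarities with in-batch negatives, and that the outer loss sums it over truncated embeddings with the given weights. The `scale=20.0` value, the function names and the toy vectors are assumptions for illustration.

```python
import math

def _normalize(vector):
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the target for anchors[i] and all
    other positives in the batch act as negatives. `scale` is an assumed value."""
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss over embeddings truncated to each dim."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * multiple_negatives_ranking_loss(
            [v[:dim] for v in anchors], [v[:dim] for v in positives]
        )
    return total

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

For this model the corresponding settings would be `dims=[768, 512, 256]` and `weights=[1, 1, 1]`, so each prefix of the embedding is trained to be useful on its own.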
Non-default hyperparameters:
eval_strategy: epoch
per_device_eval_batch_size: 16
gradient_accumulation_steps: 64
learning_rate: 2e-05
num_train_epochs: 4
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
load_best_model_at_end: True
batch_sampler: no_duplicates

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 64
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

| Epoch | Step | dim_768_cosine_ndcg@7 | dim_512_cosine_ndcg@7 | dim_256_cosine_ndcg@7 |
|---|---|---|---|---|
| 1.0 | 1 | 0.5313 | 0.4963 | 0.5033 |
| 2.0 | 2 | 0.5533 | 0.5192 | 0.5376 |
| 3.0 | 3 | 0.5721 | 0.5729 | 0.5536 |
| 4.0 | 4 | 0.5671 | 0.5889 | 0.5529 |
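The training log shows one optimizer step per epoch, which is consistent with the listed batch settings. The sketch below works through that arithmetic and a cosine learning-rate schedule with linear warmup. It assumes a single training device and a common formulation of the schedule (linear warmup, then half a cosine period down to zero); the function name is hypothetical and the trainer's exact behaviour may differ.

```python
import math

# Values taken from the hyperparameter list; a single device is assumed.
per_device_train_batch_size = 8
gradient_accumulation_steps = 64
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

def lr_at_step(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate after `step` optimizer steps under warmup plus cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 4  # four epochs, one step each, as in the training log
schedule = [lr_at_step(step, total_steps) for step in range(total_steps + 1)]
print(effective_batch_size, schedule)
```

With only four optimizer steps, the warmup covers a single step and the learning rate then decays quickly, so each logged epoch is trained at a noticeably different rate.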
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
answerdotai/ModernBERT-base