ModernBERT Embed base Akryl Matryoshka

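This is a sentence-transformers model fine-tuned from nomic-ai/modernbert-embed-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss at 768, 512, and 256 dimensions, so the embeddings are intended to remain usable when truncated to 512 or 256 dimensions.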

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/modernbert-embed-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
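
The Pooling and Normalize modules above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration suggests; the 3-dimensional toy vectors, the mask, and the helper names `mean_pool` and `l2_normalize` are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vec):
                total[i] += value
            count += 1
    count = max(count, 1)  # guard against an all-padding input
    return [t / count for t in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 4.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the final vector has unit length, the cosine similarity listed as the model's similarity function reduces to a plain dot product.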

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Akryl/modernbert-embed-base-akryl-matryoshka")
# Run inference
queries = [
    "\u003c1-hop\u003e\n\n4.2.6 PDM SACCO Operations\n\uf0b7 A loan applicant must be a member of a registered subsistence household on the PDMIS, be a member of a PDM Enterprise Group that is a member of the PDM SACCO.\n\uf0b7 All beneficiaries should be members of a registered subsistence household on the Parish Development Management Information System (applies before 5th June 2023).\n\uf0b7 Subsistence  households  applying  to  access  PRF  should  be  determined  and  selected  at village level through a vetting meeting convened by the enterprise groups and attended by LC1 Chairpersons (applies after 5th June 2023).\n\uf0b7 For  farming enterprises, the borrower must obtain an agriculture insurance policy under the Uganda Agriculture Insurance Scheme (UAIS).\nI made the following observations;\n1., Activity = Selection  and  Implementation  of  Prioritized/Flagship  Projects. 1., Observations = \uf0b7 All the 10 parishes  did  not  flagship  contrary  to  guidelines.   \uf0b7 All the 10 parishes  selected  projects  that  were  inconsistent  the  LG  priority  commodities.   \uf0b7 11  out  of  farmer  enterprises/house holds implemented  projects  that  are. 1., Management Response = select  projects  the  flagship  with  selected  20  that  sensitizations  utilization  of  projects  by  various  fora  Beneficiaries  advised  to  experiences  Frequent  beneficiaries  encouraged  operate.. 1., Management Response = The  Accounting  Officer  explained  on  proper  PRF  on  prioritized  all  stakeholders  at  is  ongoing.  of  PRF  have  been  conduct  monthly  meetings  for  members  to  share  and  challenges.  visits  among  of  PRF  are  also  like  the  way  VSL. 2., Activity = Insurance  Policy  for  Farming Enterprises.. 2., Observations = Appendix 5 (g) I noted that all the 11 PRF  beneficiaries  who  carried  out  farming  enterprises  in  8  PDM  SACCOs  did  not  obtain  agricultural  insurance  policies  from  UAIS.  Refer  to  Appendix. \n2., Management Response = The  Accounting  Officer  explained  that since the selected households  have  received  enterprises  will  obtain  agricultural  policies  from  guidelines put in place.. 2., Management Response = PRF,  farming  be  mobilised  to  insurance  UAIS  per  the",
]
documents = [
    'What are the requirements for subsistence households to access PRF, and how does the insurance policy requirement for farming enterprises relate to these conditions?',
    'How do the financial figures for net assets and cash balances compare between the years ending 30 June 2017 and 30 June 2021, and what trends can be observed in the financial statements during this period?',
    'What is the management responsibility and role of the Accounting Officer in preparing financial statements for Kalungu District Local Government?',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7440, 0.3670, 0.5151]])
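
Since the model was trained with Matryoshka dimensions 768, 512, and 256, a shorter embedding can be obtained by keeping the leading components and re-normalizing. The sketch below shows only that arithmetic in pure Python; the random vectors are stand-ins for real model outputs, and `truncate_and_normalize` and `cosine` are hypothetical helper names. In practice, Sentence Transformers offers a `truncate_dim` argument on the `SentenceTransformer` constructor for this purpose.

```python
import math
import random

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components, then rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head]

def cosine(a, b):
    """Dot product of two unit-length vectors, which equals their cosine."""
    return sum(x * y for x, y in zip(a, b))

# Stand-in vectors (assumption): a random 768-d "query" and a noisy copy as "document".
random.seed(0)
full_query = [random.gauss(0, 1) for _ in range(768)]
full_doc = [q + random.gauss(0, 1) for q in full_query]

# Compare the pair at each dimensionality the model was trained on.
for dim in (768, 512, 256):
    q = truncate_and_normalize(full_query, dim)
    d = truncate_and_normalize(full_doc, dim)
    print(dim, round(cosine(q, d), 4))
```

Re-normalizing after truncation matters because the leading slice of a unit vector is generally no longer unit length.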

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@3 0.6585
cosine_accuracy@5 0.7317
cosine_accuracy@7 0.8537
cosine_precision@3 0.2195
cosine_precision@5 0.1463
cosine_precision@7 0.122
cosine_recall@3 0.6098
cosine_recall@5 0.6829
cosine_recall@7 0.8049
cosine_ndcg@3 0.4949
cosine_ndcg@5 0.5254
cosine_ndcg@7 0.5671
cosine_mrr@10 0.5102
cosine_map@100 0.4906

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@3 0.6341
cosine_accuracy@5 0.7805
cosine_accuracy@7 0.8537
cosine_precision@3 0.2114
cosine_precision@5 0.1561
cosine_precision@7 0.122
cosine_recall@3 0.5854
cosine_recall@5 0.7317
cosine_recall@7 0.8049
cosine_ndcg@3 0.5047
cosine_ndcg@5 0.5645
cosine_ndcg@7 0.5889
cosine_mrr@10 0.5474
cosine_map@100 0.5138

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@3 0.6829
cosine_accuracy@5 0.7805
cosine_accuracy@7 0.8537
cosine_precision@3 0.2276
cosine_precision@5 0.1561
cosine_precision@7 0.122
cosine_recall@3 0.6341
cosine_recall@5 0.7317
cosine_recall@7 0.8049
cosine_ndcg@3 0.4859
cosine_ndcg@5 0.5279
cosine_ndcg@7 0.5529
cosine_mrr@10 0.4897
cosine_map@100 0.4687
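
For readers unfamiliar with these metrics, the sketch below computes them for a single query with binary relevance. It is a simplified illustration under stated assumptions, not the evaluator's actual code; the document IDs and the function name `metrics_at_k` are hypothetical. As a side note, values such as 0.6585 ≈ 27/41 and 0.2195 ≈ 27/123 are consistent with about 41 evaluation queries, though that count is inferred from the numbers and is not stated in the card.

```python
import math

def metrics_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance retrieval metrics for one query, cut off at rank k."""
    hits = [1 if doc in relevant_ids else 0 for doc in ranked_ids[:k]]

    # Reciprocal rank of the first relevant document within the top k.
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / rank
            break

    # Discounted cumulative gain against the best achievable ordering.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "accuracy": 1.0 if any(hits) else 0.0,    # at least one relevant hit
        "precision": sum(hits) / k,               # share of top-k that is relevant
        "recall": sum(hits) / len(relevant_ids),  # share of relevant docs retrieved
        "mrr": reciprocal_rank,
        "ndcg": dcg / idcg if idcg else 0.0,
    }

# Toy ranking: two relevant documents, one of them retrieved at rank 2.
result = metrics_at_k(["d4", "d1", "d7", "d2"], {"d1", "d2"}, k=3)
print(result)
```

The reported values would then be averages of these per-query numbers over the evaluation set.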

Training Details

Training Dataset

Unnamed Dataset

  • Size: 402 training samples
  • Columns: text and question
  • Approximate statistics based on the first 402 samples:
    text (string):
    • min: 39 tokens
    • mean: 279.24 tokens
    • max: 698 tokens
    question (string):
    • min: 8 tokens
    • mean: 28.59 tokens
    • max: 76 tokens
  • Samples:
    Sample 1
    text:
    <2-hop>

    4.1.1 Positive observations
    I noted the following areas where management had commendable performance;
     The water grant was incorporated into the entity's budget which was approved by Parliament/Council for release and implementation.
     I noted that 6 out of 6 (100%) of the budgeted projects were provided for in the approved five-year development plan.
     All the projects implemented were eligible.
     There was an agreement between the land owners and the community members to protect government's rights to ownership of the land where the project is being constructed.
    11
    question:
    How were fund management and budget approval handled in the Education Development grant projects?
    Sample 2
    text:
    Auditor's Responsibilities for the audit of the Financial Statements
    From the matters communicated with the Accounting Officer, I determine those matters that were of most significance in the audit of the financial statements of the current period and are therefore the key audit matters. I describe these matters in my auditor's report unless law or regulation precludes public disclosure about the matter or when, in extremely rare circumstances, I determine that a matter should not be communicated in my report because the adverse consequences of doing so would reasonably be expected to outweigh the public interest benefits of such communication.
    question:
    What are the auditor's responsibilities regarding financial statements?
    Sample 3
    text:
    <1-hop>

    Auditor's Responsibilities for the audit of the Financial Statements
    My objectives are to obtain reasonable assurance about whether the financial statements as a whole are free from material misstatement, whether due to fraud or error, and to issue an auditor's report that includes my opinion. Reasonable assurance is a high level of assurance but is not a guarantee that an audit conducted in accordance with ISSAIs will always detect a material misstatement, when it exists. Misstatements can arise from fraud or error and are considered material if, individually or in aggregate, they could reasonably be expected to influence the economic decisions of users, taken on the basis of these financial statements.
    As part of an audit in accordance with ISSAIs, I exercise professional judgment and maintain professional skepticism throughout the audit. I also:
     Identify and assess the risks of material misstatement of the financial statements, whether due to fraud ...
    question:
    What are the key responsibilities of an auditor in ensuring financial statements are free from material misstatement?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256
        ],
        "matryoshka_weights": [
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
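
To make the loss configuration above concrete, here is a minimal pure-Python sketch of the idea: an in-batch-negatives ranking loss is computed on the embeddings truncated to each Matryoshka dimension, and the results are combined using the weights. This is an illustration under assumptions, not the library's implementation. The similarity scale of 20.0, the 6-dimensional toy vectors, and the toy dimensions (6, 4, 2) standing in for (768, 512, 256) are all assumptions, and the function names are hypothetical.

```python
import math

def normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities.

    positives[i] is the correct match for anchors[i]; every other
    positive in the batch serves as a negative.
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, cand)) for cand in p]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over each truncated dimensionality."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * mnr_loss([v[:dim] for v in anchors],
                                   [v[:dim] for v in positives])
    return total

# Toy batch of two (text, question) pairs.
texts = [[0.9, 0.1, 0.0, 0.2, 0.0, 0.1], [0.0, 0.8, 0.3, 0.0, 0.1, 0.0]]
questions = [[1.0, 0.0, 0.1, 0.1, 0.0, 0.0], [0.1, 1.0, 0.2, 0.0, 0.0, 0.1]]
print(matryoshka_loss(texts, questions, dims=[6, 4, 2], weights=[1, 1, 1]))
```

With `n_dims_per_step` set to -1, every listed dimension contributes at every training step, which is what the loop over all `dims` mirrors.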
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 64
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
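
A rough arithmetic sketch shows how these settings interact. With a per-device train batch size of 8 (from the full hyperparameter list) and 64 accumulation steps, the effective batch of 512 exceeds the 402 training samples, so each epoch would amount to a single optimizer update, which is consistent with the Step column of the training logs. The code assumes a single device, a batch count unaffected by the no_duplicates sampler, and the usual linear-warmup-plus-cosine shape for the schedule; it is not taken from the trainer itself.

```python
import math

train_samples = 402
per_device_train_batch_size = 8
gradient_accumulation_steps = 64
num_train_epochs = 4
learning_rate = 2e-05
warmup_ratio = 0.1

# Batches per epoch, then optimizer updates per epoch (at least one).
batches_per_epoch = math.ceil(train_samples / per_device_train_batch_size)    # 51
updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)  # 1
total_steps = updates_per_epoch * num_train_epochs                            # 4
warmup_steps = math.ceil(total_steps * warmup_ratio)                          # 1

def cosine_lr(step):
    """Linear warmup followed by cosine decay to zero (assumed schedule shape)."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in range(total_steps + 1):
    print(step, cosine_lr(step))
```

With so few optimizer updates, the warmup and decay each span only a step or two, so the schedule is very coarse.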

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 64
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step dim_768_cosine_ndcg@7 dim_512_cosine_ndcg@7 dim_256_cosine_ndcg@7
1.0 1 0.5313 0.4963 0.5033
2.0 2 0.5533 0.5192 0.5376
3.0 3 0.5721 0.5729 0.5536
4.0 4 0.5671 0.5889 0.5529
  • The bold row denotes the saved checkpoint.
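
The log table can be read programmatically to see which epoch scores best at each dimension. The sketch below copies the four rows from the table and compares them; the averaging across dimensions is only one possible selection criterion and is an assumption here, since the card does not state which metric was used to pick the saved checkpoint.

```python
# (epoch, dim_768, dim_512, dim_256) NDCG@7 values copied from the table above.
rows = [
    (1.0, 0.5313, 0.4963, 0.5033),
    (2.0, 0.5533, 0.5192, 0.5376),
    (3.0, 0.5721, 0.5729, 0.5536),
    (4.0, 0.5671, 0.5889, 0.5529),
]
names = ("dim_768", "dim_512", "dim_256")

# Best epoch for each dimension taken separately.
best_epoch = {
    name: max(rows, key=lambda row: row[i + 1])[0]
    for i, name in enumerate(names)
}

# Mean NDCG@7 across the three dimensions for each epoch.
mean_by_epoch = {row[0]: sum(row[1:]) / len(names) for row in rows}
best_mean_epoch = max(mean_by_epoch, key=mean_by_epoch.get)

print(best_epoch)
print(best_mean_epoch, round(mean_by_epoch[best_mean_epoch], 4))
```

The per-dimension winners differ (epoch 3 for 768 and 256, epoch 4 for 512), which is why the choice of selection metric matters on an evaluation set this small.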

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.1
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.1.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}