metadata
language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:402
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: nomic-ai/modernbert-embed-base
widget:
  - source_sentence: >-
      <1-hop>


      Opinion

      I  have  audited  the  financial  statements  of  the  Ministry  of 
      Defence  and  Veteran  Affairs (MoDVA),  which  comprise  the  Statement 
      of  Financial  Position  as  at  30 th June  2023,  the Statement of
      Financial Performance, Statement of Changes in Equity and Statement of
      Cash Flows, together with other accompanying statements for the year then
      ended, and notes to the financial statements, including a summary of
      significant accounting policies.

      In my opinion, the accompanying financial statements of the Ministry of
      Defence and Veteran Affairs for the financial year ended 30 th  June 2023
      are prepared, in all material respects, in accordance with Section 51 of
      the Public Finance Management Act (PFMA), 2015 and the Financial Reporting
      Guide, 2018 (as amended).
    sentences:
      - >-
        How does the audit process for Kalungu District Local Government and
        Pader District Local Government follow the Constitution of the Republic
        of Uganda and what standards are used to ensure compliance with ethical
        and legal requirements?
      - What financial statements were audited for MoDVA and KCCA?
      - >-
        How were the water grant funds utilized in the rehabilitation of
        existing water sources and the drilling of boreholes, and what were the
        outcomes of these projects?
  - source_sentence: >-
      <2-hop>


      2.0 Management of the Government Salary Payroll

      gratuity hence low = . , Observation Utilization of the Wage Budget The
      DLG had an approved wage budget of UGX.14,627,681,773, and The obtained
      supplementary funding of UGX.11,328,430,742 resulting into a.which.Unspent
      Balance UGX. Bn.10.689.From the analysis, I noted that; There was an under
      absorption of UGX.10,689,218,043 The supplementary funding of
      UGX.10,814,119,300 was not utilized. The Accounting Officer explained that
      there was a ban on recruitment and a s such the district could recruit
      staff to absorb the wage..He further explained that the incomplete
      documentation   prevented pensioners   from   accessing   their   pensions
      and   gratuity hence low = . , Recommendation Accounting should laisse
      with.of Public service to clear recruitments to be able to absorb the
      wage. Ministry.... = the Accounting Officer should. , Recommendation
      Accounting should laisse with..... = . , Observation Utilization of the
      Wage Budget The DLG had an approved wage budget of UGX.14,627,681,773, and
      The obtained supplementary funding of UGX.11,328,430,742 resulting into
      a.revised UGX.25,831,211,258 (99.5.Approve d Budget UGX. Bn.14.627.From
      the analysis, I noted that; There was an under absorption of
      UGX.10,689,218,043 The supplementary funding of UGX.10,814,119,300 was not
      utilized. The Accounting Officer explained that there was a ban on
      recruitment and a s such the district could recruit staff to absorb the
      wage..He further explained that the incomplete documentation   prevented
      pensioners   from   accessing   their   pensions and
    sentences:
      - >-
        How does the utilization of the wage budget compare between the initial
        approved budget of UGX. Bn.14.627 and the revised budget, and what were
        the reasons for the unspent balance of UGX. Bn.10.689?
      - >-
        How did the budget cuts affect the Uganda Road Fund's maintenance
        activities and what were the actual expenditures for routine mechanized
        maintenance?
      - >-
        How does the disbursement of Parish Revolving Fund affect the financial
        operations of PDM SACCOs?
  - source_sentence: >-
      <2-hop>


      2.0 Management of the Government Salary Payroll

      In a letter to the Auditor General dated 29th November 2022 referenced HRM
      155/222/02, the Minister of Finance, Planning and Economic Development
      (MoFPED) highlighted that, despite the reforms introduced by Government to
      mitigate against persistent supplementary requests for additional funds to
      cater for wage shortfalls, there has not been significant results and yet
      expenditure on wage is a substantial percentage of all entity budgets.
      Other anomalies highlighted included: payments for non-existent employees 
      underpayments to staff and irregular overpayments to staff, among others.

      Accordingly, I carried out a special audit on wage payroll in Local
      Government (LG) entities to establish the root causes of the identified
      challenges and propose remedial measures.  The audit covered four (4) FYs
      from 2019/2020 to 2022/2023 to which I issued a separate detailed audit
      report and below is a summary of the findings from the special audit; key

      Iganga DLG had a wage budget of UGX.31,651,868,130, out of which
      UGX.30,714,036,110 was utilised for the period under review. Below is a
      summary of the key findings from the special audit;
    sentences:
      - >-
        What are key audit matters and how they related to audit of financial
        statements?
      - >-
        What were the key findings of the special audit on wage payroll in Local
        Government entities, and how did the wage budget utilization compare
        between Lira DLG and Iganga DLG?
      - >-
        What topics are covered in the Auditor General's report on Namutumba
        District Local Government?
  - source_sentence: >-
      a. The physical structure is condemned

      The  Ministry  of  health  engineering  and  Physical  planning 
      department  declared  the Grade  A  hospital  structure  unfit  for 
      human  occupation  due  to  the  dilapidated structure and broken plumbing
      and electrical components that were beyond repair.

      This  therefore,  reduced  the  number  of  beds  available  in  the 
      Regional  Referral Hospital to the public. Refer to pictures below;

      15

      Services that were exclusively offered at the campus i.e. the Diabetic
      clinic and the Mental disabilities clinic have been affected and have
      since not gotten a permanent placement in the regional referral hospital.

      The Accounting Officer explained that renovation works were set to
      commence with the project currently under detailed technical review by the
      infrastructure division of the Ministry of Health.
    sentences:
      - Why did the Ministry of Health declare the hospital structure unfit?
      - What is the expenditure for Ihe service delivery under focus areas?
      - >-
        How did the DLGs and LGs fail to meet their deadlines and what were the
        consequences for both procurement and reporting?
  - source_sentence: >-
      <1-hop>


      4.2.6 PDM SACCO Operations

       A loan applicant must be a member of a registered subsistence household
      on the PDMIS, be a member of a PDM Enterprise Group that is a member of
      the PDM SACCO.

       All beneficiaries should be members of a registered subsistence
      household on the Parish Development Management Information System (applies
      before 5th June 2023).

       Subsistence  households  applying  to  access  PRF  should  be 
      determined  and  selected  at village level through a vetting meeting
      convened by the enterprise groups and attended by LC1 Chairpersons
      (applies after 5th June 2023).

       For  farming enterprises, the borrower must obtain an agriculture
      insurance policy under the Uganda Agriculture Insurance Scheme (UAIS).

      I made the following observations;

      1., Activity = Selection  and  Implementation  of  Prioritized/Flagship 
      Projects. 1., Observations =  All the 10 parishes  did  not  flagship 
      contrary  to  guidelines.    All the 10 parishes  selected  projects 
      that  were  inconsistent  the  LG  priority  commodities.    11  out  of 
      farmer  enterprises/house holds implemented  projects  that  are. 1.,
      Management Response = select  projects  the  flagship  with  selected  20 
      that  sensitizations  utilization  of  projects  by  various  fora 
      Beneficiaries  advised  to  experiences  Frequent  beneficiaries 
      encouraged  operate.. 1., Management Response = The  Accounting  Officer 
      explained  on  proper  PRF  on  prioritized  all  stakeholders  at  is 
      ongoing.  of  PRF  have  been  conduct  monthly  meetings  for  members 
      to  share  and  challenges.  visits  among  of  PRF  are  also  like  the 
      way  VSL. 2., Activity = Insurance  Policy  for  Farming Enterprises.. 2.,
      Observations = Appendix 5 (g) I noted that all the 11 PRF  beneficiaries 
      who  carried  out  farming  enterprises  in  8  PDM  SACCOs  did  not 
      obtain  agricultural  insurance  policies  from  UAIS.  Refer  to 
      Appendix. 2., Management Response = The  Accounting  Officer  explained 
      that since the selected households  have  received  enterprises  will 
      obtain  agricultural  policies  from  guidelines put in place.. 2.,
      Management Response = PRF,  farming  be  mobilised  to  insurance  UAIS 
      per  the
    sentences:
      - >-
        How do the financial figures for net assets and cash balances compare
        between the years ending 30 June 2017 and 30 June 2021, and what trends
        can be observed in the financial statements during this period?
      - >-
        What are the requirements for subsistence households to access PRF, and
        how does the insurance policy requirement for farming enterprises relate
        to these conditions?
      - >-
        What is the management responsibility and role of the Accounting Officer
        in preparing financial statements for Kalungu District Local Government?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@7
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@7
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@7
  - cosine_ndcg@3
  - cosine_ndcg@5
  - cosine_ndcg@7
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: ModernBERT Embed base Akryl Matryoshka
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@3
            value: 0.6585365853658537
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.7317073170731707
            name: Cosine Accuracy@5
          - type: cosine_accuracy@7
            value: 0.8536585365853658
            name: Cosine Accuracy@7
          - type: cosine_precision@3
            value: 0.21951219512195122
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.14634146341463417
            name: Cosine Precision@5
          - type: cosine_precision@7
            value: 0.12195121951219512
            name: Cosine Precision@7
          - type: cosine_recall@3
            value: 0.6097560975609756
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.6829268292682927
            name: Cosine Recall@5
          - type: cosine_recall@7
            value: 0.8048780487804879
            name: Cosine Recall@7
          - type: cosine_ndcg@3
            value: 0.49491499938801414
            name: Cosine Ndcg@3
          - type: cosine_ndcg@5
            value: 0.5253590462997537
            name: Cosine Ndcg@5
          - type: cosine_ndcg@7
            value: 0.5671252505489257
            name: Cosine Ndcg@7
          - type: cosine_mrr@10
            value: 0.5102497096399535
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.49059516966021033
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@3
            value: 0.6341463414634146
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.7804878048780488
            name: Cosine Accuracy@5
          - type: cosine_accuracy@7
            value: 0.8536585365853658
            name: Cosine Accuracy@7
          - type: cosine_precision@3
            value: 0.2113821138211382
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.15609756097560976
            name: Cosine Precision@5
          - type: cosine_precision@7
            value: 0.12195121951219512
            name: Cosine Precision@7
          - type: cosine_recall@3
            value: 0.5853658536585366
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.7317073170731707
            name: Cosine Recall@5
          - type: cosine_recall@7
            value: 0.8048780487804879
            name: Cosine Recall@7
          - type: cosine_ndcg@3
            value: 0.5047183708109944
            name: Cosine Ndcg@3
          - type: cosine_ndcg@5
            value: 0.5645375926627944
            name: Cosine Ndcg@5
          - type: cosine_ndcg@7
            value: 0.5889278365652334
            name: Cosine Ndcg@7
          - type: cosine_mrr@10
            value: 0.547444831591173
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.5138320685383551
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@3
            value: 0.6829268292682927
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.7804878048780488
            name: Cosine Accuracy@5
          - type: cosine_accuracy@7
            value: 0.8536585365853658
            name: Cosine Accuracy@7
          - type: cosine_precision@3
            value: 0.22764227642276424
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.15609756097560976
            name: Cosine Precision@5
          - type: cosine_precision@7
            value: 0.12195121951219512
            name: Cosine Precision@7
          - type: cosine_recall@3
            value: 0.6341463414634146
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.7317073170731707
            name: Cosine Recall@5
          - type: cosine_recall@7
            value: 0.8048780487804879
            name: Cosine Recall@7
          - type: cosine_ndcg@3
            value: 0.4859132860604887
            name: Cosine Ndcg@3
          - type: cosine_ndcg@5
            value: 0.5279305112383806
            name: Cosine Ndcg@5
          - type: cosine_ndcg@7
            value: 0.5528786540133731
            name: Cosine Ndcg@7
          - type: cosine_mrr@10
            value: 0.48969221835075494
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.46868070953436813
            name: Cosine Map@100

ModernBERT Embed base Akryl Matryoshka

This is a sentence-transformers model fine-tuned from nomic-ai/modernbert-embed-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss at 768, 512, and 256 dimensions, and evaluation results are reported for each of these sizes.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/modernbert-embed-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
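
The Pooling and Normalize stages listed above can be illustrated with a small sketch. It assumes token embeddings and an attention mask are already available, uses tiny 3-dimensional vectors in place of the real 768-dimensional ones, and the function names `mean_pool` and `l2_normalize` are illustrative rather than part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, v in enumerate(vec):
                summed[i] += v
    count = max(count, 1)  # guard against an all-padding sequence
    return [s / count for s in summed]

def l2_normalize(vec):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Three token vectors; the last one is padding and is masked out.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)            # mean of the first two vectors
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the final stage normalizes the output, cosine similarity between two full-size embeddings reduces to a dot product.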

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Akryl/modernbert-embed-base-akryl-matryoshka")
# Run inference
queries = [
    "\u003c1-hop\u003e\n\n4.2.6 PDM SACCO Operations\n\uf0b7 A loan applicant must be a member of a registered subsistence household on the PDMIS, be a member of a PDM Enterprise Group that is a member of the PDM SACCO.\n\uf0b7 All beneficiaries should be members of a registered subsistence household on the Parish Development Management Information System (applies before 5th June 2023).\n\uf0b7 Subsistence  households  applying  to  access  PRF  should  be  determined  and  selected  at village level through a vetting meeting convened by the enterprise groups and attended by LC1 Chairpersons (applies after 5th June 2023).\n\uf0b7 For  farming enterprises, the borrower must obtain an agriculture insurance policy under the Uganda Agriculture Insurance Scheme (UAIS).\nI made the following observations;\n1., Activity = Selection  and  Implementation  of  Prioritized/Flagship  Projects. 1., Observations = \uf0b7 All the 10 parishes  did  not  flagship  contrary  to  guidelines.   \uf0b7 All the 10 parishes  selected  projects  that  were  inconsistent  the  LG  priority  commodities.   \uf0b7 11  out  of  farmer  enterprises/house holds implemented  projects  that  are. 1., Management Response = select  projects  the  flagship  with  selected  20  that  sensitizations  utilization  of  projects  by  various  fora  Beneficiaries  advised  to  experiences  Frequent  beneficiaries  encouraged  operate.. 1., Management Response = The  Accounting  Officer  explained  on  proper  PRF  on  prioritized  all  stakeholders  at  is  ongoing.  of  PRF  have  been  conduct  monthly  meetings  for  members  to  share  and  challenges.  visits  among  of  PRF  are  also  like  the  way  VSL. 2., Activity = Insurance  Policy  for  Farming Enterprises.. 2., Observations = Appendix 5 (g) I noted that all the 11 PRF  beneficiaries  who  carried  out  farming  enterprises  in  8  PDM  SACCOs  did  not  obtain  agricultural  insurance  policies  from  UAIS.  Refer  to  Appendix. 2., Management Response = The  Accounting  Officer  explained  that since the selected households  have  received  enterprises  will  obtain  agricultural  policies  from  guidelines put in place.. 2., Management Response = PRF,  farming  be  mobilised  to  insurance  UAIS  per  the",
]
documents = [
    'What are the requirements for subsistence households to access PRF, and how does the insurance policy requirement for farming enterprises relate to these conditions?',
    'How do the financial figures for net assets and cash balances compare between the years ending 30 June 2017 and 30 June 2021, and what trends can be observed in the financial statements during this period?',
    'What is the management responsibility and role of the Accounting Officer in preparing financial statements for Kalungu District Local Government?',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7440, 0.3670, 0.5151]])
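
Because the model was trained with MatryoshkaLoss at 768, 512, and 256 dimensions, embeddings can be shortened to one of those sizes. The sketch below shows the general idea on stand-in vectors rather than real model output: keep the leading dimensions, then rescale to unit length so that cosine similarity is still a plain dot product. The helper names and the random vectors are illustrative assumptions. With the real model, the `truncate_dim` argument of `SentenceTransformer(...)` can be used to request shorter embeddings directly.

```python
import math
import random

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0
    return [v / norm for v in head]

def cosine(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
# Stand-ins for a 768-dimensional query embedding and a related document embedding.
query = [random.gauss(0, 1) for _ in range(768)]
document = [value + random.gauss(0, 0.5) for value in query]

for dim in (768, 512, 256):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(document, dim)
    print(dim, len(q), round(cosine(q, d), 4))
```

Shorter vectors reduce storage and search cost; the per-dimension tables in the Evaluation section indicate how retrieval quality changed at each size on this model's evaluation set.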

Evaluation

Metrics

Information Retrieval (dataset: dim_768)

Metric Value
cosine_accuracy@3 0.6585
cosine_accuracy@5 0.7317
cosine_accuracy@7 0.8537
cosine_precision@3 0.2195
cosine_precision@5 0.1463
cosine_precision@7 0.122
cosine_recall@3 0.6098
cosine_recall@5 0.6829
cosine_recall@7 0.8049
cosine_ndcg@3 0.4949
cosine_ndcg@5 0.5254
cosine_ndcg@7 0.5671
cosine_mrr@10 0.5102
cosine_map@100 0.4906

Information Retrieval (dataset: dim_512)

Metric Value
cosine_accuracy@3 0.6341
cosine_accuracy@5 0.7805
cosine_accuracy@7 0.8537
cosine_precision@3 0.2114
cosine_precision@5 0.1561
cosine_precision@7 0.122
cosine_recall@3 0.5854
cosine_recall@5 0.7317
cosine_recall@7 0.8049
cosine_ndcg@3 0.5047
cosine_ndcg@5 0.5645
cosine_ndcg@7 0.5889
cosine_mrr@10 0.5474
cosine_map@100 0.5138

Information Retrieval (dataset: dim_256)

Metric Value
cosine_accuracy@3 0.6829
cosine_accuracy@5 0.7805
cosine_accuracy@7 0.8537
cosine_precision@3 0.2276
cosine_precision@5 0.1561
cosine_precision@7 0.122
cosine_recall@3 0.6341
cosine_recall@5 0.7317
cosine_recall@7 0.8049
cosine_ndcg@3 0.4859
cosine_ndcg@5 0.5279
cosine_ndcg@7 0.5529
cosine_mrr@10 0.4897
cosine_map@100 0.4687
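
To make the metric names above concrete, here is a minimal sketch of how accuracy@k, precision@k, recall@k, and reciprocal rank are conventionally computed for one ranked result list per query. It follows the standard information-retrieval definitions and is not taken from the evaluator's source; the function name `retrieval_metrics` and the toy rankings are made up for illustration.

```python
def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Return accuracy@k, precision@k, recall@k and reciprocal rank@k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    accuracy = 1.0 if hits else 0.0          # at least one relevant doc in the top k
    precision = len(hits) / k                # share of the top k that is relevant
    recall = len(hits) / len(relevant_ids)   # share of relevant docs that were found
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank     # position of the first relevant doc
            break
    return accuracy, precision, recall, reciprocal_rank

# Two toy queries: (ranking returned by a model, set of relevant documents).
queries = [
    (["d3", "d1", "d7"], {"d1"}),
    (["d2", "d5", "d9"], {"d4", "d9"}),
]
per_query = [retrieval_metrics(ranked, relevant, k=3) for ranked, relevant in queries]
# Reported scores are averages over all evaluation queries.
averages = [sum(column) / len(per_query) for column in zip(*per_query)]
print(averages)
```

In this toy case precision@3 equals accuracy@3 divided by 3, because no query has more than one hit in its top 3. The same relationship holds in the tables above (for example 0.6585 / 3 ≈ 0.2195), which is consistent with at most one relevant document appearing in the top k for each evaluation query.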

Training Details

Training Dataset

Unnamed Dataset

  • Size: 402 training samples
  • Columns: text and question
  • Approximate statistics based on the first 402 samples:
    • text (string): min 39 tokens, mean 279.24 tokens, max 698 tokens
    • question (string): min 8 tokens, mean 28.59 tokens, max 76 tokens
  • Samples:
    Sample 1
    text:
    <2-hop>

    4.1.1 Positive observations
    I noted the following areas where management had commendable performance;
     The water grant was incorporated into the entity's budget which was approved by Parliament/Council for release and implementation.
     I noted that 6 out of 6 (100%) of the budgeted projects were provided for in the approved five-year development plan.
     All the projects implemented were eligible.
     There was an agreement between the land owners and the community members to protect government's rights to ownership of the land where the project is being constructed.
    11
    question:
    How were fund management and budget approval handled in the Education Development grant projects?

    Sample 2
    text:
    Auditor's Responsibilities for the audit of the Financial Statements
    From the matters communicated with the Accounting Officer, I determine those matters that were of most significance in the audit of the financial statements of the current period and are therefore the key audit matters. I describe these matters in my auditor's report unless law or regulation precludes public disclosure about the matter or when, in extremely rare circumstances, I determine that a matter should not be communicated in my report because the adverse consequences of doing so would reasonably be expected to outweigh the public interest benefits of such communication.
    question:
    What are the auditor's responsibilities regarding financial statements?

    Sample 3
    text:
    <1-hop>

    Auditor's Responsibilities for the audit of the Financial Statements
    My objectives are to obtain reasonable assurance about whether the financial statements as a whole are free from material misstatement, whether due to fraud or error, and to issue an auditor's report that includes my opinion. Reasonable assurance is a high level of assurance but is not a guarantee that an audit conducted in accordance with ISSAIs will always detect a material misstatement, when it exists. Misstatements can arise from fraud or error and are considered material if, individually or in aggregate, they could reasonably be expected to influence the economic decisions of users, taken on the basis of these financial statements.
    As part of an audit in accordance with ISSAIs, I exercise professional judgment and maintain professional skepticism throughout the audit. I also:
     Identify and assess the risks of material misstatement of the financial statements, whether due to fraud ...
    question:
    What are the key responsibilities of an auditor in ensuring financial statements are free from material misstatement?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256
        ],
        "matryoshka_weights": [
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
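
The configuration above can be read as follows: compute MultipleNegativesRankingLoss once per listed dimension on truncated embeddings, then add the results using the given weights. The following is a simplified pure-Python sketch of that idea, not the library implementation. The `scale` of 20.0, the helper names, and the toy 4-dimensional embeddings with dims (4, 2) are assumptions made for illustration.

```python
import math

def normalize(vec):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch serves as a negative."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, cand)) for cand in p]
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embeddings."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        loss += weight * in_batch_ranking_loss(
            [v[:dim] for v in anchors], [v[:dim] for v in positives]
        )
    return loss

# Toy batch of (text, question) embedding pairs; the real run used dims 768, 512, 256.
texts = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3], [-1.0, 0.0, 0.3, 0.2]]
questions = [[0.9, 0.1, 0.1, 0.2], [0.1, 0.9, 0.2, 0.1], [-0.9, 0.1, 0.2, 0.3]]
aligned = matryoshka_loss(texts, questions, dims=[4, 2], weights=[1, 1])
shuffled = matryoshka_loss(texts, questions[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, shuffled)  # correctly paired batches give the lower loss
```

Since every truncated prefix contributes to the total, the leading dimensions are pushed to be useful on their own, which is what makes truncated embeddings usable at inference time.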
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 64
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
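
A quick back-of-the-envelope check shows how these settings interact with the small dataset. Assuming a single device and the `per_device_train_batch_size` of 8 listed under All Hyperparameters, one optimizer update consumes 8 × 64 = 512 samples, which is more than the 402 training samples, so each epoch amounts to a single update. The variable names below are illustrative, and the rounding is a simplification of what the trainer does internally.

```python
import math

train_samples = 402                # from the Training Dataset section
per_device_train_batch_size = 8    # listed under All Hyperparameters
gradient_accumulation_steps = 64
num_train_epochs = 4
num_devices = 1                    # assumption: single-device training

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * num_devices)
)
# At least one optimizer update happens per epoch, even with a partial accumulation.
updates_per_epoch = max(1, math.ceil(batches_per_epoch / gradient_accumulation_steps))
total_updates = updates_per_epoch * num_train_epochs

print(effective_batch_size, batches_per_epoch, updates_per_epoch, total_updates)
# 512 51 1 4
```

Four updates in total agrees with the Training Logs, where the step counter goes from 1 to 4 across the four epochs.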

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 64
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step dim_768_cosine_ndcg@7 dim_512_cosine_ndcg@7 dim_256_cosine_ndcg@7
1.0 1 0.5313 0.4963 0.5033
2.0 2 0.5533 0.5192 0.5376
3.0 3 0.5721 0.5729 0.5536
4.0 4 0.5671 0.5889 0.5529
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.1
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.1.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}