modernbert-embed-base

This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/modernbert-embed-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
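
For reference, the three stages above (ModernBERT encoder, mean pooling, L2 normalization) can be reproduced with plain transformers. This is a minimal sketch of what the pipeline computes, not the recommended loading path; the Sentence Transformers snippet in the Usage section below is the supported route.

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "IoannisKat1/modernbert-embed-base-new"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)  # ModernBertModel backbone

inputs = tokenizer(["example sentence"], padding=True, truncation=True,
                   max_length=8192, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**inputs).last_hidden_state  # (batch, seq_len, 768)

# (1) Pooling: attention-mask-aware mean over token embeddings
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2) Normalize: unit-length vectors, so dot product equals cosine similarity
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)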

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IoannisKat1/modernbert-embed-base-new")
# Run inference
sentences = [
    'What may impede authorities in the discharge of their responsibilities under Union law?',
    'The objectives and principles of Directive 95/46/EC remain sound, but it has not prevented fragmentation in the implementation of data protection across the Union, legal uncertainty or a widespread public perception that there are significant risks to the protection of natural persons, in particular with regard to online activity. Differences in the level of protection of the rights and freedoms of natural persons, in particular the right to the protection of personal data, with regard to the processing of personal data in the Member States may prevent the free flow of personal data throughout the Union. Those differences may therefore constitute an obstacle to the pursuit of economic activities at the level of the Union, distort competition and impede authorities in the discharge of their responsibilities under Union law. Such a difference in levels of protection is due to the existence of differences in the implementation and application of Directive 95/46/EC.',
    'This Regulation is without prejudice to international agreements concluded between the Union and third countries regulating the transfer of personal data including appropriate safeguards for the data subjects. Member States may conclude international agreements which involve the transfer of personal data to third countries or international organisations, as far as such agreements do not affect this Regulation or any other provisions of Union law and include an appropriate level of protection for the fundamental rights of the data subjects.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5042, 0.0865],
#         [0.5042, 1.0000, 0.2632],
#         [0.0865, 0.2632, 1.0000]])
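
Because the model was trained with MatryoshkaLoss (see Training Details), embeddings can also be truncated to 512, 256, 128, or 64 dimensions at some cost in retrieval quality (per-dimension metrics are listed under Evaluation). A minimal sketch using the truncate_dim argument, reusing the sentences list from the snippet above:

from sentence_transformers import SentenceTransformer

# Keep only the first 256 dimensions of each embedding
model = SentenceTransformer("IoannisKat1/modernbert-embed-base-new", truncate_dim=256)
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 256]

# Truncated vectors are no longer unit-length, but model.similarity()
# uses cosine similarity, which rescales them internally.
similarities = model.similarity(embeddings, embeddings)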

Evaluation

Metrics

The five tables below report the same retrieval metrics at the five Matryoshka embedding dimensions (768, 512, 256, 128, and 64); their ndcg@10 values correspond to the dim_*_cosine_ndcg@10 columns in the training logs.

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.402
cosine_accuracy@3 0.4052
cosine_accuracy@5 0.4289
cosine_accuracy@10 0.4609
cosine_precision@1 0.402
cosine_precision@3 0.4012
cosine_precision@5 0.3913
cosine_precision@10 0.359
cosine_recall@1 0.0418
cosine_recall@3 0.1228
cosine_recall@5 0.1854
cosine_recall@10 0.2777
cosine_ndcg@10 0.422
cosine_mrr@10 0.4118
cosine_map@100 0.4808

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.3944
cosine_accuracy@3 0.3988
cosine_accuracy@5 0.4181
cosine_accuracy@10 0.4533
cosine_precision@1 0.3944
cosine_precision@3 0.3944
cosine_precision@5 0.3841
cosine_precision@10 0.3526
cosine_recall@1 0.0404
cosine_recall@3 0.1197
cosine_recall@5 0.1811
cosine_recall@10 0.2725
cosine_ndcg@10 0.414
cosine_mrr@10 0.4041
cosine_map@100 0.4723

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.386
cosine_accuracy@3 0.3924
cosine_accuracy@5 0.4168
cosine_accuracy@10 0.4481
cosine_precision@1 0.386
cosine_precision@3 0.3867
cosine_precision@5 0.3784
cosine_precision@10 0.3477
cosine_recall@1 0.0396
cosine_recall@3 0.1174
cosine_recall@5 0.1784
cosine_recall@10 0.2681
cosine_ndcg@10 0.4084
cosine_mrr@10 0.3969
cosine_map@100 0.4643

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.3534
cosine_accuracy@3 0.3598
cosine_accuracy@5 0.3848
cosine_accuracy@10 0.4142
cosine_precision@1 0.3534
cosine_precision@3 0.3538
cosine_precision@5 0.3461
cosine_precision@10 0.3195
cosine_recall@1 0.0365
cosine_recall@3 0.1076
cosine_recall@5 0.163
cosine_recall@10 0.2478
cosine_ndcg@10 0.3761
cosine_mrr@10 0.3641
cosine_map@100 0.4332

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.3079
cosine_accuracy@3 0.3156
cosine_accuracy@5 0.3348
cosine_accuracy@10 0.3694
cosine_precision@1 0.3079
cosine_precision@3 0.3092
cosine_precision@5 0.3027
cosine_precision@10 0.2804
cosine_recall@1 0.0315
cosine_recall@3 0.0937
cosine_recall@5 0.1426
cosine_recall@10 0.2173
cosine_ndcg@10 0.3297
cosine_mrr@10 0.3185
cosine_map@100 0.3854
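
Metrics of this kind can be recomputed with Sentence Transformers' InformationRetrievalEvaluator. A sketch with hypothetical placeholder data (the actual query/corpus/relevance mappings behind the tables above are not published with this card):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Hypothetical ids and texts; replace with your own evaluation split.
queries = {"q1": "What may impede authorities in the discharge of their responsibilities under Union law?"}
corpus = {"d1": "The objectives and principles of Directive 95/46/EC remain sound, ..."}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant corpus ids

model = SentenceTransformer("IoannisKat1/modernbert-embed-base-new")
evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_768")
results = evaluator(model)
print(results)  # includes keys such as dim_768_cosine_ndcg@10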

Training Details

Training Dataset

Unnamed Dataset

  • Size: 391 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 391 samples:
      anchor:   string; min: 7 tokens, mean: 15.05 tokens, max: 30 tokens
      positive: string; min: 25 tokens, mean: 667.99 tokens, max: 2429 tokens
  • Samples:

    anchor: On what date did the act occur?
    positive: Court (Civil/Criminal): Civil
      Provisions: Directive 2015/366, Law 4537/2018
      Time of the act: 31.08.2022
      Outcome (not guilty, guilty): Partially accepts the claim.
      Reasoning: The Athens Peace Court ordered the bank to return the amount that was withdrawn from the plaintiffs' account and to pay additional compensation for the moral damage they suffered.
      Facts: The case concerns plaintiffs who fell victim to electronic fraud via phishing, resulting in the withdrawal of money from their bank account. The plaintiffs claimed that the bank did not take the necessary security measures to protect their accounts and sought compensation for the financial loss and moral damage they suffered. The court determined that the bank is responsible for the loss of the money, as it did not prove that the transactions were authorized by the plaintiffs. Furthermore, the court recognized that the bank's refusal to return the funds constitutes an infringement of the plaintiffs' personal rights, as it...

    anchor: For what purposes can more specific rules be provided regarding the employment context?
    positive: 1. Member States may, by law or by collective agreements, provide for more specific rules to ensure the protection of the rights and freedoms in respect of the processing of employees' personal data in the employment context, in particular for the purposes of the recruitment, the performance of the contract of employment, including discharge of obligations laid down by law or by collective agreements, management, planning and organisation of work, equality and diversity in the workplace, health and safety at work, protection of employer's or customer's property and for the purposes of the exercise and enjoyment, on an individual or collective basis, of rights and benefits related to employment, and for the purpose of the termination of the employment relationship.
      2. Those rules shall include suitable and specific measures to safeguard the data subject's human dignity, legitimate interests and fundamental rights, with particular regard to the transparency of processing, the transfer of p...

    anchor: On which date were transactions detailed in the provided text conducted?
    positive: Court (Civil/Criminal): Civil
      Provisions:
      Time of commission of the act:
      Outcome (not guilty, guilty):
      Rationale:
      Facts: The plaintiff holds credit card number ............ with the defendant banking corporation. Based on the application for alternative networks dated 19/7/2015 with number ......... submitted at a branch of the defendant, he was granted access to the electronic banking service (e-banking) to conduct banking transactions (debit, credit, updates, payments) remotely. On 30/11/2020, the plaintiff fell victim to electronic fraud through the "phishing" method, whereby an unknown perpetrator managed to withdraw a total amount of €3,121.75 from the aforementioned credit card. Specifically, the plaintiff received an email at 1:35 PM on 29/11/2020 from sender ...... with address ........, informing him that due to an impending system change, he needed to verify the mobile phone number linked to the credit card, urging him to complete the verification...
  • Loss: MatryoshkaLoss with these parameters (a construction sketch follows this block):
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
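
A hedged construction sketch of the loss described above, assuming the nomic-ai/modernbert-embed-base base model; this illustrates the configuration, it is not the author's published training script:

from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("nomic-ai/modernbert-embed-base")
inner_loss = losses.MultipleNegativesRankingLoss(model)
loss = losses.MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all listed dimensions at every step
)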
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 20
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
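
These non-default values map directly onto SentenceTransformerTrainingArguments. A minimal sketch of a matching training run: the output path, save_strategy="epoch" (needed alongside load_best_model_at_end), and the train/eval datasets (with "anchor" and "positive" columns) are assumptions, and model/loss are as in the loss sketch above.

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="modernbert-embed-base-new",  # hypothetical output path
    eval_strategy="epoch",
    save_strategy="epoch",  # assumed; load_best_model_at_end requires matching strategies
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    num_train_epochs=20,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,                  # as constructed in the loss sketch above
    args=args,
    train_dataset=train_dataset,  # assumed Dataset with "anchor"/"positive" columns
    eval_dataset=eval_dataset,    # assumed held-out split
    loss=loss,
)
trainer.train()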

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.0102 1 0.0001 - - - - -
0.0204 2 0.001 - - - - -
0.0306 3 0.0938 - - - - -
0.0408 4 0.0084 - - - - -
0.0510 5 0.0 - - - - -
0.0612 6 0.0004 - - - - -
0.0714 7 0.003 - - - - -
0.0816 8 0.0012 - - - - -
0.0918 9 0.0001 - - - - -
0.1020 10 0.0053 - - - - -
0.1122 11 0.0068 - - - - -
0.1224 12 0.0006 - - - - -
0.1327 13 0.0007 - - - - -
0.1429 14 0.0003 - - - - -
0.1531 15 0.0096 - - - - -
0.1633 16 0.0004 - - - - -
0.1735 17 0.016 - - - - -
0.1837 18 0.0 - - - - -
0.1939 19 0.0005 - - - - -
0.2041 20 0.0 - - - - -
0.2143 21 0.003 - - - - -
0.2245 22 0.1395 - - - - -
0.2347 23 0.3967 - - - - -
0.2449 24 0.0023 - - - - -
0.2551 25 0.0003 - - - - -
0.2653 26 0.0027 - - - - -
0.2755 27 0.0147 - - - - -
0.2857 28 0.0522 - - - - -
0.2959 29 0.0001 - - - - -
0.3061 30 0.0008 - - - - -
0.3163 31 0.0044 - - - - -
0.3265 32 0.0 - - - - -
0.3367 33 0.0028 - - - - -
0.3469 34 0.0007 - - - - -
0.3571 35 0.0002 - - - - -
0.3673 36 0.0168 - - - - -
0.3776 37 0.0023 - - - - -
0.3878 38 0.0041 - - - - -
0.3980 39 0.0081 - - - - -
0.4082 40 0.0004 - - - - -
0.4184 41 0.0 - - - - -
0.4286 42 0.005 - - - - -
0.4388 43 0.0031 - - - - -
0.4490 44 0.0216 - - - - -
0.4592 45 0.0004 - - - - -
0.4694 46 0.0018 - - - - -
0.4796 47 0.0 - - - - -
0.4898 48 0.0044 - - - - -
0.5 49 0.0004 - - - - -
0.5102 50 0.0019 - - - - -
0.5204 51 0.0005 - - - - -
0.5306 52 0.0016 - - - - -
0.5408 53 0.1806 - - - - -
0.5510 54 0.0 - - - - -
0.5612 55 0.0025 - - - - -
0.5714 56 0.0002 - - - - -
0.5816 57 0.0 - - - - -
0.5918 58 0.0111 - - - - -
0.6020 59 0.0011 - - - - -
0.6122 60 0.0003 - - - - -
0.6224 61 1.8072 - - - - -
0.6327 62 0.0009 - - - - -
0.6429 63 0.0011 - - - - -
0.6531 64 0.0013 - - - - -
0.6633 65 0.0 - - - - -
0.6735 66 0.0007 - - - - -
0.6837 67 0.4116 - - - - -
0.6939 68 0.008 - - - - -
0.7041 69 0.0009 - - - - -
0.7143 70 0.0004 - - - - -
0.7245 71 0.0019 - - - - -
0.7347 72 0.0005 - - - - -
0.7449 73 0.0004 - - - - -
0.7551 74 0.0005 - - - - -
0.7653 75 0.0001 - - - - -
0.7755 76 0.0005 - - - - -
0.7857 77 0.0 - - - - -
0.7959 78 0.0001 - - - - -
0.8061 79 0.0025 - - - - -
0.8163 80 0.0 - - - - -
0.8265 81 0.0012 - - - - -
0.8367 82 0.0003 - - - - -
0.8469 83 0.0002 - - - - -
0.8571 84 0.0 - - - - -
0.8673 85 0.0 - - - - -
0.8776 86 0.0 - - - - -
0.8878 87 0.0002 - - - - -
0.8980 88 0.0009 - - - - -
0.9082 89 0.0067 - - - - -
0.9184 90 0.0 - - - - -
0.9286 91 0.0001 - - - - -
0.9388 92 0.0008 - - - - -
0.9490 93 0.0031 - - - - -
0.9592 94 0.0004 - - - - -
0.9694 95 0.0004 - - - - -
0.9796 96 0.0001 - - - - -
0.9898 97 0.0004 - - - - -
1.0 98 0.0005 0.4261 0.4154 0.4098 0.379 0.3357
1.0102 99 0.0006 - - - - -
1.0204 100 0.0011 - - - - -
1.0306 101 0.0006 - - - - -
1.0408 102 0.0 - - - - -
1.0510 103 0.0009 - - - - -
1.0612 104 0.0008 - - - - -
1.0714 105 0.0004 - - - - -
1.0816 106 0.0 - - - - -
1.0918 107 0.0005 - - - - -
1.1020 108 0.0007 - - - - -
1.1122 109 0.0003 - - - - -
1.1224 110 0.0001 - - - - -
1.1327 111 0.0001 - - - - -
1.1429 112 0.0006 - - - - -
1.1531 113 0.0005 - - - - -
1.1633 114 0.0013 - - - - -
1.1735 115 0.0 - - - - -
1.1837 116 0.0003 - - - - -
1.1939 117 0.0001 - - - - -
1.2041 118 0.0003 - - - - -
1.2143 119 0.001 - - - - -
1.2245 120 0.0 - - - - -
1.2347 121 0.0 - - - - -
1.2449 122 0.0001 - - - - -
1.2551 123 0.0011 - - - - -
1.2653 124 0.0019 - - - - -
1.2755 125 0.0 - - - - -
1.2857 126 0.0004 - - - - -
1.2959 127 0.0 - - - - -
1.3061 128 0.0 - - - - -
1.3163 129 0.0002 - - - - -
1.3265 130 0.0004 - - - - -
1.3367 131 0.0012 - - - - -
1.3469 132 0.0002 - - - - -
1.3571 133 0.0001 - - - - -
1.3673 134 0.0001 - - - - -
1.3776 135 0.0001 - - - - -
1.3878 136 0.0001 - - - - -
1.3980 137 0.0002 - - - - -
1.4082 138 0.0002 - - - - -
1.4184 139 0.0003 - - - - -
1.4286 140 0.0001 - - - - -
1.4388 141 0.0003 - - - - -
1.4490 142 0.0023 - - - - -
1.4592 143 0.0008 - - - - -
1.4694 144 0.0004 - - - - -
1.4796 145 0.0009 - - - - -
1.4898 146 0.0002 - - - - -
1.5 147 0.0 - - - - -
1.5102 148 0.0001 - - - - -
1.5204 149 0.0002 - - - - -
1.5306 150 0.0002 - - - - -
1.5408 151 0.0001 - - - - -
1.5510 152 0.0005 - - - - -
1.5612 153 0.0 - - - - -
1.5714 154 0.0001 - - - - -
1.5816 155 0.0003 - - - - -
1.5918 156 0.0001 - - - - -
1.6020 157 0.0006 - - - - -
1.6122 158 0.0002 - - - - -
1.6224 159 0.0201 - - - - -
1.6327 160 0.0003 - - - - -
1.6429 161 0.0003 - - - - -
1.6531 162 0.0001 - - - - -
1.6633 163 0.6487 - - - - -
1.6735 164 0.0013 - - - - -
1.6837 165 0.0 - - - - -
1.6939 166 0.0001 - - - - -
1.7041 167 0.0003 - - - - -
1.7143 168 0.0 - - - - -
1.7245 169 0.0001 - - - - -
1.7347 170 0.0 - - - - -
1.7449 171 0.0001 - - - - -
1.7551 172 0.0001 - - - - -
1.7653 173 0.0 - - - - -
1.7755 174 0.0001 - - - - -
1.7857 175 0.0001 - - - - -
1.7959 176 0.0006 - - - - -
1.8061 177 0.0006 - - - - -
1.8163 178 0.0001 - - - - -
1.8265 179 0.0026 - - - - -
1.8367 180 0.0003 - - - - -
1.8469 181 0.0001 - - - - -
1.8571 182 0.0003 - - - - -
1.8673 183 0.0068 - - - - -
1.8776 184 0.0004 - - - - -
1.8878 185 0.0 - - - - -
1.8980 186 0.0002 - - - - -
1.9082 187 0.0004 - - - - -
1.9184 188 0.0 - - - - -
1.9286 189 0.0002 - - - - -
1.9388 190 0.0002 - - - - -
1.9490 191 0.0001 - - - - -
1.9592 192 0.0 - - - - -
1.9694 193 0.0005 - - - - -
1.9796 194 0.0 - - - - -
1.9898 195 0.0002 - - - - -
2.0 196 0.0 0.4021 0.4038 0.4032 0.3706 0.3269
2.0102 197 0.0038 - - - - -
2.0204 198 0.0002 - - - - -
2.0306 199 0.3615 - - - - -
2.0408 200 0.0003 - - - - -
2.0510 201 0.0001 - - - - -
2.0612 202 0.0013 - - - - -
2.0714 203 0.0018 - - - - -
2.0816 204 0.0003 - - - - -
2.0918 205 0.0012 - - - - -
2.1020 206 0.0186 - - - - -
2.1122 207 0.0002 - - - - -
2.1224 208 0.0 - - - - -
2.1327 209 0.0 - - - - -
2.1429 210 0.0029 - - - - -
2.1531 211 0.0037 - - - - -
2.1633 212 0.0001 - - - - -
2.1735 213 0.0005 - - - - -
2.1837 214 0.0032 - - - - -
2.1939 215 0.0005 - - - - -
2.2041 216 0.0069 - - - - -
2.2143 217 0.0063 - - - - -
2.2245 218 0.0027 - - - - -
2.2347 219 0.0003 - - - - -
2.2449 220 0.0015 - - - - -
2.2551 221 0.0382 - - - - -
2.2653 222 0.0012 - - - - -
2.2755 223 0.0001 - - - - -
2.2857 224 0.007 - - - - -
2.2959 225 0.0 - - - - -
2.3061 226 0.0001 - - - - -
2.3163 227 0.0 - - - - -
2.3265 228 0.0003 - - - - -
2.3367 229 0.0001 - - - - -
2.3469 230 0.0013 - - - - -
2.3571 231 0.0038 - - - - -
2.3673 232 0.0161 - - - - -
2.3776 233 0.0 - - - - -
2.3878 234 0.0001 - - - - -
2.3980 235 0.0011 - - - - -
2.4082 236 0.0209 - - - - -
2.4184 237 0.0001 - - - - -
2.4286 238 0.0001 - - - - -
2.4388 239 1.2667 - - - - -
2.4490 240 0.0025 - - - - -
2.4592 241 0.023 - - - - -
2.4694 242 0.0001 - - - - -
2.4796 243 0.0 - - - - -
2.4898 244 0.0002 - - - - -
2.5 245 0.0037 - - - - -
2.5102 246 5.2145 - - - - -
2.5204 247 0.0072 - - - - -
2.5306 248 0.0006 - - - - -
2.5408 249 0.162 - - - - -
2.5510 250 0.0043 - - - - -
2.5612 251 0.0004 - - - - -
2.5714 252 0.0006 - - - - -
2.5816 253 0.0079 - - - - -
2.5918 254 0.002 - - - - -
2.6020 255 0.0003 - - - - -
2.6122 256 0.0003 - - - - -
2.6224 257 0.0046 - - - - -
2.6327 258 0.0002 - - - - -
2.6429 259 0.0001 - - - - -
2.6531 260 0.0001 - - - - -
2.6633 261 0.0118 - - - - -
2.6735 262 0.0 - - - - -
2.6837 263 0.0001 - - - - -
2.6939 264 0.0746 - - - - -
2.7041 265 0.0007 - - - - -
2.7143 266 0.0009 - - - - -
2.7245 267 0.0005 - - - - -
2.7347 268 0.8332 - - - - -
2.7449 269 0.0002 - - - - -
2.7551 270 0.0001 - - - - -
2.7653 271 0.0013 - - - - -
2.7755 272 0.0002 - - - - -
2.7857 273 0.0002 - - - - -
2.7959 274 0.0001 - - - - -
2.8061 275 0.0 - - - - -
2.8163 276 0.0008 - - - - -
2.8265 277 0.0001 - - - - -
2.8367 278 0.0008 - - - - -
2.8469 279 0.0077 - - - - -
2.8571 280 0.0078 - - - - -
2.8673 281 0.0021 - - - - -
2.8776 282 0.0 - - - - -
2.8878 283 0.5116 - - - - -
2.8980 284 0.0015 - - - - -
2.9082 285 0.0014 - - - - -
2.9184 286 0.0002 - - - - -
2.9286 287 0.0002 - - - - -
2.9388 288 0.0041 - - - - -
2.9490 289 0.0058 - - - - -
2.9592 290 0.0001 - - - - -
2.9694 291 0.0009 - - - - -
2.9796 292 0.0001 - - - - -
2.9898 293 0.0 - - - - -
3.0 294 0.0004 0.4220 0.4140 0.4084 0.3761 0.3297
  • The epoch 3.0 row (step 294) denotes the saved checkpoint; its ndcg@10 values match the metrics reported in the Evaluation section above.

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}