How to use IoannisKat1/modernbert-embed-base-new2 with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("IoannisKat1/modernbert-embed-base-new2")

sentences = [
    "Who should inform the lead supervisory authority without delay about the matter?",
    "The protection of natural persons with regard to the processing of personal data by competent authorities for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, including the safeguarding against and the prevention of threats to public security and the free movement of such data, is the subject of a specific Union legal act. This Regulation should not, therefore, apply to processing activities for those purposes. However, personal data processed by public authorities under this Regulation should, when used for those purposes, be governed by a more specific Union legal act, namely Directive (EU) 2016/680 of the European Parliament and of the Council (1). Member States may entrust competent authorities within the meaning of Directive (EU) 2016/680 with tasks which are not necessarily carried out for the purposes of the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, including the safeguarding against and prevention of threats to public security, so that the processing of personal data for those other purposes, in so far as it is within the scope of Union law, falls within the scope of this Regulation. With regard to the processing of personal data by those competent authorities for purposes falling within scope of this Regulation, Member States should be able to maintain or introduce more specific provisions to adapt the application of the rules of this Regulation. Such provisions may determine more precisely specific requirements for the processing of personal data by those competent authorities for those other purposes, taking into account the constitutional, organisational and administrative structure of the respective Member State. "
    "When the processing of personal data by private bodies falls within the scope of this Regulation, this Regulation should provide for the possibility for Member States under specific conditions to restrict by law certain obligations and rights when such a restriction constitutes a necessary and proportionate measure in a democratic society to safeguard specific important interests including public security and the prevention, investigation, detection or prosecution of criminal offences or the execution of criminal penalties, including the safeguarding against and the prevention of threats to public security. This is relevant for instance in the framework of anti-money laundering or the activities of forensic laboratories.",
    "**Court (Civil/Criminal): Civil**  \n**Provisions:**  \n**Time of commission of the act:**  \n**Outcome (not guilty, guilty):**  \n**Reasoning:** Partially accepts the lawsuit.  \n**Facts:** The plaintiff, who works as a lawyer, maintains a savings account with the defendant banking corporation under account number GR.............. Pursuant to a contract dated June 11, 2010, established in Thessaloniki between the defendant and the plaintiff, the plaintiff was granted access to the electronic banking system (e-banking) to conduct banking transactions remotely. On October 10, 2020, the plaintiff fell victim to electronic fraud through the \"phishing\" method, whereby an unknown perpetrator managed to extract and transfer €3,000.00 from the plaintiff’s account to another account of the same bank. Specifically, on that day at 6:51 a.m., the plaintiff received an email from the sender \".........\", with the address ..........., informing him that his debit card had been suspended and that online payments and cash withdrawals could not be made until the issue was resolved. The email urged him to confirm his details within the next 72 hours by following a link titled \"card activation.\"  \nThe plaintiff read the above email on his mobile phone around 8:00 a.m., and believing it came from the defendant, he followed the instructions and accessed a website that was identical (a clone) to that of the defendant. On this page, he was asked to enter his login credentials to connect to the service, which he did, and he was subsequently asked to input his debit card details for the alleged activation, which he also provided. Then, to complete the process, a number was sent to his mobile phone at 8:07 a.m. from the sender ........, which he entered, and two minutes later he received a message from the same sender in English stating that the quick access code had been activated on his mobile. "
    "A few minutes later, at 8:18 a.m., he received an email from the defendant informing him of the transfer of €3,000.00 from his account to account number GR ........... held at the same bank, with the beneficiary's details being .......... As soon as the plaintiff read this, he immediately called the defendant's call center and canceled his debit card, the access codes for the service ......., and locked the application .......... At the same time, he verbally submitted a request to dispute and cancel the contested transaction, and in a subsequent phone call, he also canceled his credit card. On the same day, he also sent an email to the defendant informing them in writing of the above and requesting the cancellation of the transaction and the return of the amount of €3,000.00 to his account, as this transfer was not made by him but by an unknown perpetrator through electronic fraud and was not approved by him. It should also be noted that the plaintiff, as the sole beneficiary according to the aforementioned contract for using the defendant's Internet Banking service, never received any update via SMS or the VIBER application from the bank regarding the transaction details before its completion, nor did he receive a one-time code (OTP) to approve the contested transaction. He subsequently filed a complaint against unknown persons at the Cyber Crime Division for the crime of fraud. The defendant sent an email to the plaintiff on October 16, 2020, informing him that his request had been forwarded to the appropriate department of the bank for investigation, stating that the bank would never send him an email or SMS asking him to enter his personal data and that as of October 7, 2020, there was a notice posted for its customers regarding malicious attempts to steal personal data in the \"Our News\" section on ....... "
    "A month after the disputed incident, on November 10, 2020, an amount of €2,296.82 was transferred to the plaintiff's account from the account to which the fraudulent credit had been made. The plaintiff immediately sent an email to the defendant asking to be informed whether this transfer was a return of part of the amount that had been illegally withdrawn from his account and requested the return of the remaining amount of €703.18. In its response dated January 13, 2021, the defendant confirmed that the aforementioned amount indeed came from the account to which the fraudulent credit had been made, following a freeze of that account initiated by the defendant during the investigation of the incident, but refused to return the remaining amount, claiming it bore no responsibility for the leak of the personal codes to third parties, according to the terms of the service contract established between them.  \nFrom the entirety of the evidence presented to the court, there is no indication of the authenticity of the contested transaction, as the plaintiff did not give his consent for the execution of the transfer of the amount of €3,000.00, especially in light of the provision in Article 72 paragraph 2 of Law 4537/2018 stating that the mere use of the Internet Banking service by the plaintiff does not necessarily constitute sufficient evidence that the payer approved the payment action. "
    "Specifically, it was proven that the contested transaction was not carried out following a strong identification of the plaintiff – the sole beneficiary of the account – and his approval, as the latter may have entered his personal codes on the counterfeit website; however, he was never informed, before the completion of the contested transaction, of the amount that would be transferred from his account to a third-party account, nor did he receive on his mobile phone, either via SMS or through the VIBER application or any other means, the one-time code - extra PIN for its completion, which he was required to enter to approve the contested transaction (payment action) and thus complete his identification, a fact that was not countered by any evidence from the defendant. Furthermore, it is noted that the defendant's claims that it bears no responsibility under the terms of the banking services contract, whereby it is not liable for any damage to its customer in cases of unauthorized use of their personal access codes to the Internet Banking service, are to be rejected as fundamentally unfounded. "
    "This is because the aforementioned contractual terms are invalid according to the provision of Article 103 of Law 4537/2018, as they contradict the provisions of Articles 71, 73, and 92 of the same Law, which provide for the provider's universal liability and its exemption only for unusual and unforeseen circumstances that are beyond the control of the party invoking them and whose consequences could not have been avoided despite all efforts to the contrary; these provisions establish mandatory law in favor of users, as according to Article 103 of Law 4537/2018, payment service providers are prohibited from deviating from the provisions to the detriment of payment service users, unless the possibility of deviation is explicitly provided and they can decide to offer only more favorable terms to payment service users; the aforementioned contractual terms do not constitute more favorable terms but rather disadvantageous terms for the payment service user. In this case, however, the defendant did not prove the authenticity of the transaction and its approval by the plaintiff and did not invoke, nor did any unusual and unforeseen circumstances beyond its control, the consequences of which could not have been avoided despite all efforts to the contrary, come to light. Therefore, the contested transaction transferring the amount of €3,000.00 is considered, in the absence of demonstrable consent from the plaintiff, unapproved according to the provisions of Article 64 of Law 4537/2018, and the defendant's contrary claims are rejected, especially since the plaintiff proceeded, according to Article 71 paragraph 1 of Law 4537/2018, without undue delay to notify the defendant regarding the contested unapproved payment action. "
    "Consequently, the defendant is liable for compensating the plaintiff for the positive damage he suffered under Article 73 of Law 4537/2018 and is obliged to pay him the requested amount of €703.18, while the plaintiff’s fault in the occurrence of this damage cannot be established, as he entered his personal details in an online environment that was a faithful imitation of that of the defendant, as evidenced by the comparison of the screenshots of the fake website and the real website provided by the plaintiff, a fact that he could not have known while being fully convinced that he was transacting with the defendant. Furthermore, the defendant’s liability to compensate the plaintiff is based on the provision of Article 8 of Law 2251/1994, which applies in this case, as the plaintiff's damage resulted from inadequate fulfillment of its obligations in the context of providing its services, but also on the provision of Article 914 of the Civil Code in the sense of omission on its part of unlawfully and culpably imposed actions. "
    "In this case, given that during the relevant period there had been a multitude of similar incidents of fraud against the defendant's customers, the latter, as a service provider to the consumer public and bearing transactional obligations of care and security towards them, displayed gross negligence regarding the security provided for electronic transaction services, which was compromised by the fraudulent theft of funds, as it did not comply with all required high-security measures for executing the contested transaction, failing to implement the strict customer identification verification process and to check the authenticity of the account to which the funds were sent, thus not assuming the suspicious nature of the transaction, did not adopt comprehensive and improved protective measures to fully protect its customers against malicious attacks and online fraud and to prevent the infiltration of unauthorized third parties, nor did it fulfill its obligations to inform, accurately inform, and warn its consumers - customers, as it failed to adequately inform them of attempts to steal their personal data through the sending of informative emails or SMS, while merely posting in a section rather than on a central banner (as it later did) does not constitute adequate information such that it meets the requirement of protecting its customers and the increased safeguarding of their interests. Although the plaintiff acted promptly and informed the defendant on the same day about the contested incident, the defendant did not act as promptly regarding the investigation of the incident and the freezing of the account that held the fraudulent credit to prevent the plaintiff's loss, but only returned part of the funds to the plaintiff a month later. "
    "This behavior, beyond being culpable due to gross negligence, was also unlawful, as it would have been illegal even without the contractual relationship, as contrary to the provisions of Law 4537/2018 and Law 2251/1994, regarding the lack of security of the services that the consumer is legitimately entitled to expect, as well as the building of trust that is essential in banking transactions, elements that it was obligated to provide within the sphere of the services offered, and contrary to the principles of good faith and commercial ethics, as crystallized in the provision of Article 288 of the Civil Code, as well as the general duty imposed by Article 914 of the Civil Code not to cause harm to another culpably. This resulted not only in positive damage to the plaintiff but also in causing him moral harm consisting of his mental distress and the disruption, agitation, and sorrow he experienced, for which he must be awarded financial compensation. Taking into account all the general circumstances of the case, the extent of the plaintiff's damage, the severity of the defendant's fault, the mental distress suffered by the plaintiff, the insecurity he felt regarding his deposits, the sorrow he experienced, and the stress caused by his financial loss, which occurred during the pandemic period when his earnings from his professional activity had significantly decreased, as well as the financial and social situation of the parties, it is the court's opinion that he should be granted, as financial compensation for his moral harm, an amount of €250.00, which is deemed reasonable and fair. Therefore, the total monetary amount that the plaintiff is entitled to for his positive damage and financial compensation for the moral harm suffered amounts to a total of (€703.18 + €250.00) = €953.18.",
    "Each supervisory authority not acting as the lead supervisory authority should be competent to handle local cases where the controller or processor is established in more than one Member State, but the subject matter of the specific processing concerns only processing carried out in a single Member State and involves only data subjects in that single Member State, for example, where the subject matter concerns the processing of employees' personal data in the specific employment context of a Member State. In such cases, the supervisory authority should inform the lead supervisory authority without delay about the matter. After being informed, the lead supervisory authority should decide, whether it will handle the case pursuant to the provision on cooperation between the lead supervisory authority and other supervisory authorities concerned (‘one-stop-shop mechanism’), or whether the supervisory authority which informed it should handle the case at local level. When deciding whether it will handle the case, the lead supervisory authority should take into account whether there is an establishment of the controller or processor in the Member State of the supervisory authority which informed it in order to ensure effective enforcement of a decision vis-à-vis the controller or processor. Where the lead supervisory authority decides to handle the case, the supervisory authority which informed it should have the 4.5.2016 L 119/23 Official Journal of the European Union EN   possibility to submit a draft for a decision, of which the lead supervisory authority should take utmost account when preparing its draft decision in that one-stop-shop mechanism."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

modernbert-embed-base

This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: nomic-ai/modernbert-embed-base
Maximum Sequence Length: 8192 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Language: en
License: apache-2.0

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
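
The Pooling (mean over tokens) and Normalize stages listed above can be illustrated without the library. The following is a minimal sketch in plain Python, assuming token vectors are lists of floats and the attention mask marks padding with 0. The helper names `mean_pool` and `l2_normalize` are hypothetical and are not part of sentence-transformers.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # avoid dividing by zero for an all-padding input
    return [value / count for value in summed]


def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: three token vectors of width 2; the last one is padding.
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)  # [0.6, 0.8]
```

The real model applies the same two steps to 768-dimensional token vectors; the toy width of 2 is only for readability.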

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IoannisKat1/modernbert-embed-base-new2")
# Run inference
sentences = [
    'Who may carry out the monitoring of compliance with a code of conduct according to Article 40?',
    '1.Without prejudice to the tasks and powers of the competent supervisory authority under Articles 57 and 58, the monitoring of compliance with a code of conduct pursuant to Article 40 may be carried out by a body which has an appropriate level of expertise in relation to the subject-matter of the code and is accredited for that purpose by the competent supervisory authority.\n2.A body as referred to in paragraph 1 may be accredited to monitor compliance with a code of conduct where that body has: (a)  demonstrated its independence and expertise in relation to the subject-matter of the code to the satisfaction of the competent supervisory authority; (b)  established procedures which allow it to assess the eligibility of controllers and processors concerned to apply the code, to monitor their compliance with its provisions and to periodically review its operation; (c)  established procedures and structures to handle complaints about infringements of the code or the manner in which the code has been, or is being, implemented by a controller or processor, and to make those procedures and structures transparent to data subjects and the public; and (d)  demonstrated to the satisfaction of the competent supervisory authority that its tasks and duties do not result in a conflict of interests.\n3.The competent supervisory authority shall submit the draft criteria for accreditation of a body as referred to in paragraph 1 of this Article to the Board pursuant to the consistency mechanism referred to in Article 63\n4.Without prejudice to the tasks and powers of the competent supervisory authority and the provisions of Chapter VIII, a body as referred to in paragraph 1 of this Article shall, subject to appropriate safeguards, take appropriate action in cases of infringement of the code by a controller or processor, including suspension or exclusion of the controller or processor concerned from the code. '
    'It shall inform the competent supervisory authority of such actions and the reasons for taking them.\n5.The competent supervisory authority shall revoke the accreditation of a body as referred to in paragraph 1 if the conditions for accreditation are not, or are no longer, met or where actions taken by the body infringe this Regulation.\n6.This Article shall not apply to processing carried out by public authorities and bodies.',
    'It should be ascertained whether all appropriate technological protection and organisational measures have been implemented to establish immediately whether a personal data breach has taken place and to inform promptly the supervisory authority and the data subject. The fact that the notification was made without undue delay should be established taking into account in particular the nature and gravity of the personal data breach and its consequences and adverse effects for the data subject. Such notification may result in an intervention of the supervisory authority in accordance with its tasks and powers laid down in this Regulation.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6245, 0.2334],
#         [0.6245, 1.0000, 0.3201],
#         [0.2334, 0.3201, 1.0000]])
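
For retrieval, one row of the similarity matrix is usually all that is needed: the row of the query, sorted by score. The sketch below works on the first row of the matrix printed above, treating sentence 0 as the query. `rank_passages` is a hypothetical helper, and in real use the row would presumably come from something like `similarities[0].tolist()`.

```python
def rank_passages(similarity_row, query_index=0):
    """Return (passage_index, score) pairs, best match first, skipping the query itself."""
    candidates = [
        (index, score)
        for index, score in enumerate(similarity_row)
        if index != query_index
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# First row of the similarity matrix shown in the usage example.
row = [1.0000, 0.6245, 0.2334]
ranking = rank_passages(row)
print(ranking)  # [(1, 0.6245), (2, 0.2334)]
```

Here the Article 41 text (index 1) ranks above the unrelated recital (index 2), which is the behaviour one would hope for given the query.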

Evaluation

Metrics

Information Retrieval

Dataset: dim_768
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 768
}
```

Metric	Value
cosine_accuracy@1	0.4722
cosine_accuracy@3	0.5177
cosine_accuracy@5	0.5606
cosine_accuracy@10	0.6263
cosine_precision@1	0.4722
cosine_precision@3	0.4571
cosine_precision@5	0.4192
cosine_precision@10	0.3687
cosine_recall@1	0.1074
cosine_recall@3	0.2661
cosine_recall@5	0.3441
cosine_recall@10	0.4659
cosine_ndcg@10	0.5402
cosine_mrr@10	0.5073
cosine_map@100	0.5999
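
The metric names in these tables follow the usual information-retrieval definitions. As a rough guide to reading them, here is a sketch for a single query with binary relevance; the reported values would then be averages over all evaluation queries. The function name `ir_metrics_at_k` and the document ids are hypothetical, and the evaluator's own implementation may differ in details.

```python
import math


def ir_metrics_at_k(ranked_ids, relevant_ids, k):
    """Compute accuracy, precision, recall, MRR and NDCG at cutoff k for one query."""
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in ranked_ids[:k]]
    num_relevant = max(len(relevant_ids), 1)  # guard against an empty relevance set

    # Reciprocal rank of the first relevant document inside the top k.
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # Discounted gain of the actual ranking versus an ideal ranking.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "accuracy": 1.0 if any(hits) else 0.0,  # at least one relevant doc in top k
        "precision": sum(hits) / k,
        "recall": sum(hits) / num_relevant,
        "mrr": mrr,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }


# Toy query: three relevant passages, one of which appears at rank 2.
metrics = ir_metrics_at_k(["d3", "d1", "d7", "d2"], {"d1", "d2", "d9"}, k=3)
print(metrics)
```

Under these definitions, recall@1 can be far below accuracy@1 whenever a query has several relevant passages, which would be consistent with the gap between those two rows above.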

Information Retrieval

Dataset: dim_512
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 512
}
```

Metric	Value
cosine_accuracy@1	0.4444
cosine_accuracy@3	0.4848
cosine_accuracy@5	0.5455
cosine_accuracy@10	0.6035
cosine_precision@1	0.4444
cosine_precision@3	0.4251
cosine_precision@5	0.3894
cosine_precision@10	0.3455
cosine_recall@1	0.107
cosine_recall@3	0.2625
cosine_recall@5	0.3362
cosine_recall@10	0.45
cosine_ndcg@10	0.5158
cosine_mrr@10	0.4801
cosine_map@100	0.585

Information Retrieval

Dataset: dim_256
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 256
}
```

Metric	Value
cosine_accuracy@1	0.4394
cosine_accuracy@3	0.4848
cosine_accuracy@5	0.5328
cosine_accuracy@10	0.596
cosine_precision@1	0.4394
cosine_precision@3	0.4242
cosine_precision@5	0.3919
cosine_precision@10	0.3424
cosine_recall@1	0.1027
cosine_recall@3	0.2525
cosine_recall@5	0.3299
cosine_recall@10	0.4382
cosine_ndcg@10	0.5075
cosine_mrr@10	0.475
cosine_map@100	0.5754

Information Retrieval

Dataset: dim_128
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 128
}
```

Metric	Value
cosine_accuracy@1	0.4091
cosine_accuracy@3	0.4495
cosine_accuracy@5	0.5025
cosine_accuracy@10	0.5606
cosine_precision@1	0.4091
cosine_precision@3	0.3872
cosine_precision@5	0.3591
cosine_precision@10	0.3124
cosine_recall@1	0.1018
cosine_recall@3	0.2438
cosine_recall@5	0.3185
cosine_recall@10	0.4262
cosine_ndcg@10	0.4788
cosine_mrr@10	0.4431
cosine_map@100	0.5381

Information Retrieval

Dataset: dim_64
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 64
}
```

Metric	Value
cosine_accuracy@1	0.3207
cosine_accuracy@3	0.3636
cosine_accuracy@5	0.4091
cosine_accuracy@10	0.4823
cosine_precision@1	0.3207
cosine_precision@3	0.3081
cosine_precision@5	0.2869
cosine_precision@10	0.2548
cosine_recall@1	0.0791
cosine_recall@3	0.1965
cosine_recall@5	0.2592
cosine_recall@10	0.3567
cosine_ndcg@10	0.389
cosine_mrr@10	0.3549
cosine_map@100	0.4498
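
Each table above corresponds to a different `truncate_dim`: only the first N components of the 768-dimensional embedding are kept before similarities are computed. A minimal sketch of that arithmetic follows, assuming embeddings are plain lists of floats. `truncate_and_normalize` is a hypothetical helper; recent sentence-transformers releases expose a `truncate_dim` option for the same purpose.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head


# Toy unit-length vector of width 4, shortened to width 2.
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_and_normalize(full, 2)
print(len(short), short)
```

Cosine similarity does not depend on vector length, so the re-normalisation step matters mainly when the shortened vectors are compared with a plain dot product.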

Training Details

Training Dataset

Unnamed Dataset

Size: 1,580 training samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:
	anchor	positive
type	string	string
details	min: 7 tokens mean: 15.21 tokens max: 35 tokens	min: 25 tokens mean: 648.23 tokens max: 2429 tokens

Samples:

anchor	positive
`What bodies or sources shall the Commission take into account?`	1.By 25 May 2020 and every four years thereafter, the Commission shall submit a report on the evaluation and review of this Regulation to the European Parliament and to the Council. The reports shall be made public. 2.In the context of the evaluations and reviews referred to in paragraph 1, the Commission shall examine, in particular, the application and functioning of: (a) Chapter V on the transfer of personal data to third countries or international organisations with particular regard to decisions adopted pursuant to Article 45(3) of this Regulation and decisions adopted on the basis of Article 25(6) of Directive 95/46/EC; (b) Chapter VII on cooperation and consistency. 3.For the purpose of paragraph 1, the Commission may request information from Member States and supervisory authorities. 4.In carrying out the evaluations and reviews referred to in paragraphs 1 and 2, the Commission shall take into account the positions and findings of the European Parliament, of the Council, and ...
`What enables researchers within social science to obtain essential knowledge about the long-term correlation of social conditions?`	By coupling information from registries, researchers can obtain new knowledge of great value with regard to widespread medical conditions such as cardiovascular disease, cancer and depression. On the basis of registries, research results can be enhanced, as they draw on a larger population. Within social science, research on the basis of registries enables researchers to obtain essential knowledge about the long-term correlation of a number of social conditions such as unemployment and education with other life conditions. Research results obtained through registries provide solid, high-quality knowledge which can provide the basis for the formulation and implementation of knowledge-based policy, improve the quality of life for a number of people and improve the efficiency of social services. In order to facilitate scientific research, personal data can be processed for scientific research purposes, subject to appropriate conditions and safeguards set out in Union or Member State law.
`What is the article that pertains to approving binding corporate rules?`	1.Each supervisory authority shall have all of the following investigative powers: (a) to order the controller and the processor, and, where applicable, the controller's or the processor's representative to provide any information it requires for the performance of its tasks; (b) to carry out investigations in the form of data protection audits; (c) to carry out a review on certifications issued pursuant to Article 42(7); (d) to notify the controller or the processor of an alleged infringement of this Regulation; (e) to obtain, from the controller and the processor, access to all personal data and to all information necessary for the performance of its tasks; (f) to obtain access to any premises of the controller and the processor, including to any data processing equipment and means, in accordance with Union or Member State procedural law. 2.Each supervisory authority shall have all of the following corrective powers: (a) to issue warnings to a controller or processor that inte...

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
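
To make the configuration above more concrete, the sketch below applies an in-batch-negatives ranking loss at several truncation widths and sums the weighted results, which is the general idea behind wrapping MultipleNegativesRankingLoss in MatryoshkaLoss. It is an illustration in plain Python, not the library's code. The similarity scale of 20.0 is an assumption (the card does not state it), the function names are hypothetical, and the toy vectors have width 4 rather than 768.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy where anchor i should score positive i above all other positives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss computed on each truncated width."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        short_anchors = [a[:dim] for a in anchors]
        short_positives = [p[:dim] for p in positives]
        total += weight * mnr_loss(short_anchors, short_positives)
    return total


anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
aligned = matryoshka_loss(anchors, anchors, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, anchors[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

When every anchor matches its own positive the loss is close to zero at every width; swapping the positives makes it large, because each width is penalised separately and the penalties add up.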

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: epoch
gradient_accumulation_steps: 4
learning_rate: 3e-05
num_train_epochs: 20
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates
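
A few of these settings interact. With a per-device batch size of 8 (listed under All Hyperparameters) and 4 gradient-accumulation steps, each optimizer update sees 32 pairs, and the learning rate follows a cosine curve after a warm-up over the first 10% of updates. The sketch below reproduces that arithmetic, assuming a single device and that the last partial batch is kept. The `total_steps` value is a hypothetical placeholder, and `cosine_lr` mirrors the common linear-warm-up-then-cosine schedule rather than quoting the trainer's code.

```python
import math

TRAIN_SAMPLES = 1580      # "Size" reported for the training dataset
PER_DEVICE_BATCH = 8      # per_device_train_batch_size
GRAD_ACCUM = 4            # gradient_accumulation_steps
LEARNING_RATE = 3e-05     # learning_rate
WARMUP_RATIO = 0.1        # warmup_ratio

# Assumption: one device, last partial batch kept (dataloader_drop_last is False).
micro_batches_per_epoch = math.ceil(TRAIN_SAMPLES / PER_DEVICE_BATCH)
effective_batch_size = PER_DEVICE_BATCH * GRAD_ACCUM


def epoch_at_step(step):
    """Fraction of training epochs completed after `step` optimizer updates."""
    return step * GRAD_ACCUM / micro_batches_per_epoch


def cosine_lr(step, total_steps):
    """Linear warm-up to LEARNING_RATE, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size)              # 32
print(round(epoch_at_step(10), 4))       # 0.202
print(cosine_lr(50, total_steps=1000))   # halfway through a hypothetical warm-up
```

Under these assumptions the epoch fractions for steps 10, 20, 30 and 40 come out as 0.2020, 0.4040, 0.6061 and 0.8081, which agrees with the Epoch column in the training logs.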

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 4
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 3e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 20
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}
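
With warmup_steps set to 0 and warmup_ratio set to 0.1, the number of warmup steps is derived from the total number of training steps. The sketch below mirrors the rule Transformers applies in TrainingArguments.get_warmup_steps (an explicit warmup_steps wins, otherwise the ratio is used and rounded up). The helper name is hypothetical, and the total of 650 steps is an assumption taken from the last logged step in the training logs.

```python
import math


def effective_warmup_steps(total_steps: int, warmup_steps: int, warmup_ratio: float) -> int:
    """Return the warmup length in optimizer steps.

    An explicit warmup_steps > 0 takes precedence; otherwise the ratio is
    applied to the total step count and rounded up.
    """
    if warmup_steps > 0:
        return warmup_steps
    return math.ceil(total_steps * warmup_ratio)


# Assumption: 650 total optimizer steps, read off the last logged step.
TOTAL_STEPS = 650
warmup = effective_warmup_steps(TOTAL_STEPS, warmup_steps=0, warmup_ratio=0.1)
print(warmup)
```

Under these assumptions the learning rate warms up over 65 steps, which is roughly the first 1.3 epochs at about 50 steps per epoch.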

Training Logs

Epoch	Step	Training Loss	dim_768_cosine_ndcg@10	dim_512_cosine_ndcg@10	dim_256_cosine_ndcg@10	dim_128_cosine_ndcg@10	dim_64_cosine_ndcg@10
-1	-1	-	0.3515	0.3509	0.3285	0.3016	0.2617
0.2020	10	20.9258	-	-	-	-	-
0.4040	20	20.6577	-	-	-	-	-
0.6061	30	20.6479	-	-	-	-	-
0.8081	40	21.0398	-	-	-	-	-
1.0	50	20.2131	0.3647	0.3809	0.3475	0.3206	0.2865
1.2020	60	19.2345	-	-	-	-	-
1.4040	70	18.6065	-	-	-	-	-
1.6061	80	16.8382	-	-	-	-	-
1.8081	90	17.4581	-	-	-	-	-
2.0	100	16.8996	0.4571	0.4535	0.4513	0.4101	0.3576
2.2020	110	17.4694	-	-	-	-	-
2.4040	120	14.7442	-	-	-	-	-
2.6061	130	12.601	-	-	-	-	-
2.8081	140	13.037	-	-	-	-	-
3.0	150	13.0811	0.4993	0.5003	0.4866	0.4555	0.3709
3.2020	160	11.8374	-	-	-	-	-
3.4040	170	12.5389	-	-	-	-	-
3.6061	180	14.3829	-	-	-	-	-
3.8081	190	13.8871	-	-	-	-	-
4.0	200	10.3684	0.5054	0.5020	0.4947	0.4597	0.3739
4.2020	210	12.6792	-	-	-	-	-
4.4040	220	10.6044	-	-	-	-	-
4.6061	230	12.015	-	-	-	-	-
4.8081	240	10.7804	-	-	-	-	-
5.0	250	9.439	0.5190	0.5098	0.5063	0.4589	0.3753
5.2020	260	10.8849	-	-	-	-	-
5.4040	270	11.2237	-	-	-	-	-
5.6061	280	9.7149	-	-	-	-	-
5.8081	290	10.5259	-	-	-	-	-
6.0	300	9.1578	0.5227	0.5169	0.5062	0.4667	0.3777
6.2020	310	10.6102	-	-	-	-	-
6.4040	320	10.1176	-	-	-	-	-
6.6061	330	8.3092	-	-	-	-	-
6.8081	340	9.5087	-	-	-	-	-
7.0	350	11.525	0.5252	0.5144	0.5092	0.4747	0.3706
7.2020	360	10.3263	-	-	-	-	-
7.4040	370	9.7615	-	-	-	-	-
7.6061	380	9.1261	-	-	-	-	-
7.8081	390	9.6996	-	-	-	-	-
8.0	400	8.4646	0.5324	0.5158	0.5082	0.4759	0.3719
8.2020	410	9.6561	-	-	-	-	-
8.4040	420	9.504	-	-	-	-	-
8.6061	430	7.4925	-	-	-	-	-
8.8081	440	8.749	-	-	-	-	-
9.0	450	9.5831	0.5282	0.5215	0.5038	0.4741	0.3721
9.2020	460	8.5261	-	-	-	-	-
9.4040	470	9.2267	-	-	-	-	-
9.6061	480	8.3529	-	-	-	-	-
9.8081	490	8.391	-	-	-	-	-
10.0	500	9.2313	0.5374	0.5219	0.5093	0.4768	0.3749
10.2020	510	10.6238	-	-	-	-	-
10.4040	520	8.9972	-	-	-	-	-
10.6061	530	8.0452	-	-	-	-	-
10.8081	540	8.2937	-	-	-	-	-
11.0	550	8.0842	0.5402	0.5158	0.5075	0.4788	0.3890
11.2020	560	7.9855	-	-	-	-	-
11.4040	570	9.1783	-	-	-	-	-
11.6061	580	8.5681	-	-	-	-	-
11.8081	590	9.0004	-	-	-	-	-
12.0	600	7.8016	0.5402	0.5199	0.5078	0.4745	0.3836
12.2020	610	8.1169	-	-	-	-	-
12.4040	620	8.7016	-	-	-	-	-
12.6061	630	8.6899	-	-	-	-	-
12.8081	640	8.1782	-	-	-	-	-
13.0	650	7.8024	0.5361	0.5178	0.5065	0.4751	0.3864
-1	-1	-	0.5402	0.5158	0.5075	0.4788	0.3890

The saved checkpoint is the row at epoch 11.0 (step 550); its scores match the final evaluation row.
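
The final evaluation row (the last row with epoch -1) can be compared against the per-epoch rows to see which checkpoint it corresponds to, and which step scored best at each embedding dimension. A minimal sketch using only the NDCG@10 values from the table above; the function and variable names are illustrative and not part of any library.

```python
# Per-epoch evaluation rows copied from the training log:
# step -> NDCG@10 for dims (768, 512, 256, 128, 64).
DIMS = (768, 512, 256, 128, 64)
EVAL_ROWS = {
    50: (0.3647, 0.3809, 0.3475, 0.3206, 0.2865),
    100: (0.4571, 0.4535, 0.4513, 0.4101, 0.3576),
    150: (0.4993, 0.5003, 0.4866, 0.4555, 0.3709),
    200: (0.5054, 0.5020, 0.4947, 0.4597, 0.3739),
    250: (0.5190, 0.5098, 0.5063, 0.4589, 0.3753),
    300: (0.5227, 0.5169, 0.5062, 0.4667, 0.3777),
    350: (0.5252, 0.5144, 0.5092, 0.4747, 0.3706),
    400: (0.5324, 0.5158, 0.5082, 0.4759, 0.3719),
    450: (0.5282, 0.5215, 0.5038, 0.4741, 0.3721),
    500: (0.5374, 0.5219, 0.5093, 0.4768, 0.3749),
    550: (0.5402, 0.5158, 0.5075, 0.4788, 0.3890),
    600: (0.5402, 0.5199, 0.5078, 0.4745, 0.3836),
    650: (0.5361, 0.5178, 0.5065, 0.4751, 0.3864),
}
# Final evaluation row (epoch -1 at the end of the log).
FINAL_ROW = (0.5402, 0.5158, 0.5075, 0.4788, 0.3890)


def matching_steps(rows, final):
    """Steps whose scores equal the final evaluation on every dimension."""
    return [step for step, scores in rows.items() if scores == final]


def best_step_per_dim(rows, dims):
    """Earliest step with the highest NDCG@10 for each dimension."""
    return {dim: max(rows, key=lambda s: rows[s][i]) for i, dim in enumerate(dims)}


print(matching_steps(EVAL_ROWS, FINAL_ROW))
print(best_step_per_dim(EVAL_ROWS, DIMS))
```

Only step 550 matches the final row on all five dimensions. Steps 550 and 600 tie at dimension 768, so a single column is not enough to identify the checkpoint.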

Framework Versions

Python: 3.12.11
Sentence Transformers: 5.1.0
Transformers: 4.51.3
PyTorch: 2.8.0+cu126
Accelerate: 1.10.1
Datasets: 4.0.0
Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Downloads last month: 3

Safetensors

Model size: 0.1B params
Tensor type: BF16

Model tree for IoannisKat1/modernbert-embed-base-new2

Base model: answerdotai/ModernBERT-base
Finetuned: nomic-ai/modernbert-embed-base
Finetuned (111): this model

Evaluation results

Metric (self-reported)	Value
Cosine Accuracy@1 on dim 768	0.472
Cosine Accuracy@3 on dim 768	0.518
Cosine Accuracy@5 on dim 768	0.561
Cosine Accuracy@10 on dim 768	0.626
Cosine Precision@1 on dim 768	0.472
Cosine Precision@3 on dim 768	0.457
Cosine Precision@5 on dim 768	0.419
Cosine Precision@10 on dim 768	0.369