modernbert-embed-base
This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: nomic-ai/modernbert-embed-base
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
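The three modules listed above amount to: run the transformer, average the token vectors (mean pooling), then scale each sentence vector to unit length. The NumPy sketch below illustrates only the pooling and normalization steps on random stand-in data. The function name `mean_pool_and_normalize` and the toy shapes are assumptions made for illustration, not part of the library.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mask-aware mean pooling followed by L2 normalization.

    token_embeddings: (batch, seq_len, dim) array of per-token vectors
    attention_mask:   (batch, seq_len) array, 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)    # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)    # avoid division by zero
    pooled = summed / counts                          # mean pooling
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)       # unit-length vectors

# Toy input: 2 sequences, 3 token positions, 4 dimensions (the real model uses 768).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 3, 4))
mask = np.array([[1, 1, 1], [1, 1, 0]])  # second sequence ends with one padding token
sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)
print(np.linalg.norm(sentence_vectors, axis=1))
```

Because the output vectors have unit length, their dot product equals their cosine similarity, which is consistent with the similarity function listed in the model description.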
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
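from sentence_transformers import SentenceTransformer

# Load the model (replace the placeholder with the actual model ID)
model = SentenceTransformer("sentence_transformers_model_id")

# Texts to embed: one query followed by two passages
sentences = [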
'Who may carry out the monitoring of compliance with a code of conduct according to Article 40?',
'1.Without prejudice to the tasks and powers of the competent supervisory authority under Articles 57 and 58, the monitoring of compliance with a code of conduct pursuant to Article 40 may be carried out by a body which has an appropriate level of expertise in relation to the subject-matter of the code and is accredited for that purpose by the competent supervisory authority.\n2.A body as referred to in paragraph 1 may be accredited to monitor compliance with a code of conduct where that body has: (a) demonstrated its independence and expertise in relation to the subject-matter of the code to the satisfaction of the competent supervisory authority; (b) established procedures which allow it to assess the eligibility of controllers and processors concerned to apply the code, to monitor their compliance with its provisions and to periodically review its operation; (c) established procedures and structures to handle complaints about infringements of the code or the manner in which the code has been, or is being, implemented by a controller or processor, and to make those procedures and structures transparent to data subjects and the public; and (d) demonstrated to the satisfaction of the competent supervisory authority that its tasks and duties do not result in a conflict of interests.\n3.The competent supervisory authority shall submit the draft criteria for accreditation of a body as referred to in paragraph 1 of this Article to the Board pursuant to the consistency mechanism referred to in Article 63\n4.Without prejudice to the tasks and powers of the competent supervisory authority and the provisions of Chapter VIII, a body as referred to in paragraph 1 of this Article shall, subject to appropriate safeguards, take appropriate action in cases of infringement of the code by a controller or processor, including suspension or exclusion of the controller or processor concerned from the code. It shall inform the competent supervisory authority of such actions and the reasons for taking them.\n5.The competent supervisory authority shall revoke the accreditation of a body as referred to in paragraph 1 if the conditions for accreditation are not, or are no longer, met or where actions taken by the body infringe this Regulation.\n6.This Article shall not apply to processing carried out by public authorities and bodies.',
'It should be ascertained whether all appropriate technological protection and organisational measures have been implemented to establish immediately whether a personal data breach has taken place and to inform promptly the supervisory authority and the data subject. The fact that the notification was made without undue delay should be established taking into account in particular the nature and gravity of the personal data breach and its consequences and adverse effects for the data subject. Such notification may result in an intervention of the supervisory authority in accordance with its tasks and powers laid down in this Regulation.',
]
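# Compute one embedding per input text
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Pairwise cosine similarity between all embeddings (a 3 x 3 matrix)
similarities = model.similarity(embeddings, embeddings)
print(similarities)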
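Since the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128 and 64, the leading components of each embedding should remain usable on their own, at some cost in retrieval quality (see the Evaluation section). The sketch below shows one way to truncate and re-normalize embeddings and then rank passages against a query. Random vectors stand in for `model.encode(...)` output so the snippet runs without downloading a model, and the helper names `truncate_and_normalize` and `rank_by_cosine` are hypothetical.

```python
import numpy as np

def truncate_and_normalize(embeddings, dim):
    """Keep the first `dim` components and rescale each row to unit length."""
    truncated = np.asarray(embeddings, dtype=np.float64)[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

def rank_by_cosine(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity.

    Assumes all vectors are already unit length, so a dot product is the cosine.
    """
    scores = doc_vecs @ query_vec
    order = np.argsort(-scores)
    return order, scores[order]

# Stand-in for `model.encode(sentences)`: 1 query + 2 passages, 768 dimensions.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(3, 768))

small = truncate_and_normalize(fake_embeddings, 256)
order, scores = rank_by_cosine(small[0], small[1:])
print(small.shape)
print(order, scores)
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be the more convenient route in practice.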
Evaluation
Metrics
Information Retrieval
- Evaluated at 768 dimensions (cosine_ndcg@10 matches dim_768_cosine_ndcg@10 in Training Logs)

| Metric | Value |
|:--|:--|
| cosine_accuracy@1 | 0.4722 |
| cosine_accuracy@3 | 0.5177 |
| cosine_accuracy@5 | 0.5606 |
| cosine_accuracy@10 | 0.6263 |
| cosine_precision@1 | 0.4722 |
| cosine_precision@3 | 0.4571 |
| cosine_precision@5 | 0.4192 |
| cosine_precision@10 | 0.3687 |
| cosine_recall@1 | 0.1074 |
| cosine_recall@3 | 0.2661 |
| cosine_recall@5 | 0.3441 |
| cosine_recall@10 | 0.4659 |
| cosine_ndcg@10 | 0.5402 |
| cosine_mrr@10 | 0.5073 |
| cosine_map@100 | 0.5999 |
Information Retrieval
- Evaluated at 512 dimensions (cosine_ndcg@10 matches dim_512_cosine_ndcg@10 in Training Logs)

| Metric | Value |
|:--|:--|
| cosine_accuracy@1 | 0.4444 |
| cosine_accuracy@3 | 0.4848 |
| cosine_accuracy@5 | 0.5455 |
| cosine_accuracy@10 | 0.6035 |
| cosine_precision@1 | 0.4444 |
| cosine_precision@3 | 0.4251 |
| cosine_precision@5 | 0.3894 |
| cosine_precision@10 | 0.3455 |
| cosine_recall@1 | 0.107 |
| cosine_recall@3 | 0.2625 |
| cosine_recall@5 | 0.3362 |
| cosine_recall@10 | 0.45 |
| cosine_ndcg@10 | 0.5158 |
| cosine_mrr@10 | 0.4801 |
| cosine_map@100 | 0.585 |
Information Retrieval
- Evaluated at 256 dimensions (cosine_ndcg@10 matches dim_256_cosine_ndcg@10 in Training Logs)

| Metric | Value |
|:--|:--|
| cosine_accuracy@1 | 0.4394 |
| cosine_accuracy@3 | 0.4848 |
| cosine_accuracy@5 | 0.5328 |
| cosine_accuracy@10 | 0.596 |
| cosine_precision@1 | 0.4394 |
| cosine_precision@3 | 0.4242 |
| cosine_precision@5 | 0.3919 |
| cosine_precision@10 | 0.3424 |
| cosine_recall@1 | 0.1027 |
| cosine_recall@3 | 0.2525 |
| cosine_recall@5 | 0.3299 |
| cosine_recall@10 | 0.4382 |
| cosine_ndcg@10 | 0.5075 |
| cosine_mrr@10 | 0.475 |
| cosine_map@100 | 0.5754 |
Information Retrieval
- Evaluated at 128 dimensions (cosine_ndcg@10 matches dim_128_cosine_ndcg@10 in Training Logs)

| Metric | Value |
|:--|:--|
| cosine_accuracy@1 | 0.4091 |
| cosine_accuracy@3 | 0.4495 |
| cosine_accuracy@5 | 0.5025 |
| cosine_accuracy@10 | 0.5606 |
| cosine_precision@1 | 0.4091 |
| cosine_precision@3 | 0.3872 |
| cosine_precision@5 | 0.3591 |
| cosine_precision@10 | 0.3124 |
| cosine_recall@1 | 0.1018 |
| cosine_recall@3 | 0.2438 |
| cosine_recall@5 | 0.3185 |
| cosine_recall@10 | 0.4262 |
| cosine_ndcg@10 | 0.4788 |
| cosine_mrr@10 | 0.4431 |
| cosine_map@100 | 0.5381 |
Information Retrieval
- Evaluated at 64 dimensions (cosine_ndcg@10 matches dim_64_cosine_ndcg@10 in Training Logs)

| Metric | Value |
|:--|:--|
| cosine_accuracy@1 | 0.3207 |
| cosine_accuracy@3 | 0.3636 |
| cosine_accuracy@5 | 0.4091 |
| cosine_accuracy@10 | 0.4823 |
| cosine_precision@1 | 0.3207 |
| cosine_precision@3 | 0.3081 |
| cosine_precision@5 | 0.2869 |
| cosine_precision@10 | 0.2548 |
| cosine_recall@1 | 0.0791 |
| cosine_recall@3 | 0.1965 |
| cosine_recall@5 | 0.2592 |
| cosine_recall@10 | 0.3567 |
| cosine_ndcg@10 | 0.389 |
| cosine_mrr@10 | 0.3549 |
| cosine_map@100 | 0.4498 |
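For readers unfamiliar with the metric names above, the following sketch computes accuracy@k, precision@k, recall@k, MRR@k and NDCG@k for a single query under binary relevance. It is a simplified illustration of the usual definitions, not the evaluator code used to produce the tables. The function name `ir_metrics_at_k` and the document IDs are made up for the example.

```python
import math

def ir_metrics_at_k(ranked_ids, relevant_ids, k):
    """Per-query retrieval metrics at cutoff k, assuming binary relevance."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    accuracy = 1.0 if any(hits) else 0.0        # at least one relevant doc in top k
    precision = sum(hits) / k                   # share of the top k that is relevant
    recall = sum(hits) / len(relevant_ids)      # share of relevant docs retrieved

    mrr = 0.0                                   # reciprocal rank of the first hit
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mrr": mrr, "ndcg": ndcg}

# Hypothetical query with four relevant passages, one of which is ranked second.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4", "d5"}
metrics = ir_metrics_at_k(ranked, relevant, k=3)
print(metrics)
```

The example also shows why recall@1 can sit far below accuracy@1, as it does in the tables: when a query has several relevant passages, a single top-ranked hit counts fully toward accuracy but only fractionally toward recall.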
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,580 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:

| | anchor | positive |
|:--|:--|:--|
| type | string | string |
| details | min: 7 tokens; mean: 15.21 tokens; max: 35 tokens | min: 25 tokens; mean: 648.23 tokens; max: 2429 tokens |
- Samples:

| anchor | positive |
|:--|:--|
| What bodies or sources shall the Commission take into account? | 1.By 25 May 2020 and every four years thereafter, the Commission shall submit a report on the evaluation and review of this Regulation to the European Parliament and to the Council. The reports shall be made public. 2.In the context of the evaluations and reviews referred to in paragraph 1, the Commission shall examine, in particular, the application and functioning of: (a) Chapter V on the transfer of personal data to third countries or international organisations with particular regard to decisions adopted pursuant to Article 45(3) of this Regulation and decisions adopted on the basis of Article 25(6) of Directive 95/46/EC; (b) Chapter VII on cooperation and consistency. 3.For the purpose of paragraph 1, the Commission may request information from Member States and supervisory authorities. 4.In carrying out the evaluations and reviews referred to in paragraphs 1 and 2, the Commission shall take into account the positions and findings of the European Parliament, of the Council, and ... |
| What enables researchers within social science to obtain essential knowledge about the long-term correlation of social conditions? | By coupling information from registries, researchers can obtain new knowledge of great value with regard to widespread medical conditions such as cardiovascular disease, cancer and depression. On the basis of registries, research results can be enhanced, as they draw on a larger population. Within social science, research on the basis of registries enables researchers to obtain essential knowledge about the long-term correlation of a number of social conditions such as unemployment and education with other life conditions. Research results obtained through registries provide solid, high-quality knowledge which can provide the basis for the formulation and implementation of knowledge-based policy, improve the quality of life for a number of people and improve the efficiency of social services. In order to facilitate scientific research, personal data can be processed for scientific research purposes, subject to appropriate conditions and safeguards set out in Union or Member State law. |
| What is the article that pertains to approving binding corporate rules? | 1.Each supervisory authority shall have all of the following investigative powers: (a) to order the controller and the processor, and, where applicable, the controller's or the processor's representative to provide any information it requires for the performance of its tasks; (b) to carry out investigations in the form of data protection audits; (c) to carry out a review on certifications issued pursuant to Article 42(7); (d) to notify the controller or the processor of an alleged infringement of this Regulation; (e) to obtain, from the controller and the processor, access to all personal data and to all information necessary for the performance of its tasks; (f) to obtain access to any premises of the controller and the processor, including to any data processing equipment and means, in accordance with Union or Member State procedural law. 2.Each supervisory authority shall have all of the following corrective powers: (a) to issue warnings to a controller or processor that inte... |
- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
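To make the loss configuration concrete, here is a minimal NumPy sketch of the idea: an in-batch-negatives ranking loss (each anchor should score its own positive higher than every other positive in the batch) evaluated on the first 768, 512, 256, 128 and 64 embedding components and summed with equal weights. This is an illustration of the technique, not the library implementation. The similarity scale of 20 and the helper names are assumptions, and random vectors stand in for model outputs.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives ranking loss: cross-entropy over scaled cosine scores.

    Row i of `positives` is the correct match for row i of `anchors`;
    all other rows in the batch act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                              # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the base loss over truncated embedding prefixes."""
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))

# Stand-in embeddings: positives are noisy copies of their anchors.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))

loss_matched = matryoshka_loss(anchors, positives)
loss_shuffled = matryoshka_loss(anchors, np.roll(positives, 1, axis=0))
print(loss_matched, loss_shuffled)
```

Correctly paired rows give a much lower loss than misaligned rows, which is the signal that pushes each truncated prefix toward being a usable embedding on its own.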
Training Hyperparameters
Non-Default Hyperparameters
eval_strategy: epoch
gradient_accumulation_steps: 4
learning_rate: 3e-05
num_train_epochs: 20
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates
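The arithmetic implied by these settings can be worked through in a few lines. The sketch below combines them with the dataset size (1,580 samples) and the per-device batch size of 8 listed under All Hyperparameters. It assumes a single device and that the no_duplicates sampler does not change the number of batches, so the total step count is approximate. The schedule function follows the common linear-warmup-then-cosine-decay shape and is an illustration, not the trainer's own code.

```python
import math

TRAIN_SAMPLES = 1580      # from "Training Dataset"
PER_DEVICE_BATCH = 8      # per_device_train_batch_size
GRAD_ACCUM = 4            # gradient_accumulation_steps
EPOCHS = 20               # num_train_epochs
PEAK_LR = 3e-5            # learning_rate
WARMUP_RATIO = 0.1        # warmup_ratio

# Assumption: one device, so the effective batch is batch size x accumulation.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
batches_per_epoch = math.ceil(TRAIN_SAMPLES / PER_DEVICE_BATCH)
updates_per_epoch = math.ceil(batches_per_epoch / GRAD_ACCUM)
total_steps = updates_per_epoch * EPOCHS          # approximate
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)

def epoch_at_step(step):
    """Fraction of epochs completed after `step` optimizer updates."""
    return step * GRAD_ACCUM / batches_per_epoch

def lr_at_step(step):
    """Linear warmup to PEAK_LR, then cosine decay toward zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch, batches_per_epoch, updates_per_epoch, warmup_steps)
print(round(epoch_at_step(10), 4), lr_at_step(warmup_steps))
```

Under these assumptions, ten optimizer steps correspond to about 0.2020 epochs, which agrees with the Epoch column in Training Logs.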
All Hyperparameters
overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 4
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 3e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 20
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:--|:--|:--|:--|:--|:--|:--|:--|
| -1 | -1 | - | 0.3515 | 0.3509 | 0.3285 | 0.3016 | 0.2617 |
| 0.2020 | 10 | 20.9258 | - | - | - | - | - |
| 0.4040 | 20 | 20.6577 | - | - | - | - | - |
| 0.6061 | 30 | 20.6479 | - | - | - | - | - |
| 0.8081 | 40 | 21.0398 | - | - | - | - | - |
| 1.0 | 50 | 20.2131 | 0.3647 | 0.3809 | 0.3475 | 0.3206 | 0.2865 |
| 1.2020 | 60 | 19.2345 | - | - | - | - | - |
| 1.4040 | 70 | 18.6065 | - | - | - | - | - |
| 1.6061 | 80 | 16.8382 | - | - | - | - | - |
| 1.8081 | 90 | 17.4581 | - | - | - | - | - |
| 2.0 | 100 | 16.8996 | 0.4571 | 0.4535 | 0.4513 | 0.4101 | 0.3576 |
| 2.2020 | 110 | 17.4694 | - | - | - | - | - |
| 2.4040 | 120 | 14.7442 | - | - | - | - | - |
| 2.6061 | 130 | 12.601 | - | - | - | - | - |
| 2.8081 | 140 | 13.037 | - | - | - | - | - |
| 3.0 | 150 | 13.0811 | 0.4993 | 0.5003 | 0.4866 | 0.4555 | 0.3709 |
| 3.2020 | 160 | 11.8374 | - | - | - | - | - |
| 3.4040 | 170 | 12.5389 | - | - | - | - | - |
| 3.6061 | 180 | 14.3829 | - | - | - | - | - |
| 3.8081 | 190 | 13.8871 | - | - | - | - | - |
| 4.0 | 200 | 10.3684 | 0.5054 | 0.5020 | 0.4947 | 0.4597 | 0.3739 |
| 4.2020 | 210 | 12.6792 | - | - | - | - | - |
| 4.4040 | 220 | 10.6044 | - | - | - | - | - |
| 4.6061 | 230 | 12.015 | - | - | - | - | - |
| 4.8081 | 240 | 10.7804 | - | - | - | - | - |
| 5.0 | 250 | 9.439 | 0.5190 | 0.5098 | 0.5063 | 0.4589 | 0.3753 |
| 5.2020 | 260 | 10.8849 | - | - | - | - | - |
| 5.4040 | 270 | 11.2237 | - | - | - | - | - |
| 5.6061 | 280 | 9.7149 | - | - | - | - | - |
| 5.8081 | 290 | 10.5259 | - | - | - | - | - |
| 6.0 | 300 | 9.1578 | 0.5227 | 0.5169 | 0.5062 | 0.4667 | 0.3777 |
| 6.2020 | 310 | 10.6102 | - | - | - | - | - |
| 6.4040 | 320 | 10.1176 | - | - | - | - | - |
| 6.6061 | 330 | 8.3092 | - | - | - | - | - |
| 6.8081 | 340 | 9.5087 | - | - | - | - | - |
| 7.0 | 350 | 11.525 | 0.5252 | 0.5144 | 0.5092 | 0.4747 | 0.3706 |
| 7.2020 | 360 | 10.3263 | - | - | - | - | - |
| 7.4040 | 370 | 9.7615 | - | - | - | - | - |
| 7.6061 | 380 | 9.1261 | - | - | - | - | - |
| 7.8081 | 390 | 9.6996 | - | - | - | - | - |
| 8.0 | 400 | 8.4646 | 0.5324 | 0.5158 | 0.5082 | 0.4759 | 0.3719 |
| 8.2020 | 410 | 9.6561 | - | - | - | - | - |
| 8.4040 | 420 | 9.504 | - | - | - | - | - |
| 8.6061 | 430 | 7.4925 | - | - | - | - | - |
| 8.8081 | 440 | 8.749 | - | - | - | - | - |
| 9.0 | 450 | 9.5831 | 0.5282 | 0.5215 | 0.5038 | 0.4741 | 0.3721 |
| 9.2020 | 460 | 8.5261 | - | - | - | - | - |
| 9.4040 | 470 | 9.2267 | - | - | - | - | - |
| 9.6061 | 480 | 8.3529 | - | - | - | - | - |
| 9.8081 | 490 | 8.391 | - | - | - | - | - |
| 10.0 | 500 | 9.2313 | 0.5374 | 0.5219 | 0.5093 | 0.4768 | 0.3749 |
| 10.2020 | 510 | 10.6238 | - | - | - | - | - |
| 10.4040 | 520 | 8.9972 | - | - | - | - | - |
| 10.6061 | 530 | 8.0452 | - | - | - | - | - |
| 10.8081 | 540 | 8.2937 | - | - | - | - | - |
| **11.0** | **550** | **8.0842** | **0.5402** | **0.5158** | **0.5075** | **0.4788** | **0.389** |
| 11.2020 | 560 | 7.9855 | - | - | - | - | - |
| 11.4040 | 570 | 9.1783 | - | - | - | - | - |
| 11.6061 | 580 | 8.5681 | - | - | - | - | - |
| 11.8081 | 590 | 9.0004 | - | - | - | - | - |
| 12.0 | 600 | 7.8016 | 0.5402 | 0.5199 | 0.5078 | 0.4745 | 0.3836 |
| 12.2020 | 610 | 8.1169 | - | - | - | - | - |
| 12.4040 | 620 | 8.7016 | - | - | - | - | - |
| 12.6061 | 630 | 8.6899 | - | - | - | - | - |
| 12.8081 | 640 | 8.1782 | - | - | - | - | - |
| 13.0 | 650 | 7.8024 | 0.5361 | 0.5178 | 0.5065 | 0.4751 | 0.3864 |
| -1 | -1 | - | 0.5402 | 0.5158 | 0.5075 | 0.4788 | 0.3890 |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.1.0
- Transformers: 4.51.3
- PyTorch: 2.8.0+cu126
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.21.4
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}