SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
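
The Pooling and Normalize modules listed above can be illustrated in plain Python. This is only a minimal sketch, not the library's implementation: `mean_pool` and `l2_normalize` are hypothetical helper names, and the toy 4-dimensional token vectors stand in for the 768-dimensional MPNet token outputs.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy input: two real tokens and one padding position (made-up values).
tokens = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding, excluded by the mask
]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # mean over the two unmasked tokens
embedding = l2_normalize(pooled)      # unit-length sentence embedding
print(pooled, embedding)
```

Because the final module normalizes every embedding to unit length, cosine similarity between two embeddings reduces to a dot product.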

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("tien314/miriad-embedding")
# Run inference
sentences = [
    'What are the molecular mechanisms involved in the synergistic induction of SAA by IL-1, TNF-α, and IL-6?\n',
    'The complex formation of STAT3, NF-κB p65, and p300 is involved in the transcriptional activity of the SAA1 gene. STAT3 and p300 are recruited to the SAA1 promoter region in response to IL-6 or IL-1β + IL-6 stimulation. Co-expression of wild type p300 with wild type STAT3 enhances the luciferase activity of the SAA1 gene in a dose-dependent manner. This suggests that the heteromeric complex formation of STAT3, NF-κB p65, and p300 contributes to the transcriptional activity of the SAA1 gene.',
    'Phenotypic screens of approved drug collections and synergistic combinations can be a useful approach for rapid identification of new therapeutics for drug-resistant bacteria. This approach can also be applied to emerging outbreaks of infectious diseases where vaccines and therapeutic agents are unavailable or unrealistic to develop in a short period of time. By screening existing drugs and combinations, new therapeutics can be identified and potentially repurposed for the treatment of drug-resistant infections.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7925, 0.1356],
#         [0.7925, 1.0000, 0.1694],
#         [0.1356, 0.1694, 1.0000]])
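
Since the model was trained with MatryoshkaLoss at 768, 512, 128, and 64 dimensions, the leading dimensions of an embedding can be used on their own to trade accuracy for storage and speed. The sketch below shows the idea with made-up 4-dimensional vectors: keep the first `dim` values, renormalize, and compare. The helper names are hypothetical. In Sentence Transformers itself, the `truncate_dim` argument of `SentenceTransformer` is the usual way to request truncated embeddings.

```python
import math


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` values and rescale them to unit length."""
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up unit vectors standing in for real 768-dimensional embeddings.
query = [0.5, 0.5, 0.5, 0.5]
doc = [0.5, 0.5, -0.5, 0.5]

full_score = cosine_similarity(query, doc)
short_score = cosine_similarity(
    truncate_and_normalize(query, 2),
    truncate_and_normalize(doc, 2),
)
print(f"full: {full_score:.4f}, truncated to 2 dims: {short_score:.4f}")
```

Scores from truncated embeddings generally differ from full-size scores, so rankings should be compared at the same dimensionality.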

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.7775
cosine_accuracy@3 0.8885
cosine_accuracy@5 0.917
cosine_accuracy@10 0.947
cosine_precision@1 0.7775
cosine_precision@3 0.2962
cosine_precision@5 0.1834
cosine_precision@10 0.0947
cosine_recall@1 0.7775
cosine_recall@3 0.8885
cosine_recall@5 0.917
cosine_recall@10 0.947
cosine_ndcg@10 0.8638
cosine_mrr@10 0.8369
cosine_map@100 0.8394

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.7785
cosine_accuracy@3 0.8825
cosine_accuracy@5 0.917
cosine_accuracy@10 0.944
cosine_precision@1 0.7785
cosine_precision@3 0.2942
cosine_precision@5 0.1834
cosine_precision@10 0.0944
cosine_recall@1 0.7785
cosine_recall@3 0.8825
cosine_recall@5 0.917
cosine_recall@10 0.944
cosine_ndcg@10 0.8624
cosine_mrr@10 0.836
cosine_map@100 0.8389

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.7555
cosine_accuracy@3 0.8655
cosine_accuracy@5 0.9145
cosine_accuracy@10 0.943
cosine_precision@1 0.7555
cosine_precision@3 0.2885
cosine_precision@5 0.1829
cosine_precision@10 0.0943
cosine_recall@1 0.7555
cosine_recall@3 0.8655
cosine_recall@5 0.9145
cosine_recall@10 0.943
cosine_ndcg@10 0.85
cosine_mrr@10 0.8199
cosine_map@100 0.8225

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.714
cosine_accuracy@3 0.8365
cosine_accuracy@5 0.877
cosine_accuracy@10 0.9285
cosine_precision@1 0.714
cosine_precision@3 0.2788
cosine_precision@5 0.1754
cosine_precision@10 0.0929
cosine_recall@1 0.714
cosine_recall@3 0.8365
cosine_recall@5 0.877
cosine_recall@10 0.9285
cosine_ndcg@10 0.8196
cosine_mrr@10 0.7848
cosine_map@100 0.7878
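
In every table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example, 0.8885 / 3 ≈ 0.2962), which is what one would expect if each query has exactly one relevant passage. The following sketch computes the per-query versions of these metrics under that reading. `ir_metrics` is a hypothetical helper and the document ids are made up; the reported values would be averages of such per-query scores.

```python
import math


def ir_metrics(ranked_ids, relevant_ids, k):
    """Per-query retrieval metrics at cutoff k for a ranked list of ids."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    accuracy = 1.0 if any(hits) else 0.0          # any relevant doc in top k
    precision = sum(hits) / k                     # fraction of top k that is relevant
    recall = sum(hits) / len(relevant_ids)        # fraction of relevant docs found

    # Reciprocal rank of the first relevant document (0 if none in top k).
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # Binary-relevance NDCG: discounted gain divided by the ideal ordering's gain.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mrr": mrr,
        "ndcg": ndcg,
    }


# One query whose single relevant passage is ranked second.
metrics = ir_metrics(["d7", "d2", "d9"], {"d2"}, k=3)
print(metrics)
```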

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,000 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor (type: string)
    • min: 8 tokens
    • mean: 20.92 tokens
    • max: 51 tokens
    positive (type: string)
    • min: 30 tokens
    • mean: 116.22 tokens
    • max: 227 tokens
  • Samples:
    anchor: What are the common clinical features and diagnostic criteria of relapsing polychondritis?
    positive: Lethal complications of relapsing polychondritis are often associated with airway or cardiovascular involvement. This can include complications such as aortic incompetence, mitral regurgitation, pericarditis, cardiac ischemia, aneurysms of large arteries, vasculitis of the central nervous system, phlebitis, and Raynaud's phenomenon. Neurological and renal system involvement can also occur, although it is rare. Regular follow-up and management are important to monitor and prevent potential complications in patients with relapsing polychondritis.

    anchor: What are the treatment options for relapsing polychondritis?
    positive: Lethal complications of relapsing polychondritis are often associated with airway or cardiovascular involvement. This can include complications such as aortic incompetence, mitral regurgitation, pericarditis, cardiac ischemia, aneurysms of large arteries, vasculitis of the central nervous system, phlebitis, and Raynaud's phenomenon. Neurological and renal system involvement can also occur, although it is rare. Regular follow-up and management are important to monitor and prevent potential complications in patients with relapsing polychondritis.

    anchor: What are the potential complications associated with relapsing polychondritis?
    positive: Lethal complications of relapsing polychondritis are often associated with airway or cardiovascular involvement. This can include complications such as aortic incompetence, mitral regurgitation, pericarditis, cardiac ischemia, aneurysms of large arteries, vasculitis of the central nervous system, phlebitis, and Raynaud's phenomenon. Neurological and renal system involvement can also occur, although it is rare. Regular follow-up and management are important to monitor and prevent potential complications in patients with relapsing polychondritis.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
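
The configuration above can be read as: compute MultipleNegativesRankingLoss on the embeddings truncated to each of the listed dimensions and add the results with the given weights (`n_dims_per_step: -1` means all dimensions are used at every step). The sketch below illustrates that idea on toy vectors. It is not the library's code: the function names are hypothetical, the vectors are made up, and the similarity scale of 20.0 is an assumption, since the card does not state it.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: each anchor should score its own positive highest.

    For anchor i, every positive j != i in the batch acts as a negative.
    The loss is the mean cross-entropy over scaled cosine similarities.
    `scale=20.0` is an assumed value, not taken from the model card.
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(a)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated copies of the embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * mnr_loss(
            [v[:dim] for v in anchors],
            [v[:dim] for v in positives],
        )
    return total


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
positives = [[0.9, 0.1, 0.0, 0.0], [0.1, 0.9, 0.0, 0.0]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(f"aligned pairs: {aligned:.6f}, swapped pairs: {swapped:.4f}")
```

Correctly paired batches give a loss near zero, while mismatched pairs are penalized heavily at every truncation size, which is what pushes the leading dimensions to carry useful signal on their own.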
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • gradient_accumulation_steps: 4
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • warmup_steps: 0.1
  • bf16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
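
As a rough check on these settings, the arithmetic below derives the effective batch size, the number of optimizer steps, and the learning rate schedule. It is a sketch under stated assumptions: a single device is assumed (`num_devices = 1` is not given in the card), the warmup length is taken as 10% of the steps rounded up, and `lr_at` is a hypothetical helper that mimics linear warmup followed by cosine decay rather than calling the trainer's scheduler. The resulting 32 steps agree with the step count in the training logs.

```python
import math

num_samples = 2000          # training set size from the card
per_device_batch = 16       # per_device_train_batch_size
grad_accum = 4              # gradient_accumulation_steps
num_devices = 1             # assumption: single-GPU training

effective_batch = per_device_batch * grad_accum * num_devices
steps_per_epoch = math.ceil(num_samples / effective_batch)
warmup_steps = math.ceil(0.1 * steps_per_epoch)  # assumed rounding


def lr_at(step, peak_lr=2e-5):
    """Learning rate after `step` optimizer updates: linear warmup, cosine decay."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    decay_steps = max(1, steps_per_epoch - warmup_steps)
    progress = (step - warmup_steps) / decay_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"warmup steps: {warmup_steps}")
for step in (0, warmup_steps, steps_per_epoch // 2, steps_per_epoch):
    print(f"step {step:2d}: lr = {lr_at(step):.2e}")
```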

All Hyperparameters

  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.1
  • warmup_steps: 0.1
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: True
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
-1 -1 - 0.8142 0.8058 0.7676 0.7053
0.032 1 1.5764 0.8146 0.8055 0.7669 0.7049
0.064 2 2.6620 0.8162 0.8077 0.7690 0.7086
0.096 3 1.9032 0.8204 0.8126 0.7759 0.7173
0.128 4 1.6601 0.8252 0.8177 0.7849 0.7282
0.16 5 1.1083 0.8315 0.8251 0.7902 0.7419
0.192 6 2.7345 0.8361 0.8317 0.7970 0.7510
0.224 7 1.2922 0.8375 0.8351 0.8025 0.7620
0.256 8 1.6647 0.8399 0.8367 0.8080 0.7686
0.288 9 1.1997 0.8425 0.8398 0.8133 0.7754
0.32 10 0.8064 0.8441 0.8419 0.8181 0.7799
0.352 11 1.1935 0.8468 0.8442 0.8220 0.7843
0.384 12 0.7776 0.8482 0.8462 0.8242 0.7886
0.416 13 0.9272 0.8494 0.8484 0.8261 0.7940
0.448 14 1.2406 0.8510 0.8502 0.8294 0.7978
0.48 15 1.0830 0.8520 0.8518 0.8325 0.7999
0.512 16 1.9336 0.8534 0.8532 0.8340 0.8017
0.544 17 1.2190 0.8541 0.8537 0.8360 0.8026
0.576 18 1.7060 0.8554 0.8545 0.8388 0.8063
0.608 19 1.4131 0.8571 0.8561 0.8412 0.8084
0.64 20 1.1700 0.8581 0.8569 0.8429 0.8101
0.672 21 0.5671 0.8599 0.8580 0.8445 0.8118
0.704 22 1.4699 0.8613 0.8596 0.8455 0.8140
0.736 23 1.6544 0.8620 0.8608 0.8463 0.8158
0.768 24 2.0854 0.8624 0.8614 0.8476 0.8169
0.8 25 0.9175 0.8630 0.8616 0.8484 0.8180
0.832 26 1.3673 0.8632 0.8615 0.8485 0.8182
0.864 27 1.2114 0.8637 0.8617 0.8491 0.8190
0.896 28 0.9807 0.8637 0.8620 0.8497 0.8190
0.928 29 0.9052 0.8635 0.8620 0.8497 0.8192
0.96 30 1.7420 0.8640 0.8624 0.8500 0.8194
0.992 31 1.3071 0.8640 0.8622 0.8497 0.8193
1.0 32 1.3117 0.8638 0.8624 0.8500 0.8196
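
To summarize the table, the short script below compares the first row (epoch -1, step -1, which appears to be the evaluation before any training step) with the final row. The numbers are copied from the log; the variable names are only illustrative.

```python
# NDCG@10 per embedding dimension, copied from the first and last log rows.
baseline = {768: 0.8142, 512: 0.8058, 128: 0.7676, 64: 0.7053}
final = {768: 0.8638, 512: 0.8624, 128: 0.8500, 64: 0.8196}

gains = {}
for dim in baseline:
    gains[dim] = final[dim] - baseline[dim]
    retained = final[dim] / final[768]  # share of the full-size score kept
    print(
        f"dim {dim:>3}: NDCG@10 {baseline[dim]:.4f} -> {final[dim]:.4f} "
        f"(+{gains[dim]:.4f}), {retained:.1%} of the 768-dim score"
    )
```

The smaller dimensions gain the most from fine-tuning, which is consistent with the Matryoshka objective training each truncation size explicitly.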

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.3
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

  • Model ID: tien314/miriad-embedding
  • Model size: 0.1B params
  • Tensor type: F32 (Safetensors)