mpnet_MH_embedding / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:4615
  - loss:TripletLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
  - source_sentence: Do you ever feel like you have failed in life or let yourself down?
    sentences:
      - >-
        But I just don't feel like even getting started because I know that I
        will fail again.
      - I cant remember the last time I felt happiness.
      - That was their biggest and last mistake.
  - source_sentence: Do you feel sad or unhappy?
    sentences:
      - I have been depressed since late September so I feel you.
      - I share a lot of your traits, and considered myself a failure too.
      - He conveys that feeling of regret so well I can feel it everytime
  - source_sentence: Do you feel hopeful about your future or do things seem hopeless?
    sentences:
      - >-
        I'm pretty optimistic though since the pace of technological growth is
        accelerating so rapidly.
      - >-
        [For a clickable image, click
        here](http://futurism.com/thisweekinscience)


        [To get these images directly to your inbox, sign up
        here](http://futurism.com/subscribe)


        _


        Sources | Reddit

        --- | ---

        [Oldest and Furthest
        Galaxy](http://futurism.com/links/astronomers-discover-the-oldest-and-farthest-known-galaxy/)
        |
        [Reddit](https://www.reddit.com/r/science/comments/3jypyf/researchers_find_132_billion_yearold_galaxy_in/)

        [3D Printed Ribs
        ](http://futurism.com/links/these-3d-printed-titanium-ribs-were-successfully-implanted-in-a-person/)
        |
        [Reddit](https://www.reddit.com/r/technology/comments/3kj8pf/patient_receives_3dprinted_titanium_sternum_and/?ref=search_posts)

        [Chinese Far Side of Moon]
        (http://m.phys.org/news/2015-09-china-aims-probe-moon-side.html)  |
        [Reddit](https://www.reddit.com/r/worldnews/comments/3kcsg5/china_to_explore_dark_side_of_the_moon_china_has/)

        [Rugby Ball
        Molecule](http://www.forbes.com/sites/carmendrahl/2015/09/02/giant-rugby-ball-new-interaction-chemistry/)
        |
        [Reddit](https://www.reddit.com/r/EverythingScience/comments/3krt22/this_giant_rugby_ball_contains_a_new_chemical/)

        [Measuring the
        Universe](http://astronomynow.com/2015/09/04/using-stellar-twins-to-climb-the-cosmic-distance-ladder/)
        |
        [Reddit](https://www.reddit.com/r/science/comments/3jum8c/astronomers_have_developed_a_new_highly_accurate/)

        [3D Printed Stethoscope
        ](http://futurism.com/links/3d-printed-stethoscopes-cost-as-little-as-2-50-and-are-just-as-good/)
        |
        [Reddit](https://www.reddit.com/r/news/comments/3kgboz/doctor_3d_prints_stethoscope_to_alleviate_supply/)

        [Giant Structure in
        Universe](http://phys.org/news/2015-09-giant-ring-like-universe.html) |
        [Reddit](https://www.reddit.com/r/EverythingScience/comments/3jzjlm/surprising_giant_ringlike_structure_in_the/)

        [Recoded Cell
        Factories](http://m.phys.org/news/2015-09-recoded-cells-factories-proteins.html)
        |
        [Reddit](https://www.reddit.com/r/EverythingScience/comments/3krux3/researchers_transform_recoded_cells_into/)
      - I do not expect things to work out for me.
  - source_sentence: Do you feel sad or unhappy?
    sentences:
      - Me everyday im depressing
      - And now I feel very alone and useless.
      - >-
        Sucks that I'm not the only one because others are suffering, but it's
        nice to know I'm not alone.
  - source_sentence: Do you feel sad or unhappy?
    sentences:
      - I cried because I lost not only my money, but because I lost myself.
      - Im not exactly depressed, at least not all of the time.
      - does anyone feel like they cant be sad
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
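
The Pooling and Normalize modules listed above can be illustrated without loading the model. The following is a minimal pure-Python sketch of masked mean pooling followed by L2 normalization; the function names `mean_pool` and `l2_normalize` and the 4-dimensional toy vectors are illustrative assumptions, not part of the library (the real model works on 768-dimensional token embeddings).

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input (hypothetical values): three token vectors, the last one is padding.
token_embeddings = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 2.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
attention_mask = [1, 1, 0]

pooled = mean_pool(token_embeddings, attention_mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the cosine similarity of two embeddings equals their dot product.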

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("FritzStack/mpnet_MH_embedding")
# Run inference
sentences = [
    'Do you feel sad or unhappy?',
    'Im not exactly depressed, at least not all of the time.',
    'does anyone feel like they cant be sad',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.7532, -0.4572],
#         [ 0.7532,  1.0000, -0.0545],
#         [-0.4572, -0.0545,  1.0000]])
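
The first row of the matrix above compares the question with the two other sentences. As a rough sketch of how such scores might be used for semantic search, the snippet below ranks the candidates by similarity to the question. The scores are copied from the printed output rather than recomputed, so it runs without downloading the model; the variable names are illustrative.

```python
# Sentences and scores copied from the usage example above.
query = "Do you feel sad or unhappy?"
candidates = [
    "Im not exactly depressed, at least not all of the time.",
    "does anyone feel like they cant be sad",
]
# Row 0 of the printed similarity matrix, without the self-similarity entry.
scores = [0.7532, -0.4572]

# Highest cosine similarity first.
ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
for sentence, score in ranked:
    print(f"{score:+.4f}  {sentence}")
```

With a loaded model, the same ranking would be applied to a row of `model.similarity(...)` instead of hard-coded values.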

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,615 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | anchor | positive | negative |
    |:--------|:-------|:---------|:---------|
    | type    | string | string | string |
    | details | min: 9 tokens, mean: 13.63 tokens, max: 17 tokens | min: 5 tokens, mean: 20.7 tokens, max: 169 tokens | min: 4 tokens, mean: 42.11 tokens, max: 384 tokens |
  • Samples:
    | anchor | positive | negative |
    |:-------|:---------|:---------|
    | Do you feel sad or unhappy? | I do not feel sad. | I've been suffering my whole life, and it's currently at its peak :( |
    | Do you feel sad or unhappy? | I feel sad much of the time. | Things will get better, just focus more in the positive rather than the negative |
    | Do you feel sad or unhappy? | I am sad all the time. | That's why I understand I'm terrible, because it's wrong I get annoyed by that, people should do what they want, but I just can't stand being alone. |
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.5
    }
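
To make the two parameters concrete, here is a minimal sketch of a triplet loss with cosine distance for a single (anchor, positive, negative) triplet, assuming the standard formulation max(0, d(a, p) - d(a, n) + margin) with d = 1 - cosine similarity. The helper names and the 2-dimensional toy vectors are illustrative assumptions; in training the loss would be averaged over a batch of real embeddings.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Zero once the negative is at least `margin` farther away than the positive."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical values).
anchor = [1.0, 0.0]
close = [1.0, 0.0]       # same direction as the anchor
orthogonal = [0.0, 1.0]  # cosine similarity 0 with the anchor

easy = triplet_loss(anchor, positive=close, negative=orthogonal)
hard = triplet_loss(anchor, positive=orthogonal, negative=close)
print(easy, hard)
```

The first triplet already satisfies the margin and contributes no loss; the second has its positive and negative swapped and is penalized.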
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 2
  • gradient_accumulation_steps: 8
  • warmup_steps: 100
  • fp16: True
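
These settings interact: a per-device batch of 2 with 8 accumulation steps gives an effective batch of 16 triplets per optimizer update. The sketch below works through that arithmetic for the 4,615 training samples and 3 epochs reported in this card. It assumes a single device and that a partial final accumulation group still counts as one update; both are assumptions, not facts stated in the card.

```python
import math

train_samples = 4615
per_device_train_batch_size = 2
gradient_accumulation_steps = 8
num_train_epochs = 3
num_devices = 1  # assumption: single-GPU training

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
# dataloader_drop_last is False, so the last partial batch is kept.
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * num_devices)
)
# Assumption: a trailing partial accumulation group counts as one update.
optimizer_steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_optimizer_steps = optimizer_steps_per_epoch * num_train_epochs

print(effective_batch_size, batches_per_epoch,
      optimizer_steps_per_epoch, total_optimizer_steps)
```

Under these assumptions the run takes roughly 289 optimizer steps per epoch, which is consistent with the training log reaching epoch 1.0035 at step 290.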

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 100
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
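
The combination of learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 100 describes a learning rate that ramps up linearly and then decays linearly to zero. A minimal sketch of that shape follows. The function name is illustrative, and `total_steps=867` is an assumption estimated from 4,615 samples, batch size 2, accumulation 8, and 3 epochs on a single device; it is not a value reported in this card.

```python
def linear_schedule_with_warmup(step, base_lr=5e-05, warmup_steps=100, total_steps=867):
    """Learning rate at a given optimizer step (total_steps is an assumed value)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 50, 100, 480, 867):
    print(step, linear_schedule_with_warmup(step))
```

The peak learning rate is reached at step 100, after which it falls steadily until the end of training.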

Training Logs

| Epoch | Step | Training Loss |
|:------|:-----|:--------------|
| 0.0347 | 10 | 0.3032 |
| 0.0693 | 20 | 0.2893 |
| 0.1040 | 30 | 0.2275 |
| 0.1386 | 40 | 0.1532 |
| 0.1733 | 50 | 0.1947 |
| 0.2080 | 60 | 0.1126 |
| 0.2426 | 70 | 0.1047 |
| 0.2773 | 80 | 0.1118 |
| 0.3120 | 90 | 0.0839 |
| 0.3466 | 100 | 0.1147 |
| 0.3813 | 110 | 0.111 |
| 0.4159 | 120 | 0.0754 |
| 0.4506 | 130 | 0.0964 |
| 0.4853 | 140 | 0.1269 |
| 0.5199 | 150 | 0.0795 |
| 0.5546 | 160 | 0.1042 |
| 0.5893 | 170 | 0.0797 |
| 0.6239 | 180 | 0.0685 |
| 0.6586 | 190 | 0.0819 |
| 0.6932 | 200 | 0.0802 |
| 0.7279 | 210 | 0.0934 |
| 0.7626 | 220 | 0.0865 |
| 0.7972 | 230 | 0.0731 |
| 0.8319 | 240 | 0.0486 |
| 0.8666 | 250 | 0.075 |
| 0.9012 | 260 | 0.0627 |
| 0.9359 | 270 | 0.0844 |
| 0.9705 | 280 | 0.0776 |
| 1.0035 | 290 | 0.0707 |
| 1.0381 | 300 | 0.0479 |
| 1.0728 | 310 | 0.05 |
| 1.1075 | 320 | 0.0317 |
| 1.1421 | 330 | 0.0263 |
| 1.1768 | 340 | 0.0321 |
| 1.2114 | 350 | 0.0221 |
| 1.2461 | 360 | 0.0337 |
| 1.2808 | 370 | 0.0301 |
| 1.3154 | 380 | 0.034 |
| 1.3501 | 390 | 0.0379 |
| 1.3847 | 400 | 0.0489 |
| 1.4194 | 410 | 0.0303 |
| 1.4541 | 420 | 0.0263 |
| 1.4887 | 430 | 0.0342 |
| 1.5234 | 440 | 0.0328 |
| 1.5581 | 450 | 0.0431 |
| 1.5927 | 460 | 0.0472 |
| 1.6274 | 470 | 0.0353 |
| 1.6620 | 480 | 0.0389 |
| 1.6967 | 490 | 0.0216 |
| 1.7314 | 500 | 0.0351 |
| 1.7660 | 510 | 0.0386 |
| 1.8007 | 520 | 0.039 |
| 1.8354 | 530 | 0.0264 |
| 1.8700 | 540 | 0.0295 |
| 1.9047 | 550 | 0.0329 |
| 1.9393 | 560 | 0.0487 |
| 1.9740 | 570 | 0.0287 |
| 2.0069 | 580 | 0.0306 |
| 2.0416 | 590 | 0.0171 |
| 2.0763 | 600 | 0.009 |
| 2.1109 | 610 | 0.017 |
| 2.1456 | 620 | 0.0252 |
| 2.1802 | 630 | 0.0123 |
| 2.2149 | 640 | 0.0144 |
| 2.2496 | 650 | 0.0187 |
| 2.2842 | 660 | 0.02 |
| 2.3189 | 670 | 0.0065 |
| 2.3536 | 680 | 0.0131 |
| 2.3882 | 690 | 0.0138 |
| 2.4229 | 700 | 0.0111 |
| 2.4575 | 710 | 0.0108 |
| 2.4922 | 720 | 0.0079 |
| 2.5269 | 730 | 0.0062 |
| 2.5615 | 740 | 0.0105 |
| 2.5962 | 750 | 0.0095 |
| 2.6308 | 760 | 0.0112 |
| 2.6655 | 770 | 0.0052 |
| 2.7002 | 780 | 0.0103 |
| 2.7348 | 790 | 0.0108 |
| 2.7695 | 800 | 0.0059 |
| 2.8042 | 810 | 0.0099 |
| 2.8388 | 820 | 0.0142 |
| 2.8735 | 830 | 0.0112 |
| 2.9081 | 840 | 0.0194 |
| 2.9428 | 850 | 0.0128 |
| 2.9775 | 860 | 0.0093 |

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}