---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:4615
- loss:TripletLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: Do you ever feel like you have failed in life or let yourself down?
  sentences:
  - But I just don't feel like even getting started because I know that I will fail again.
  - I cant remember the last time I felt happiness.
  - That was their biggest and last mistake.
- source_sentence: Do you feel sad or unhappy?
  sentences:
  - I have been depressed since late September so I feel you.
  - I share a lot of your traits, and considered myself a failure too.
  - He conveys that feeling of regret so well I can feel it everytime
- source_sentence: Do you feel hopeful about your future or do things seem hopeless?
  sentences:
  - I'm pretty optimistic though since the pace of technological growth is accelerating so rapidly.
  - '[For a clickable image, click here](http://futurism.com/thisweekinscience) [To get these images directly to your inbox, sign up here](http://futurism.com/subscribe) _ Sources | Reddit --- | --- [Oldest and Furthest Galaxy](http://futurism.com/links/astronomers-discover-the-oldest-and-farthest-known-galaxy/) | [Reddit](https://www.reddit.com/r/science/comments/3jypyf/researchers_find_132_billion_yearold_galaxy_in/) [3D Printed Ribs ](http://futurism.com/links/these-3d-printed-titanium-ribs-were-successfully-implanted-in-a-person/) | [Reddit](https://www.reddit.com/r/technology/comments/3kj8pf/patient_receives_3dprinted_titanium_sternum_and/?ref=search_posts) [Chinese Far Side of Moon] (http://m.phys.org/news/2015-09-china-aims-probe-moon-side.html) | [Reddit](https://www.reddit.com/r/worldnews/comments/3kcsg5/china_to_explore_dark_side_of_the_moon_china_has/) [Rugby Ball Molecule](http://www.forbes.com/sites/carmendrahl/2015/09/02/giant-rugby-ball-new-interaction-chemistry/) | [Reddit](https://www.reddit.com/r/EverythingScience/comments/3krt22/this_giant_rugby_ball_contains_a_new_chemical/) [Measuring the Universe](http://astronomynow.com/2015/09/04/using-stellar-twins-to-climb-the-cosmic-distance-ladder/) | [Reddit](https://www.reddit.com/r/science/comments/3jum8c/astronomers_have_developed_a_new_highly_accurate/) [3D Printed Stethoscope ](http://futurism.com/links/3d-printed-stethoscopes-cost-as-little-as-2-50-and-are-just-as-good/) | [Reddit](https://www.reddit.com/r/news/comments/3kgboz/doctor_3d_prints_stethoscope_to_alleviate_supply/) [Giant Structure in Universe](http://phys.org/news/2015-09-giant-ring-like-universe.html) | [Reddit](https://www.reddit.com/r/EverythingScience/comments/3jzjlm/surprising_giant_ringlike_structure_in_the/) [Recoded Cell Factories](http://m.phys.org/news/2015-09-recoded-cells-factories-proteins.html) | [Reddit](https://www.reddit.com/r/EverythingScience/comments/3krux3/researchers_transform_recoded_cells_into/)'
  - I do not expect things to work out for me.
- source_sentence: Do you feel sad or unhappy?
  sentences:
  - Me everyday im depressing
  - And now I feel very alone and useless.
  - Sucks that I'm not the only one because others are suffering, but it's nice to know I'm not alone.
- source_sentence: Do you feel sad or unhappy?
  sentences:
  - I cried because I lost not only my money, but because I lost myself.
  - Im not exactly depressed, at least not all of the time.
  - does anyone feel like they cant be sad
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2).
It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("FritzStack/mpnet_MH_embedding")

# Run inference
sentences = [
    'Do you feel sad or unhappy?',
    'Im not exactly depressed, at least not all of the time.',
    'does anyone feel like they cant be sad',
]
embeddings = model.encode(sentences)  # NumPy array by default
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.7532, -0.4572],
#         [ 0.7532,  1.0000, -0.0545],
#         [-0.4572, -0.0545,  1.0000]])
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,615 training samples
* Columns: `anchor`, `positive`, and `negative`
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details |        |          |          |

* Samples:

  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | Do you feel sad or unhappy? | I do not feel sad. | I've been suffering my whole life, and it's currently at its peak :( |
  | Do you feel sad or unhappy? | I feel sad much of the time. | Things will get better, just focus more in the positive rather than the negative |
  | Do you feel sad or unhappy? | I am sad all the time. | That's why I understand I'm terrible, because it's wrong I get annoyed by that, people should do what they want, but I just can't stand being alone. |

* Loss: [TripletLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:

  ```json
  {
      "distance_metric": "TripletDistanceMetric.COSINE",
      "triplet_margin": 0.5
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 2
- `gradient_accumulation_steps`: 8
- `warmup_steps`: 100
- `fp16`: True

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 2
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 8
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 100
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
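
To relate these settings to the Epoch and Step columns of the training logs, the arithmetic can be sketched as follows. This is a minimal sketch, not the trainer's own code: it assumes training ran on a single device (the card does not state the device count) and that a trailing partial accumulation window still counts as one optimizer step. The variable names are illustrative.

```python
import math

# Values taken from this card.
num_samples = 4615
per_device_train_batch_size = 2
gradient_accumulation_steps = 8
num_train_epochs = 3

# Assumption (not stated in the card): a single training device.
num_devices = 1

# Samples consumed per full optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# dataloader_drop_last is False, so the final short batch is kept.
batches_per_epoch = math.ceil(
    num_samples / (per_device_train_batch_size * num_devices)
)

# Assumption: a trailing partial accumulation window still triggers a step.
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs

# Fractional epoch reached after 10 optimizer steps (first logged row).
epoch_at_step_10 = 10 * gradient_accumulation_steps / batches_per_epoch

print(effective_batch_size)         # 16
print(batches_per_epoch)            # 2308
print(steps_per_epoch)              # 289
print(total_steps)                  # 867
print(round(epoch_at_step_10, 4))   # 0.0347
```

Under these assumptions, step 10 lands at epoch 0.0347, which agrees with the first logged row, and the last logged step (860) falls just short of the 867 total steps. The 100 warmup steps therefore cover roughly the first third of the first epoch.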

### Training Logs

| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.0347 | 10   | 0.3032        |
| 0.0693 | 20   | 0.2893        |
| 0.1040 | 30   | 0.2275        |
| 0.1386 | 40   | 0.1532        |
| 0.1733 | 50   | 0.1947        |
| 0.2080 | 60   | 0.1126        |
| 0.2426 | 70   | 0.1047        |
| 0.2773 | 80   | 0.1118        |
| 0.3120 | 90   | 0.0839        |
| 0.3466 | 100  | 0.1147        |
| 0.3813 | 110  | 0.111         |
| 0.4159 | 120  | 0.0754        |
| 0.4506 | 130  | 0.0964        |
| 0.4853 | 140  | 0.1269        |
| 0.5199 | 150  | 0.0795        |
| 0.5546 | 160  | 0.1042        |
| 0.5893 | 170  | 0.0797        |
| 0.6239 | 180  | 0.0685        |
| 0.6586 | 190  | 0.0819        |
| 0.6932 | 200  | 0.0802        |
| 0.7279 | 210  | 0.0934        |
| 0.7626 | 220  | 0.0865        |
| 0.7972 | 230  | 0.0731        |
| 0.8319 | 240  | 0.0486        |
| 0.8666 | 250  | 0.075         |
| 0.9012 | 260  | 0.0627        |
| 0.9359 | 270  | 0.0844        |
| 0.9705 | 280  | 0.0776        |
| 1.0035 | 290  | 0.0707        |
| 1.0381 | 300  | 0.0479        |
| 1.0728 | 310  | 0.05          |
| 1.1075 | 320  | 0.0317        |
| 1.1421 | 330  | 0.0263        |
| 1.1768 | 340  | 0.0321        |
| 1.2114 | 350  | 0.0221        |
| 1.2461 | 360  | 0.0337        |
| 1.2808 | 370  | 0.0301        |
| 1.3154 | 380  | 0.034         |
| 1.3501 | 390  | 0.0379        |
| 1.3847 | 400  | 0.0489        |
| 1.4194 | 410  | 0.0303        |
| 1.4541 | 420  | 0.0263        |
| 1.4887 | 430  | 0.0342        |
| 1.5234 | 440  | 0.0328        |
| 1.5581 | 450  | 0.0431        |
| 1.5927 | 460  | 0.0472        |
| 1.6274 | 470  | 0.0353        |
| 1.6620 | 480  | 0.0389        |
| 1.6967 | 490  | 0.0216        |
| 1.7314 | 500  | 0.0351        |
| 1.7660 | 510  | 0.0386        |
| 1.8007 | 520  | 0.039         |
| 1.8354 | 530  | 0.0264        |
| 1.8700 | 540  | 0.0295        |
| 1.9047 | 550  | 0.0329        |
| 1.9393 | 560  | 0.0487        |
| 1.9740 | 570  | 0.0287        |
| 2.0069 | 580  | 0.0306        |
| 2.0416 | 590  | 0.0171        |
| 2.0763 | 600  | 0.009         |
| 2.1109 | 610  | 0.017         |
| 2.1456 | 620  | 0.0252        |
| 2.1802 | 630  | 0.0123        |
| 2.2149 | 640  | 0.0144        |
| 2.2496 | 650  | 0.0187        |
| 2.2842 | 660  | 0.02          |
| 2.3189 | 670  | 0.0065        |
| 2.3536 | 680  | 0.0131        |
| 2.3882 | 690  | 0.0138        |
| 2.4229 | 700  | 0.0111        |
| 2.4575 | 710  | 0.0108        |
| 2.4922 | 720  | 0.0079        |
| 2.5269 | 730  | 0.0062        |
| 2.5615 | 740  | 0.0105        |
| 2.5962 | 750  | 0.0095        |
| 2.6308 | 760  | 0.0112        |
| 2.6655 | 770  | 0.0052        |
| 2.7002 | 780  | 0.0103        |
| 2.7348 | 790  | 0.0108        |
| 2.7695 | 800  | 0.0059        |
| 2.8042 | 810  | 0.0099        |
| 2.8388 | 820  | 0.0142        |
| 2.8735 | 830  | 0.0112        |
| 2.9081 | 840  | 0.0194        |
| 2.9428 | 850  | 0.0128        |
| 2.9775 | 860  | 0.0093        |

### Framework Versions

- Python: 3.12.12
- Sentence Transformers: 5.1.1
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu126
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### TripletLoss

```bibtex
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
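
As a reading aid for the TripletLoss objective cited here and configured under Training Details (cosine distance, margin 0.5), the per-triplet computation can be sketched in plain Python. The function names and the toy 2-D vectors are illustrative and are not part of the Sentence Transformers API; the library computes the same quantity on tensors and averages it over the batch.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, defined as 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """max(d(a, p) - d(a, n) + margin, 0) for a single triplet."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Toy vectors standing in for 768-dimensional embeddings (illustrative only).
anchor = [1.0, 0.0]
positive = [1.0, 0.0]   # same direction as the anchor
negative = [0.0, 1.0]   # orthogonal to the anchor

# Well-separated triplet: d_pos = 0, d_neg = 1, so the loss is clipped to 0.
easy = triplet_loss(anchor, positive, negative)
# Swapped roles: d_pos = 1, d_neg = 0, so the loss is 1 - 0 + 0.5.
hard = triplet_loss(anchor, negative, positive)

print(easy, hard)  # 0.0 1.5
```

The loss reaches zero once the negative is at least 0.5 farther from the anchor (in cosine distance) than the positive, so triplets that are already well separated contribute no gradient.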