SentenceTransformer based on sucharush/e5_stem_finetuned

This is a sentence-transformers model fine-tuned from sucharush/e5_stem_finetuned. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, and clustering.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sucharush/e5_stem_finetuned
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
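
The Pooling and Normalize stages listed above can be illustrated with a small NumPy sketch. This is not the library's implementation: `mean_pool_and_normalize` and the toy arrays are hypothetical names used only for illustration, and the sketch assumes the transformer has already produced per-token embeddings together with an attention mask.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Masked mean pooling followed by L2 normalization (illustrative sketch)."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # padding contributes zero
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    pooled = summed / counts                                         # mean over real tokens
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)                      # unit-length vectors

# Toy input: 2 sequences, 3 token positions, 4 dimensions (the real model uses 768).
tokens = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])  # the first sequence has one padding position
sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)
```

Because the output vectors have unit length, a dot product between two of them equals their cosine similarity.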

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Stergios-Konstantinidis/MNLP_M3_tokenizer_tuned")
# Run inference
sentences = [
    '[{\'from\': \'human\', \'value\': \'Solve the following math problem step-by-step.\\nSimplify your answer as much as possible. Present your final answer as \\\\boxed{Your Answer}.\\nA pentagon is drawn by placing an isosceles right triangle on top of a square as pictured. What percent of the area of the pentagon is the area of the right triangle?\\n\\n[asy]\\nsize(50);\\ndraw((0,0)--(0,-1)--(1,-1)--(1,0)--(0,0)--(.5,.5)--(1,0));\\n[/asy]\'}, {\'from\': \'gpt\', \'value\': "Step 1:  We are given a pentagon formed by placing an isosceles right triangle on top of a square.\\nStep 2:  To find the percentage of the area of the pentagon that is the right triangle, we need to consider the areas of both shapes and then calculate the ratio of the triangle\'s area to the total pentagon\'s area.\\nStep 3:  Let\'s designate the leg length of the isosceles right triangle as $x$. This means the hypotenuse of the triangle has a length of $x\\\\sqrt{2}$.\\nStep 4:  Since the hypotenuse of the triangle is a side of the square, the area of the square is $(x\\\\sqrt{2})^2 = 2x^2$. \\nStep 5:  On the other hand, the area of the triangle is $(x)(x)/2 = x^2/2$. This is obtained using the standard formula for the area of a triangle, which is base times height divided by 2.\\nStep 6:  So, the area of the pentagon is the sum of the areas of the triangle and the square, which gives us $\\\\frac{x^2}{2} + 2x^2 = \\\\frac{5x^2}{2}$.\\nStep 7:  Now, to find the percentage of the pentagon\'s area that is inside the triangle, we calculate the ratio of the triangle\'s area to the total pentagon\'s area. This can be computed as $\\\\frac{x^2/2}{5x^2/2} =\\\\frac{x^2}{2}\\\\cdot \\\\frac{2}{5x^2} = \\\\frac15$.\\nStep 8:  Therefore, the percentage of the area of the pentagon that is the right triangle is $\\\\frac{1}{5}$, which is equivalent to $\\\\boxed{20\\\\%}$.\\nStep 9:  As an alternate solution, we can consider drawing the two diagonals of the square and observe the resulting geometric properties to arrive at the same conclusion.\\n\\nAnswer:\\n\\\\boxed{20\\\\%}"}]',
    '[{\'from\': \'human\', \'value\': \'Solve the following math problem step-by-step.\\nSimplify your answer as much as possible. Present your final answer as \\\\boxed{Your Answer}.\\nA pentagon is drawn by placing an isosceles right triangle on top of a square as pictured. What percent of the area of the pentagon is the area of the right triangle?\\n\\n[asy]\\nsize(50);\\ndraw((0,0)--(0,-1)--(1,-1)--(1,0)--(0,0)--(.5,.5)--(1,0));\\n[/asy]\'}, {\'from\': \'gpt\', \'value\': "Step 1:  We are given a pentagon formed by placing an isosceles right triangle on top of a square.\\nStep 2:  To find the percentage of the area of the pentagon that is the right triangle, we need to consider the areas of both shapes and then calculate the ratio of the triangle\'s area to the total pentagon\'s area.\\nStep 3:  Let\'s designate the leg length of the isosceles right triangle as $x$. This means the hypotenuse of the triangle has a length of $x\\\\sqrt{2}$.\\nStep 4:  Since the hypotenuse of the triangle is a side of the square, the area of the square is $(x\\\\sqrt{2})^2 = 2x^2$. \\nStep 5:  On the other hand, the area of the triangle is $(x)(x)/2 = x^2/2$. This is obtained using the standard formula for the area of a triangle, which is base times height divided by 2.\\nStep 6:  So, the area of the pentagon is the sum of the areas of the triangle and the square, which gives us $\\\\frac{x^2}{2} + 2x^2 = \\\\frac{5x^2}{2}$.\\nStep 7:  Now, to find the percentage of the pentagon\'s area that is inside the triangle, we calculate the ratio of the triangle\'s area to the total pentagon\'s area. This can be computed as $\\\\frac{x^2/2}{5x^2/2} =\\\\frac{x^2}{2}\\\\cdot \\\\frac{2}{5x^2} = \\\\frac15$.\\nStep 8:  Therefore, the percentage of the area of the pentagon that is the right triangle is $\\\\frac{1}{5}$, which is equivalent to $\\\\boxed{20\\\\%}$.\\nStep 9:  As an alternate solution, we can consider drawing the two diagonals of the square and observe the resulting geometric properties to arrive at the same conclusion.\\n\\nAnswer:\\n\\\\boxed{20\\\\%}"}]',
    'Question: where are the basilar membrane and the organ of corti located, Answer: Organ of Corti The organ of Corti is located in the scala media of the cochlea of the inner ear between the vestibular duct and the tympanic duct and is composed of mechanosensory cells, known as hair cells.[2] Strategically positioned on the basilar membrane of the organ of Corti are three rows of outer hair cells (OHCs) and one row of inner hair cells (IHCs).[4] Separating these hair cells are supporting cells: Deiters cells, also called phalangeal cells, which separate and support both the OHCs and the IHCs.[4]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
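
Since the architecture ends with a Normalize() layer, ranking documents against a query can be done with plain dot products. The following is a minimal sketch of that idea, assuming unit-length embeddings; `rank_by_similarity` is a hypothetical helper, and random vectors stand in for the output of `model.encode` so the snippet runs without downloading the model.

```python
import numpy as np

def rank_by_similarity(query_vec, corpus_vecs):
    """Return corpus indices sorted from most to least similar, plus raw scores."""
    scores = corpus_vecs @ query_vec   # dot product equals cosine for unit vectors
    order = np.argsort(-scores)        # descending order of similarity
    return order, scores

# Stand-in embeddings (assumption): 5 random documents in 768 dimensions.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(5, 768))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)  # mimic the Normalize() layer
query = corpus[2].copy()                                 # query identical to document 2

order, scores = rank_by_similarity(query, corpus)
print(order[0], round(float(scores[order[0]]), 4))
```

With real data, `corpus` and `query` would come from `model.encode(...)`, and the top entries of `order` would be the nearest documents.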

Training Details

Training Dataset

Unnamed Dataset

  • Size: 113,450 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (type: string): min 17 tokens, mean 283.42 tokens, max 512 tokens
    • sentence_1 (type: string): min 16 tokens, mean 281.46 tokens, max 512 tokens
    • label (type: int): 0: ~80.00%, 1: ~20.00%
  • Samples:
    • Sample 1 (label: 1)
      sentence_0: Question: where does rasin in the sun take place, Answer: A Raisin in the Sun A Raisin in the Sun is a play by Lorraine Hansberry that debuted on Broadway in 1959.[1] The title comes from the poem "Harlem" (also known as "A Dream Deferred"[2]) by Langston Hughes. The story tells of a black family's experiences in the Washington Park Subdivision of Chicago's Woodlawn neighborhood as they attempt to "better" themselves with an insurance payout following the death of the father. The New York Drama Critics' Circle named it the best play of 1959.
      sentence_1: Question: where does rasin in the sun take place, Answer: A Raisin in the Sun A Raisin in the Sun is a play by Lorraine Hansberry that debuted on Broadway in 1959.[1] The title comes from the poem "Harlem" (also known as "A Dream Deferred"[2]) by Langston Hughes. The story tells of a black family's experiences in the Washington Park Subdivision of Chicago's Woodlawn neighborhood as they attempt to "better" themselves with an insurance payout following the death of the father. The New York Drama Critics' Circle named it the best play of 1959.
    • Sample 2 (label: 0)
      sentence_0: Question: when does the movie midnight sun come out, Answer: Midnight Sun (2018 film) Midnight Sun is a 2018 American romantic drama film directed by Scott Speer and written by Eric Kirsten, based on the 2006 Japanese film of the same name. The film stars Bella Thorne, Patrick Schwarzenegger, and Rob Riggle, and follows a teenage girl with the disease xeroderma pigmentosum, which prevents her from going out into sunlight. When she meets a boy, she struggles to decide whether to tell him about her condition or pretend to live a normal life. Principal photography began on October 12, 2015 in Vancouver, British Columbia, Canada. The film was released in the United States on March 23, 2018.
      sentence_1: Question: who sang the song unchained melody in the movie ghost, Answer: Unchained Melody "Unchained Melody" reappeared on the Billboard charts in 1990 after The Righteous Brothers' recording was used in the box office blockbuster film Ghost. Two versions charted in the US that year – the original and a new recording. According to Medley, he was interested in having the original recording released due to the renewed interest in the song, but was told that there were licensing issues. Although Hatfield was no longer in the same condition vocally as when he first recorded the song, they decided to re-record the song for Curb Records.[44] The re-recorded version was released as both a cassette single[45] and a CD single.[46] It received minimal airplay but recorded excellent sales, peaking at number 19. The re-recorded version was certified Platinum by the RIAA on January 10, 1991,[47] and received a Grammy Award nomination.[48]
    • Sample 3 (label: 0)
      sentence_0: Question: where is the o2 arena in london located, Answer: The O2 Arena The O2 Arena (temporarily the sponsor neutral "North Greenwich Arena", during the 2012 Summer Olympics and 2012 Summer Paralympics), is a multi purpose indoor arena located in the centre of The O2 entertainment complex on the Greenwich Peninsula in south east London.
      sentence_1: [{'from': 'human', 'value': "Explain how Archimedes' Principle enables objects to float in water."}, {'from': 'gpt', 'value': "Archimedes' Principle states that the buoyant force acting on an object submerged in water is equal to the weight of the water displaced by the object. If the buoyant force exceeds the weight of the object, it will float."}]
  • Loss: ContrastiveTensionLoss
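
As a rough illustration of the objective named above, the sketch below follows the Contrastive Tension formulation cited at the end of this card: the two sentences of a pair are embedded by two separate copies of the encoder, the dot product of the two embeddings is treated as a logit, and a binary cross-entropy is taken against the label (1 for identical sentences, 0 otherwise). This is a NumPy sketch, not the library's code; `contrastive_tension_loss` and the toy vectors are hypothetical, and summing over the batch is an assumption about the reduction.

```python
import numpy as np

def contrastive_tension_loss(emb_a, emb_b, labels):
    """Binary cross-entropy on per-pair dot products, summed over the batch."""
    logits = np.sum(emb_a * emb_b, axis=1)  # one logit per sentence pair
    # Numerically stable BCE-with-logits: max(x, 0) - x*y + log(1 + exp(-|x|))
    per_pair = np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    return per_pair.sum()

# Toy embeddings from the two encoder copies (assumption: 2-dimensional for brevity).
emb_a = np.array([[1.0, 0.0], [0.0, 1.0]])
emb_b = np.array([[1.0, 0.0], [1.0, 0.0]])
labels = np.array([1.0, 0.0])               # first pair identical, second pair different

loss = contrastive_tension_loss(emb_a, emb_b, labels)
print(round(float(loss), 4))
```

In the toy batch, the first pair has logit 1 with label 1 and the second has logit 0 with label 0, so the loss is log(1 + e^-1) + log 2.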

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
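
These settings determine the number of optimizer steps. The arithmetic below is a small sketch that assumes training on a single device with no gradient accumulation; the sample count is taken from the Training Dataset section, and the result can be checked against the Epoch and Step columns of the training logs.

```python
import math

train_samples = 113_450  # size reported in the Training Dataset section
batch_size = 10          # per_device_train_batch_size
epochs = 10              # num_train_epochs

# Assumption: one device and gradient_accumulation_steps = 1.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)

# Fraction of an epoch completed after 500 steps.
epoch_at_500 = round(500 / steps_per_epoch, 4)
print(epoch_at_500)
```

Under these assumptions an epoch is 11,345 steps, so step 500 corresponds to epoch 0.0441, which agrees with the first row of the training logs.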

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
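
The combination of learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 0 describes a learning rate that decays linearly from its initial value to zero. The function below is a minimal sketch of that shape, not the trainer's own scheduler; `linear_lr` is a hypothetical name, and the default of 113,450 total steps is an assumption (11,345 steps per epoch times 10 epochs on a single device).

```python
def linear_lr(step, base_lr=5e-5, total_steps=113_450, warmup_steps=0):
    """Linear warmup (none with these settings) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining

# Learning rate at the start, midpoint, and end of training.
for step in (0, 56_725, 113_450):
    print(step, linear_lr(step))
```

With no warmup, the rate is already at its maximum on the first step and has halved by the midpoint of training.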

Training Logs

Epoch Step Training Loss
0.0441 500 10.4083
0.0881 1000 7.1707
0.1322 1500 5.1666
0.1763 2000 5.2289
0.2204 2500 5.1515
0.2644 3000 5.0737
0.3085 3500 5.0824
0.3526 4000 5.0569
0.3967 4500 5.2043
0.4407 5000 5.1293
0.4848 5500 5.1169
0.5289 6000 5.0743
0.5729 6500 5.1535
0.6170 7000 5.2208
0.6611 7500 5.1847
0.7052 8000 5.1545
0.7492 8500 5.0944
0.7933 9000 5.0662
0.8374 9500 5.1083
0.8814 10000 5.102
0.9255 10500 5.2018
0.9696 11000 5.1479
1.0137 11500 5.1877
1.0577 12000 5.2397
1.1018 12500 5.1657
1.1459 13000 5.2276
1.1900 13500 5.2295
1.2340 14000 5.1096
1.2781 14500 5.0877
1.3222 15000 5.0776
1.3662 15500 5.0957
1.4103 16000 5.1655
1.4544 16500 5.0537
1.4985 17000 5.0836
1.5425 17500 5.1495
1.5866 18000 5.0776
1.6307 18500 5.1215
1.6747 19000 5.1614
1.7188 19500 5.0195
1.7629 20000 5.0855
1.8070 20500 5.2153
1.8510 21000 5.1569
1.8951 21500 5.1213
1.9392 22000 5.1694
1.9833 22500 5.1414
2.0273 23000 5.1314
2.0714 23500 5.2454
2.1155 24000 5.1414
2.1595 24500 5.1194
2.2036 25000 5.1115
2.2477 25500 5.0393
2.2918 26000 5.1154
2.3358 26500 5.1414
2.3799 27000 5.0794
2.4240 27500 5.1013
2.4680 28000 5.1873
2.5121 28500 5.1034
2.5562 29000 5.1593
2.6003 29500 5.1933
2.6443 30000 5.2192
2.6884 30500 5.1233
2.7325 31000 5.1033
2.7766 31500 5.1408
2.8206 32000 5.1233
2.8647 32500 5.1153
2.9088 33000 5.0795
2.9528 33500 5.1792
2.9969 34000 5.1233
3.0410 34500 5.0852
3.0851 35000 5.0653
3.1291 35500 5.0973
3.1732 36000 5.1212
3.2173 36500 5.1046
3.2613 37000 5.1193
3.3054 37500 5.1632
3.3495 38000 5.1932
3.3936 38500 5.0894
3.4376 39000 5.1493
3.4817 39500 5.1812
3.5258 40000 5.1612
3.5699 40500 5.0972
3.6139 41000 5.0872
3.6580 41500 5.2375
3.7021 42000 5.1652
3.7461 42500 5.1273
3.7902 43000 5.0695
3.8343 43500 5.1557
3.8784 44000 5.0697
3.9224 44500 5.1635
3.9665 45000 5.2095
4.0106 45500 5.0936
4.0546 46000 5.1435
4.0987 46500 5.1635
4.1428 47000 5.1415
4.1869 47500 5.1634
4.2309 48000 5.1595
4.2750 48500 5.0775
4.3191 49000 5.0976
4.3632 49500 5.1135
4.4072 50000 5.0877
4.4513 50500 5.1934
4.4954 51000 5.0915
4.5394 51500 5.1295
4.5835 52000 5.0994
4.6276 52500 5.1235
4.6717 53000 5.1435
4.7157 53500 5.1335
4.7598 54000 5.1735
4.8039 54500 5.0914
4.8480 55000 5.1673
4.8920 55500 5.1834
4.9361 56000 5.1434
4.9802 56500 5.1713
5.0242 57000 5.1034
5.0683 57500 5.0615
5.1124 58000 5.0835
5.1565 58500 5.0654
5.2005 59000 5.1813
5.2446 59500 5.2133
5.2887 60000 5.2273
5.3327 60500 5.1313
5.3768 61000 5.1415
5.4209 61500 5.2432
5.4650 62000 5.0854
5.5090 62500 5.1415
5.5531 63000 5.0834
5.5972 63500 5.1294
5.6413 64000 5.1234
5.6853 64500 5.1913
5.7294 65000 5.1793
5.7735 65500 5.0614
5.8175 66000 5.1014
5.8616 66500 5.0354
5.9057 67000 5.1913
5.9498 67500 5.0614
5.9938 68000 5.2453
6.0379 68500 5.0913
6.0820 69000 5.1135
6.1260 69500 5.1374
6.1701 70000 5.0933
6.2142 70500 5.0134
6.2583 71000 5.0435
6.3023 71500 5.2412
6.3464 72000 5.0754
6.3905 72500 5.1933
6.4346 73000 5.2932
6.4786 73500 5.0515
6.5227 74000 5.2073
6.5668 74500 5.1312
6.6108 75000 5.1332
6.6549 75500 5.0615
6.6990 76000 5.1873
6.7431 76500 5.0654
6.7871 77000 5.1333
6.8312 77500 5.1593
6.8753 78000 5.1234
6.9193 78500 5.2012
6.9634 79000 5.1513
7.0075 79500 5.1153
7.0516 80000 5.0414
7.0956 80500 5.2034
7.1397 81000 5.1073
7.1838 81500 5.1753
7.2279 82000 5.1233
7.2719 82500 5.1293
7.3160 83000 5.1492
7.3601 83500 5.2392
7.4041 84000 5.0033
7.4482 84500 5.1912
7.4923 85000 5.1673
7.5364 85500 5.1932
7.5804 86000 5.1393
7.6245 86500 5.1653
7.6686 87000 5.1232
7.7126 87500 5.0993
7.7567 88000 5.1313
7.8008 88500 5.0695
7.8449 89000 5.1493
7.8889 89500 5.1733
7.9330 90000 5.1233
7.9771 90500 5.1292
8.0212 91000 5.0254
8.0652 91500 5.2092
8.1093 92000 5.1313
8.1534 92500 5.0834
8.1974 93000 5.1032
8.2415 93500 5.0653
8.2856 94000 5.1832
8.3297 94500 5.1752
8.3737 95000 5.0752
8.4178 95500 5.1593
8.4619 96000 5.1633
8.5059 96500 5.0794
8.5500 97000 5.1991
8.5941 97500 5.0952
8.6382 98000 5.1153
8.6822 98500 5.1192
8.7263 99000 5.1053
8.7704 99500 5.2213
8.8145 100000 5.1571
8.8585 100500 5.1292
8.9026 101000 5.1531
8.9467 101500 5.2251
8.9907 102000 5.0733
9.0348 102500 5.1313
9.0789 103000 5.1973
9.1230 103500 5.1133
9.1670 104000 5.1813
9.2111 104500 5.2472
9.2552 105000 5.1352
9.2993 105500 5.0373
9.3433 106000 5.1791
9.3874 106500 5.1891
9.4315 107000 5.0832
9.4755 107500 5.1632
9.5196 108000 5.1012
9.5637 108500 5.0354
9.6078 109000 5.2112
9.6518 109500 5.1212
9.6959 110000 5.1212
9.7400 110500 5.1612
9.7840 111000 5.1613
9.8281 111500 5.1072
9.8722 112000 5.0852
9.9163 112500 5.0234
9.9603 113000 5.1033
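
The logged loss falls quickly over the first 1,500 steps and then stays between roughly 5.0 and 5.3 for the rest of training. The sketch below shows one way to summarize that plateau; it uses only a handful of rows copied from the table (the first four and the last three), so the printed statistics describe that subset rather than the full log.

```python
import statistics

# (step, training loss) rows copied from the start and end of the table.
rows = [
    (500, 10.4083), (1000, 7.1707), (1500, 5.1666), (2000, 5.2289),
    (112000, 5.0852), (112500, 5.0234), (113000, 5.1033),
]

# Treat everything from step 1500 onward as the plateau (an illustrative cutoff).
plateau = [loss for step, loss in rows if step >= 1500]
plateau_mean = statistics.mean(plateau)
plateau_spread = max(plateau) - min(plateau)
print(round(plateau_mean, 4), round(plateau_spread, 4))
```

Running the same summary over every row of the table would give a more reliable picture than this subset.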

Framework Versions

  • Python: 3.12.8
  • Sentence Transformers: 3.4.1
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu126
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveTensionLoss

@inproceedings{carlsson2021semantic,
    title={Semantic Re-tuning with Contrastive Tension},
    author={Fredrik Carlsson and Amaru Cuba Gyllensten and Evangelia Gogoulou and Erik Ylip{\"a}{\"a} Hellqvist and Magnus Sahlgren},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=Ov_sMNau-PF}
}

Model size: 0.1B params (Safetensors, tensor type F32)