---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:193
- loss:CosineSimilarityLoss
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
widget:
- source_sentence: I saw someone killing a cat in the street, I felt helpless and sad
  sentences:
  - >-
    There is no god ˹worthy of worship˺ except You. Glory be to You! I have
    certainly done wrong.
  - >-
    who say, when struck by a disaster, Surely to Allah we belong and to
    Him we will ˹all˺ return.
  - >-
    And never think that Allah is unaware of what the wrongdoers do. He only
    delays them for a Day when eyes will stare [in horror]
- source_sentence: I am really sad, I hate my life and I wanna suicide
  sentences:
  - >-
    And never think that Allah is unaware of what the wrongdoers do. He only
    delays them for a Day when eyes will stare [in horror]
  - And when the ignorant address them, they say words of peace
  - >-
    And seek help through patience and prayer. Indeed, it is a burden except
    for the humble
- source_sentence: 'my cousin just died '
  sentences:
  - >-
    who say, when struck by a disaster, Surely to Allah we belong and to
    Him we will ˹all˺ return.
  - >-
    Again, no! Never obey him ˹O Prophet˺! Rather, ˹continue to˺ prostrate
    and draw near ˹to Allah˺.
  - Do not do a favour expecting more ˹in return˺.
- source_sentence: tell me about peace
  sentences:
  - >-
    O mankind, eat from whatever is on earth [that is] lawful and good and
    do not follow the footsteps of Satan. Indeed, he is to you a clear enemy
  - And when the ignorant address them, they say words of peace
  - >-
    And if you divorce them before consummating the marriage but after
    deciding on a dowry, pay half of the dowry, unless the wife graciously
    waives it or the husband graciously pays in full. Graciousness is closer
    to righteousness. And do not forget kindness among yourselves. Surely
    Allah is All-Seeing of what you do.
- source_sentence: I lost my friend, he died and I miss him
  sentences:
  - >-
    Not equal are the good deed and the bad deed. Repel [evil] by that
    [deed] which is better; and thereupon the one whom between you and him
    is enmity [will become] as though he was a devoted friend
  - >-
    Every soul will taste death. And you will only receive your full reward
    on the Day of Judgment. Whoever is spared from the Fire and is admitted
    into Paradise will ˹indeed˺ triumph, whereas the life of this world is
    no more than the delusion of enjoyment.
  - Every soul will taste death, then to Us you will ˹all˺ be returned.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
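The similarity function listed above is plain cosine similarity between embedding vectors. As a minimal stdlib-only sketch of what that computes (the 3-dimensional vectors are toy stand-ins for the real 384-dimensional embeddings, not actual model outputs):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional stand-ins for 384-dimensional sentence embeddings.
u = [1.0, 0.0, 1.0]
v = [1.0, 0.0, 0.0]
print(round(cosine_similarity(u, u), 4))  # 1.0 — identical vectors
print(round(cosine_similarity(u, v), 4))  # 0.7071
```

Because cosine similarity only depends on the angle between vectors, embedding magnitudes do not affect the scores.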
### Model Sources

- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
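The `Pooling` module above has `pooling_mode_mean_tokens` enabled, i.e. the sentence embedding is the average of the token embeddings at non-padding positions. A stdlib-only sketch of that step, with toy 2-dimensional token embeddings in place of real model outputs:

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings over positions where the attention mask is 1."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for emb, mask in zip(token_embeddings, attention_mask):
        if mask == 1:
            for i in range(dim):
                sums[i] += emb[i]
            count += 1
    return [s / count for s in sums]

# Three token positions; the last one is padding (mask 0) and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [99.0, 99.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

The library implements the same idea as a masked sum divided by the mask count, vectorized in PyTorch.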
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'I lost my friend, he died and I miss him',
    'Every soul will taste death. And you will only receive your full reward on the Day of Judgment. Whoever is spared from the Fire and is admitted into Paradise will ˹indeed˺ triumph, whereas the life of this world is no more than the delusion of enjoyment.',
    'Every soul will taste death, then to Us you will ˹all˺ be returned.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9072, 0.9224],
#         [0.9072, 1.0000, 0.9847],
#         [0.9224, 0.9847, 1.0000]])
```
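Semantic search with this model amounts to encoding a query and ranking corpus sentences by cosine similarity to it. A stdlib-only sketch of that ranking step, with toy 2-dimensional vectors standing in for `model.encode` outputs:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def rank(query_emb, corpus_embs):
    """Return corpus indices sorted by descending cosine similarity to the query."""
    scores = [(cosine(query_emb, emb), i) for i, emb in enumerate(corpus_embs)]
    return [i for _, i in sorted(scores, reverse=True)]

# Toy embeddings standing in for encoded verses; index 1 points in
# nearly the same direction as the query, so it should rank first.
query = [0.9, 0.1]
corpus = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
print(rank(query, corpus))  # [1, 2, 0]
```

In practice the library's `util.semantic_search` does this ranking in batches over tensors, but the ordering it produces is the same.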
## Training Details

### Training Dataset

#### Unnamed Dataset

- Size: 193 training samples
- Columns: `sentence_0`, `sentence_1`, and `label`
- Approximate statistics based on the first 193 samples:
  |         | sentence_0 | sentence_1 | label |
  |:--------|:-----------|:-----------|:------|
  | type    | string     | string     | float |
  | details | <ul><li>min: 5 tokens</li><li>mean: 12.27 tokens</li><li>max: 34 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 39.33 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.9</li><li>max: 1.0</li></ul> |
- Samples:

  | sentence_0 | sentence_1 | label |
  |:-----------|:-----------|:------|
  | <code>I am afraid that my son is not in the right way</code> | <code>And those who say: Our Lord! Grant us comfort in our spouses and our offspring, and make us leaders of the righteous</code> | <code>1.0</code> |
  | <code>my cat just died</code> | <code>And We will surely test you with something of fear and hunger and a loss of wealth and lives and fruits, but give good tidings to the patient</code> | <code>1.0</code> |
  | <code>I do not have childre</code> | <code>And those who say: Our Lord! Grant us comfort in our spouses and our offspring, and make us leaders of the righteous</code> | <code>1.0</code> |
- Loss: `CosineSimilarityLoss` with these parameters:

  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```
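`CosineSimilarityLoss` computes the cosine similarity of each `(sentence_0, sentence_1)` pair and regresses it toward `label` with the configured `loss_fct` (here `MSELoss`). A stdlib-only sketch of that objective for one toy batch; this is an illustration of the formula, not the library implementation:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def cosine_similarity_loss(pairs, labels):
    """Mean squared error between pairwise cosine similarities and target labels."""
    preds = [cosine(a, b) for a, b in pairs]
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(labels)

# Two toy pairs: identical vectors (target 1.0) and orthogonal vectors (target 0.0).
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
labels = [1.0, 0.0]
print(cosine_similarity_loss(pairs, labels))  # 0.0 — predictions already match targets
```

During training the gradient of this loss pushes embeddings of pairs labeled 1.0 closer together and pairs labeled 0.0 apart.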
### Training Hyperparameters

#### Non-Default Hyperparameters

- `num_train_epochs`: 10
- `multi_dataset_batch_sampler`: round_robin
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 10
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
### Framework Versions
- Python: 3.12.7
- Sentence Transformers: 5.1.1
- Transformers: 4.57.1
- PyTorch: 2.5.1
- Accelerate: 1.11.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```