This is a sentence-transformers model fine-tuned from yoriis/NAMAA-retriever-contrastive-2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
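The Pooling module is configured with `pooling_mode_mean_tokens: True`, i.e. a sentence embedding is the average of the transformer's token embeddings, with padding positions masked out. A minimal NumPy sketch of that step (toy shapes and names; illustrative, not the library's actual implementation):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, hidden) for one sentence
    attention_mask:   (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(token_embeddings.dtype)  # broadcast over hidden dim
    summed = (token_embeddings * mask).sum(axis=0)
    count = np.clip(mask.sum(axis=0), 1e-9, None)  # avoid division by zero
    return summed / count

# Toy example: 4 tokens, hidden size 3, last position is padding.
tokens = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0],
                   [2.0, 2.0, 2.0],
                   [9.0, 9.0, 9.0]])  # padding row, must be ignored
mask = np.array([1, 1, 1, 0])
print(mean_pool(tokens, mask))  # mean of the first three rows -> [2. 2. 2.]
```

Mean pooling (rather than the CLS token) is what makes the 768-dimensional output reflect the whole sequence.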
First install the Sentence Transformers library:
```shell
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yoriis/NAMAA-retriever-contrastive-final")
# Run inference
sentences = [
    'ما الأحوال التي يسقط فيها استقبال القبلة؟',
    'أحسب الناس أن يتركوا أن يقولوا آمنا وهم لا يفتنون{2} ولقد فتنا الذين من قبلهم فليعلمن الله الذين صدقوا وليعلمن الكاذبين{3} العنكبوت',
    'وقفينا على آثارهم بعيسى ابن مريم مصدقا لما بين يديه من التوراة وآتيناه الإنجيل فيه هدى ونور ومصدقا لما بين يديه من التوراة وهدى وموعظة للمتقين {46}المائدة',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
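`model.similarity` scores every pair of embeddings, using cosine similarity by default, which amounts to L2-normalising the rows and taking a dot product. A self-contained sketch of that computation on dummy vectors (illustrative, not the library's code):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    # L2-normalise each row; the matrix product then yields all pairwise cosines.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalised = embeddings / np.clip(norms, 1e-12, None)
    return normalised @ normalised.T

# Dummy 2-dimensional "embeddings" standing in for the model's 768-dim vectors.
emb = np.array([[1.0, 0.0],
                [0.0, 2.0],
                [1.0, 1.0]])
sim = cosine_similarity_matrix(emb)
print(sim.shape)           # (3, 3)
print(round(sim[0, 1], 4)) # orthogonal vectors -> 0.0
```

The diagonal is always 1.0 (each vector compared with itself), mirroring the 3x3 matrix the snippet above prints.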
Each training sample has three columns, sentence_0, sentence_1, and label:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| sentence_0 | sentence_1 | label |
|---|---|---|
| متى تكون التوبة غير مقبولة ؟ | عن أنس رضي الله عنه أن رسول الله ﷺ قال: (إذا قُدِّم العَشَاءُ فابدؤوا به قبل أن تصلّوا المغربَ). متفق عليه | 0.0 |
| ما حكم قول حي على خير العمل في الأذان ؟ | جَابِرٍ رضي الله عنه، عَنِ النَّبِيِّ ﷺ قَالَ: «إِذَا كَانَ جُنْحُ اللَّيْلِ، أَوْ أَمْسَيْتُمْ، فَكُفُّوا صِبْيَانَكُمْ، فَإِنَّ الشَّيَاطِينَ تَنْتَشِرُ حِينَئِذٍ، فَإِذَا ذَهَبَ سَاعَةٌ مِنَ اللَّيْلِ فَحُلُّوهُمْ، فَأَغْلِقُوا الأَبْوَابَ وَاذْكُرُوا اسْمَ الله، فَإِنَّ الشَّيْطَانَ لاَ يَفْتَحُ بَابًا مُغْلَقًا، وَأَوْكُوا قِرَبَكُمْ وَاذْكُرُوا اسْمَ الله، وَخَمِّرُوا آنِيَتَكُمْ وَاذْكُرُوا اسْمَ الله، وَلَوْ أَنْ تَعْرُضُوا عَلَيْهَا شَيْئًا، وَأَطْفِئُوا مَصَابِيحَكُمْ». رواه البخاري (5623)، ومسلم (2012). | 0.0 |
| من هو آخر الأنبياء ؟ | حديث عَائِشَةَ رضي الله عنها، قَالَتْ: أَقْبَلَتْ فَاطِمَةُ تَمْشِي كَأَنَّ مِشْيَتَهَا مَشْيُ النَّبِيِّ ﷺ، فَقَالَ النَّبِيُّ ﷺ: «أَمَا تَرْضَيْنَ أَنْ تَكُونِي سَيِّدَةَ نِسَاءِ أَهْلِ الجَنَّةِ». رواه البخاري (3624)، ومسلم (2450) | 0.0 |
The model was trained with ContrastiveLoss with these parameters:

```json
{
    "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
    "margin": 0.5,
    "size_average": true
}
```
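With `COSINE_DISTANCE`, the distance between a pair is 1 − cosine similarity, and the loss follows Hadsell et al. (2006): positive pairs (label 1) are pulled to distance 0, while negative pairs (label 0) are only penalised until they exceed the margin. A sketch of that standard formula applied to precomputed distances (an illustration of the cited loss, not sentence-transformers' internals):

```python
import numpy as np

def contrastive_loss(distances, labels, margin=0.5, size_average=True):
    """Contrastive loss of Hadsell et al. (2006).

    distances: pairwise distances (here: 1 - cosine similarity)
    labels:    1.0 for similar pairs, 0.0 for dissimilar pairs
    """
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(labels, dtype=float)
    losses = 0.5 * (labels * distances ** 2
                    + (1.0 - labels) * np.maximum(margin - distances, 0.0) ** 2)
    # size_average: True -> mean over the batch, False -> sum
    return losses.mean() if size_average else losses.sum()

# A positive pair at distance 0.1 and a negative pair already past the margin.
print(contrastive_loss([0.1, 0.9], [1.0, 0.0]))  # -> 0.0025
```

Note how the negative pair contributes zero loss: its distance (0.9) already exceeds the margin (0.5), which is exactly the behaviour the margin parameter controls.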
Non-default training hyperparameters:

- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- fp16: True
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

```
overwrite_output_dir: False
do_predict: False
eval_strategy: no
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 3
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
```

Training logs:

| Epoch | Step | Training Loss |
|---|---|---|
| 0.5931 | 500 | 0.0135 |
| 1.1862 | 1000 | 0.0094 |
| 1.7794 | 1500 | 0.0063 |
| 2.3725 | 2000 | 0.0045 |
| 2.9656 | 2500 | 0.0036 |
| 0.8347 | 500 | 0.0081 |
| 1.6694 | 1000 | 0.0039 |
| 2.5042 | 1500 | 0.0025 |
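The two different epoch scales in the log (steps 500 reached at epoch 0.5931 and again at epoch 0.8347) suggest two separate logged runs over datasets of different sizes. As a rough consistency check (an estimate derived from the log, not a reported figure), the epoch/step ratios combined with the batch size of 16 imply the approximate number of training pairs in each run:

```python
# Rough dataset-size estimate from the logged epoch/step ratios.
batch_size = 16  # per_device_train_batch_size from the hyperparameters above

for epoch, step in [(0.5931, 500), (0.8347, 500)]:
    steps_per_epoch = step / epoch
    approx_samples = steps_per_epoch * batch_size
    print(f"epoch {epoch} at step {step}: ~{approx_samples:.0f} samples")
```

This is only a sanity check on the logs; the exact dataset sizes are not stated in the card.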
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

```bibtex
@inproceedings{hadsell2006dimensionality,
    author = {Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle = {2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title = {Dimensionality Reduction by Learning an Invariant Mapping},
    year = {2006},
    volume = {2},
    pages = {1735-1742},
    doi = {10.1109/CVPR.2006.100}
}
```
Base model: NAMAA-Space/AraModernBert-Base-V1.0