metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:262023
- loss:MultipleNegativesRankingLoss
base_model: intfloat/e5-base-v2
widget:
- source_sentence: 'query: Sin in the Bible'
sentences:
- >-
passage: but each person is tempted when they are dragged away by their
own evil desire and enticed.
- >-
passage: If they want to inquire about something, they should ask their
own husbands at home; for it is disgraceful for a woman to speak in the
church.
- >-
passage: The crowds that went ahead of him and those that followed
shouted,
“Hosanna to the Son of David!”
“Blessed is he who comes in the name of the Lord!”
“Hosanna in the highest heaven!”
- source_sentence: 'query: Naphtali in the Bible'
sentences:
- |-
passage: About Naphtali he said:
“Naphtali is abounding with the favor of the Lord
and is full of his blessing;
he will inherit southward to the lake.”
- |-
passage: You have enlarged the nation, Lord;
you have enlarged the nation.
You have gained glory for yourself;
you have extended all the borders of the land.
- >-
passage: For Herod himself had given orders to have John arrested, and
he had him bound and put in prison. He did this because of Herodias, his
brother Philip’s wife, whom he had married.
- source_sentence: 'query: Ten Commandments Given in the Bible'
sentences:
- >-
passage: As they were shouting and throwing off their cloaks and
flinging dust into the air,
- >-
passage: On the first day of the third month after the Israelites left
Egypt—on that very day—they came to the Desert of Sinai.
- |-
passage: Blessed are the meek,
for they will inherit the earth.
- source_sentence: >-
query: But Nahash the Ammonite replied, “I will make a treaty with you
only on the condition that I gouge out the right eye of every one of you
and so bring disgrace on all Israel.”
sentences:
- >-
passage: A certain man in Maon, who had property there at Carmel, was
very wealthy. He had a thousand goats and three thousand sheep, which he
was shearing in Carmel.
- >-
passage: Nor did Asher drive out those living in Akko or Sidon or Ahlab
or Akzib or Helbah or Aphek or Rehob.
- >-
passage: The elders of Jabesh said to him, “Give us seven days so we can
send messengers throughout Israel; if no one comes to rescue us, we will
surrender to you.”
- source_sentence: 'query: who is Ephraim'
sentences:
- |-
passage: Those who guide this people mislead them,
and those who are guided are led astray.
- |-
passage: No longer will they teach their neighbor,
or say to one another, ‘Know the Lord,’
because they will all know me,
from the least of them to the greatest.
- >-
passage: But a man of God came to him and said, “Your Majesty, these
troops from Israel must not march with you, for the Lord is not with
Israel—not with any of the people of Ephraim.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on intfloat/e5-base-v2
This is a sentence-transformers model fine-tuned from intfloat/e5-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: intfloat/e5-base-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
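The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors use 2 dimensions instead of 768. It only shows the idea of averaging non-padding token vectors and then scaling the result to unit length.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask value 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [value / max(count, 1) for value in total]


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


# Toy input: three token positions, the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the output vectors have unit length, the cosine similarity between two embeddings equals their dot product.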
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'query: who is Ephraim',
    'passage: But a man of God came to him and said, “Your Majesty, these troops from Israel must not march with you, for the Lord is not with Israel—not with any of the people of Ephraim.',
    'passage: Those who guide this people mislead them,\n and those who are guided are led astray.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7104, 0.2667],
#         [0.7104, 1.0000, 0.3225],
#         [0.2667, 0.3225, 1.0000]])
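The example inputs carry `query: ` and `passage: ` prefixes, as do the widget examples and training samples in this card. The sketch below shows one way to add those prefixes and rank passages by score. The helpers `as_query`, `as_passage`, and `rank_passages` are hypothetical names, not part of the library, and the scores are copied from the first row of the similarity matrix above rather than computed here.

```python
def as_query(text):
    """Prefix a search query the way the examples in this card do."""
    return f"query: {text}"


def as_passage(text):
    """Prefix a candidate passage the way the examples in this card do."""
    return f"passage: {text}"


def rank_passages(passages, scores):
    """Return (passage, score) pairs sorted from most to least similar."""
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


query = as_query("who is Ephraim")
passages = [
    as_passage("Those who guide this people mislead them,\n and those who are guided are led astray."),
    as_passage("But a man of God came to him and said, “Your Majesty, these troops from Israel must not march with you, for the Lord is not with Israel—not with any of the people of Ephraim."),
]

# Scores taken from the example output above (query vs. each passage).
# With a loaded model they could instead come from something like
# model.similarity(query_embedding, passage_embeddings)[0].tolist().
scores = [0.2667, 0.7104]

ranked = rank_passages(passages, scores)
for passage, score in ranked:
    print(f"{score:.4f}  {passage[:60]}")
```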
Training Details
Training Dataset
Unnamed Dataset
- Size: 262,023 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 5 tokens; mean: 26.25 tokens; max: 256 tokens | min: 10 tokens; mean: 35.61 tokens; max: 94 tokens | min: 1.0; mean: 1.0; max: 1.0 |

- Samples:
  - Sample 1
    - sentence_0: query: God: (A.S. and Dutch God; Dan. Gud; Ger. Gott), the name of the Divine Being. It is the rendering (1) of the Hebrew 'El, from a word meaning to be strong; (2) of 'Eloah, plural 'Elohim. The singular form, Eloah, is used only in poetry. The plural form is more commonly used in all parts of the Bible, The Hebrew word Jehovah (q.v.), the only other word generally employed to denote the Supreme Being, is uniformly rendered in the Authorized Version by "LORD," printed in small capitals. The existence of God is taken for granted in the Bible. There is nowhere any argument to prove it. He who disbelieves this truth is spoken of as one devoid of understanding (Psalms 14:1). The arguments generally adduced by theologians in proof of the being of God are:- The a priori argument, which is the testimony afforded by reason. The a posteriori argument, by which we proceed logically from the facts of experience to causes. These arguments are, ...
    - sentence_1: passage: But the Lord forbid that I should lay a hand on the Lord’s anointed. Now get the spear and water jug that are near his head, and let’s go.”
    - label: 1.0
  - Sample 2
    - sentence_0: query: Predestination meaning
    - sentence_1: passage: From one man he made all the nations, that they should inhabit the whole earth; and he marked out their appointed times in history and the boundaries of their lands.
    - label: 1.0
  - Sample 3
    - sentence_0: query: Noah and God
    - sentence_1: passage: And Noah did all that the Lord commanded him.
    - label: 1.0
- Loss: MultipleNegativesRankingLoss with these parameters:
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
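To make the loss parameters above concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective with `scale = 20.0` and cosine similarity. It is an illustration under simplifying assumptions, not the library's code: the function names are hypothetical, the vectors are toy 2-dimensional inputs, and each anchor simply treats every other pair's passage in the batch as a negative.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own passage
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other passage

print(mnr_loss(anchors, aligned))  # very small: correct pairs score highest
print(mnr_loss(anchors, swapped))  # large: the wrong pairs score highest
```

The scale factor sharpens the softmax: with cosine values bounded in [-1, 1], multiplying by 20 lets a well-separated batch drive the loss close to zero.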
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 1
- max_steps: 200
- multi_dataset_batch_sampler: round_robin
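Since a positive `max_steps` takes precedence over `num_train_epochs` in the Transformers trainer, the run covers only part of the 262,023 training samples. The arithmetic below is a rough estimate that assumes a single training device (an assumption, not stated in this card) together with the `gradient_accumulation_steps: 1` value from the full hyperparameter list.

```python
per_device_train_batch_size = 32
max_steps = 200
dataset_size = 262_023
gradient_accumulation_steps = 1  # from the full hyperparameter list
num_devices = 1                  # assumption: single-device training

samples_seen = (
    per_device_train_batch_size
    * gradient_accumulation_steps
    * num_devices
    * max_steps
)
fraction_of_dataset = samples_seen / dataset_size

print(samples_seen)                  # 6400
print(f"{fraction_of_dataset:.2%}")  # 2.44%
```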
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: 200
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
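The settings `learning_rate: 5e-05`, `lr_scheduler_type: linear`, `warmup_steps: 0`, and `max_steps: 200` imply a learning rate that decays in a straight line from its initial value to zero. The sketch below reproduces that shape in plain Python. `linear_lr` is a hypothetical helper written for illustration, not a function from Transformers, and it ignores optimizer details such as per-parameter-group settings.

```python
def linear_lr(step, base_lr=5e-05, total_steps=200, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused here since warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 50, 100, 150, 200):
    print(step, linear_lr(step))
```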
Framework Versions
- Python: 3.11.14
- Sentence Transformers: 5.2.0
- Transformers: 4.57.6
- PyTorch: 2.10.0+cpu
- Accelerate: 1.12.0
- Datasets: 4.5.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}