hf-e5-bible-100 / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:262023
  - loss:MultipleNegativesRankingLoss
base_model: intfloat/e5-base-v2
widget:
  - source_sentence: >-
      query: Handkerchief: Only once in Authorized Version (Acts 19:12). The
      Greek word (sudarion) so rendered means properly “a sweat-cloth.” It is
      rendered “napkin” in John 11:44; 20:7; Luke 19:20.
    sentences:
      - >-
        passage: as well as the cloth that had been wrapped around Jesus’ head.
        The cloth was still lying in its place, separate from the linen.
      - >-
        passage: “On that day I will make the clans of Judah like a firepot in a
        woodpile, like a flaming torch among sheaves. They will consume all the
        surrounding peoples right and left, but Jerusalem will remain intact in
        her place.
      - >-
        passage: and the borders of Canaan reached from Sidon toward Gerar as
        far as Gaza, and then toward Sodom, Gomorrah, Admah and Zeboyim, as far
        as Lasha.
  - source_sentence: 'query: what happened to Job'
    sentences:
      - |-
        passage: Remember, O God, that my life is but a breath;
            my eyes will never see happiness again.
      - >-
        passage: So he prepared a great feast for them, and after they had
        finished eating and drinking, he sent them away, and they returned to
        their master. So the bands from Aram stopped raiding Israel’s territory.
      - 'passage: of Ater (through Hezekiah) 98'
  - source_sentence: 'query: what happened to Jesus'
    sentences:
      - >-
        passage: The Lord wrote on these tablets what he had written before, the
        Ten Commandments he had proclaimed to you on the mountain, out of the
        fire, on the day of the assembly. And the Lord gave them to me.
      - >-
        passage: “Make a tree good and its fruit will be good, or make a tree
        bad and its fruit will be bad, for a tree is recognized by its fruit.
      - >-
        passage: So Joshua and his whole army came against them suddenly at the
        Waters of Merom and attacked them,
  - source_sentence: 'query: what is Games'
    sentences:
      - >-
        passage: In Hebron he reigned over Judah seven years and six months, and
        in Jerusalem he reigned over all Israel and Judah thirty-three years.
      - >-
        passage: Their surrounding villages were Etam, Ain, Rimmon, Token and
        Ashan—five towns—
      - >-
        passage: Fight the good fight of the faith. Take hold of the eternal
        life to which you were called when you made your good confession in the
        presence of many witnesses.
  - source_sentence: >-
      query: God:  (A.S. and Dutch God; Dan. Gud; Ger. Gott), the name of the
      Divine Being. It is the rendering (1) of the Hebrew <i> 'El</i> , from a
      word meaning to be strong; (2) of <i> 'Eloah_, plural _'Elohim</i> . The
      singular form, <i> Eloah</i> , is used only in poetry. The plural form is
      more commonly used in all parts of the Bible, The Hebrew word Jehovah
      (q.v.), the only other word generally employed to denote the Supreme
      Being, is uniformly rendered in the Authorized Version by "LORD," printed
      in small capitals. The existence of God is taken for granted in the Bible.
      There is nowhere any argument to prove it. He who disbelieves this truth
      is spoken of as one devoid of understanding (  Psalms 14:1  ).    The
      arguments generally adduced by theologians in proof of the being of God
      are:   <li> The a priori argument, which is the testimony afforded by
      reason.    <li> The a posteriori argument, by which we proceed logically
      from the facts of experience to causes. These arguments are,    (a) The
      cosmological, by which it is proved that there must be a First Cause of
      all things, for every effect must have a cause.   (b) The teleological, or
      the argument from design. We see everywhere the operations of an
      intelligent Cause in nature.   (c) The moral argument, called also the
      anthropological argument, based on the moral consciousness and the history
      of mankind, which exhibits a moral order and purpose which can only be
      explained on the supposition of the existence of God. Conscience and human
      history testify that "verily there is a God that judgeth in the earth."  
      The attributes of God are set forth in order by Moses in   Exodus 34:6  
      Exodus 34:7  . (see also   Deuteronomy 6:4  ;   10:17  ;   Numbers 16:22 
      ;   Exodus 15:11  ;   33:19  ;   Isaiah 44:6  ;   Habakkuk 3:6  ;   Psalms
      102:26  ;   Job 34:12  .) They are also systematically classified in  
      Revelation 5:12   and   7:12  .    God's attributes are spoken of by some
      as absolute, i.e., such as belong to his essence as Jehovah, Jah, etc.;
      and relative, i.e., such as are ascribed to him with relation to his
      creatures. Others distinguish them into communicable, i.e., those which
      can be imparted in degree to his creatures: goodness, holiness, wisdom,
      etc.; and incommunicable, which cannot be so imparted: independence,
      immutability, immensity, and eternity. They are by some also divided into
      natural attributes, eternity, immensity, etc.; and moral, holiness,
      goodness, etc.
    sentences:
      - >-
        passage: Then each man grabbed his opponent by the head and thrust his
        dagger into his opponent’s side, and they fell down together. So that
        place in Gibeon was called Helkath Hazzurim.
      - >-
        passage: and I saw the glory of the God of Israel coming from the east.
        His voice was like the roar of rushing waters, and the land was radiant
        with his glory.
      - |-
        passage: How long, Lord, must I call for help,
            but you do not listen?
        Or cry out to you, “Violence!”
            but you do not save?
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on intfloat/e5-base-v2

This is a sentence-transformers model fine-tuned from intfloat/e5-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/e5-base-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
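
The Pooling and Normalize modules listed above amount to a masked mean over token embeddings followed by L2 normalization. The following is a minimal NumPy sketch of that idea, not the library's internal code; the function name `mean_pool_and_normalize` and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Masked mean over tokens, then L2-normalize each sentence vector."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (B, T, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (B, D)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # (B, 1)
    mean = summed / counts
    norms = np.linalg.norm(mean, axis=1, keepdims=True)
    return mean / np.clip(norms, 1e-12, None)

# Toy input: 2 sentences, 4 token positions, 768 dims.
# The second sentence has two padding positions that must be ignored.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 1], [1, 1, 0, 0]])
sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)  # (2, 768)
```

Because the output vectors have unit length, their dot product equals their cosine similarity, which matches the similarity function listed in the model description.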

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (replace the placeholder ID below with this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    # The first entry is long, so it is split into two adjacent string literals,
    # which Python concatenates into a single string.
    'query: God:  (A.S. and Dutch God; Dan. Gud; Ger. Gott), the name of the Divine Being. It is the rendering (1) of the Hebrew <i> \'El</i> , from a word meaning to be strong; (2) of <i> \'Eloah_, plural _\'Elohim</i> . The singular form, <i> Eloah</i> , is used only in poetry. The plural form is more commonly used in all parts of the Bible, The Hebrew word Jehovah (q.v.), the only other word generally employed to denote the Supreme Being, is uniformly rendered in the Authorized Version by "LORD," printed in small capitals. The existence of God is taken for granted in the Bible. There is nowhere any argument to prove it. He who disbelieves this truth is spoken of as one devoid of understanding (  Psalms 14:1  ).    The arguments generally adduced by theologians in proof of the being of God are:   <li> The a priori argument, which is the testimony afforded by reason.    <li> The a posteriori argument, by which we proceed logically from the facts of experience to causes. These arguments are,    (a) The cosmological, by which it is proved that there must be a First Cause of all things, for every effect must have a cause.   (b) The teleological, or the argument from design. We see everywhere the operations of an intelligent Cause in nature.   (c) The moral argument, called also the anthropological argument, based on the moral consciousness and the history of mankind, which exhibits a moral order and purpose which can only be explained on the supposition of the existence of God. Conscience and human history testify that "verily there is a God that judgeth in the earth."   The attributes of God are set forth in order by Moses in   Exodus 34:6   Exodus 34:7  . (see also   Deuteronomy 6:4  ;   10:17  ;   Numbers 16:22  ;   Exodus 15:11  ;   33:19  ;   Isaiah 44:6  ;   Habakkuk 3:6  ;   Psalms 102:26  ;   Job 34:12  .) They are also systematically classified in   Revelation 5:12   and   7:12  .    '
    'God\'s attributes are spoken of by some as absolute, i.e., such as belong to his essence as Jehovah, Jah, etc.; and relative, i.e., such as are ascribed to him with relation to his creatures. Others distinguish them into communicable, i.e., those which can be imparted in degree to his creatures: goodness, holiness, wisdom, etc.; and incommunicable, which cannot be so imparted: independence, immutability, immensity, and eternity. They are by some also divided into natural attributes, eternity, immensity, etc.; and moral, holiness, goodness, etc.',
    'passage: How long, Lord, must I call for help,\n    but you do not listen?\nOr cry out to you, “Violence!”\n    but you do not save?',
    'passage: Then each man grabbed his opponent by the head and thrust his dagger into his opponent’s side, and they fell down together. So that place in Gibeon was called Helkath Hazzurim.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.4670, 0.3140],
#         [0.4670, 1.0000, 0.4137],
#         [0.3140, 0.4137, 1.0000]])
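
The widget examples and training samples in this card prefix inputs with "query: " and "passage: ", following the E5 convention of the base model. The sketch below shows one way to rank passages for a query under that convention. The names `rank_passages` and `toy_encode` are hypothetical, and `toy_encode` is only a stand-in so the snippet runs without downloading the model.

```python
import numpy as np

def rank_passages(encode_fn, query, passages):
    """Return (passage, score) pairs sorted by cosine similarity to the query."""
    # Apply the E5-style prefixes used in this card's training data.
    texts = [f"query: {query}"] + [f"passage: {p}" for p in passages]
    vectors = np.asarray(encode_fn(texts), dtype=float)
    # Normalizing again is harmless if the encoder already returns unit vectors.
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = vectors[1:] @ vectors[0]
    order = np.argsort(-scores)
    return [(passages[i], float(scores[i])) for i in order]

# Stand-in encoder for demonstration only: letter counts, NOT the real model.
def toy_encode(texts):
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return [[t.lower().count(c) + 1e-6 for c in alphabet] for t in texts]

results = rank_passages(toy_encode, "what happened to Job", ["Job answered the Lord", "xyz"])
for passage, score in results:
    print(f"{score:.3f}  {passage}")
```

With the loaded model, passing `model.encode` as `encode_fn` should work in the same way, since `encode` accepts a list of strings and returns one vector per input.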

Training Details

Training Dataset

Unnamed Dataset

  • Size: 262,023 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | label |
    |---------|------------|------------|-------|
    | type    | string | string | float |
    | details | min: 5 tokens; mean: 27.82 tokens; max: 256 tokens | min: 9 tokens; mean: 35.93 tokens; max: 87 tokens | min: 1.0; mean: 1.0; max: 1.0 |
  • Samples:
    | sentence_0 | sentence_1 | label |
    |------------|------------|-------|
    | query: To those who sold doves he said, “Get these out of here! Stop turning my Father’s house into a market!” | passage: His disciples remembered that it is written: “Zeal for your house will consume me.” | 1.0 |
    | query: Joseph (son of Jacob) | passage: Joseph found favor in his eyes and became his attendant. Potiphar put him in charge of his household, and he entrusted to his care everything he owned. | 1.0 |
    | query: Divination meaning | passage: He sacrificed his children in the fire in the Valley of Ben Hinnom, practiced divination and witchcraft, sought omens, and consulted mediums and spiritists. He did much evil in the eyes of the Lord, arousing his anger. | 1.0 |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
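
To make the loss parameters concrete: with `similarity_fct` set to cosine similarity and `scale` set to 20.0, each query is scored against every passage in the batch, and the loss is a cross-entropy in which the paired passage is the correct class and the other passages act as negatives. The following NumPy sketch illustrates that computation under those assumptions. It is not the library's implementation, and the function name and random inputs are made up for the example.

```python
import numpy as np

def multiple_negatives_ranking_loss(query_emb, passage_emb, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = passage_emb / np.linalg.norm(passage_emb, axis=1, keepdims=True)
    logits = scale * (q @ p.T)                            # (B, B); diagonal = positives
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 8))
# Positives identical to the queries: the diagonal dominates, so the loss is low.
aligned = multiple_negatives_ranking_loss(queries, queries)
# Reversed order: each query's true match sits off the diagonal, so the loss is higher.
shuffled = multiple_negatives_ranking_loss(queries, queries[::-1])
print(aligned < shuffled)
```

In this formulation only the pairing of sentence_0 with sentence_1 matters; the constant label of 1.0 does not enter the computation.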
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 1
  • max_steps: 100
  • multi_dataset_batch_sampler: round_robin
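
In the Transformers trainer, a positive `max_steps` takes precedence over `num_train_epochs`, so this run stops after 100 optimizer steps rather than a full epoch. The short calculation below estimates how much of the dataset that covers. It assumes a single device (an assumption not stated in this card) and uses `gradient_accumulation_steps: 1` from the full hyperparameter list.

```python
import math

dataset_size = 262_023
batch_size = 32      # per_device_train_batch_size
max_steps = 100
num_devices = 1      # assumption: single-device training
grad_accum = 1       # gradient_accumulation_steps

effective_batch = batch_size * num_devices * grad_accum
samples_seen = max_steps * effective_batch
steps_per_epoch = math.ceil(dataset_size / effective_batch)
fraction_of_epoch = samples_seen / dataset_size

print(samples_seen)                # 3200
print(steps_per_epoch)             # 8189
print(f"{fraction_of_epoch:.2%}")  # 1.22%
```

Under these assumptions the model saw roughly 3,200 of the 262,023 training pairs.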

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: 100
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
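
The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, `warmup_steps: 0`, and `max_steps: 100` implies a learning rate that decays linearly from 5e-05 to zero over the run. The function below is a plain-Python sketch of that schedule, written for illustration; `linear_lr` is a hypothetical name, not a library function.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=0, total_steps=100):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Print the rate at a few points across the 100-step run.
for step in (0, 25, 50, 75, 100):
    print(step, linear_lr(step))
```

With no warmup, the rate starts at its full value on the first step, is halved by step 50, and reaches zero at step 100.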

Framework Versions

  • Python: 3.11.14
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.6
  • PyTorch: 2.10.0+cpu
  • Accelerate: 1.12.0
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}