metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:863559
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: intfloat/multilingual-e5-base
widget:
  - source_sentence: >-
      query: ุฃุฎุจุฑู†ูŠ ุงู„ุทุจูŠุจ ุฃู† ุจุฏุงูŠุฉ ุชูƒูˆูŠู† ุงู„ุฌู†ูŠู† ูƒุงู†ุช ู…ู† ู†ูุทู’ููŽุฉู ุตุบูŠุฑุฉ ู…ู† ู…ู†ูŠ
      ุงู„ุฃุจุŒ ูˆู‡ูŠ ุจุฏุงูŠุฉ ุงู„ุฎู„ู‚ ุงู„ุนุฌูŠุจุฉ."
    sentences:
      - >-
        passage: "ูƒุงู†ุช ุงู„ุทู‡ุงุฑุฉู ุฃู…ุฑุงู‹ ู„ุงุจุฏูŽ ู…ู†ู‡ู ููŠ ุงู„ุนุตูˆุฑู ุงู„ู‚ุฏูŠู…ุฉูุŒ ุญูŠุซู
        ุชูุณุชูŽุฎุฏูŽู…ู ุงู„ู…ูŠุงู‡ู ุงู„ู…ุจุงุฑูƒุฉู ู„ุชู†ุธูŠูู ุงู„ุฒูˆุงุฑู ุญูŠู†ูŽ ู‚ุฏูˆู…ูู‡ู… ุฅู„ู‰ ุงู„ู…ุฏูŠู†ุฉู."
      - >-
        passage: "ุฃุธู‡ุฑุช ุงู„ูุญูˆุตุงุช ุฃู† ุงู„ู…ุดูƒู„ุฉ ู„ูŠุณุช ููŠ ูƒู…ูŠุฉ ูƒุจูŠุฑุฉ ู…ู† ุงู„ุณุงุฆู„ุŒ ุจู„ ููŠ
        ู†ู‚ุต ุญูŠูˆูŠุชู‡.
      - >-
        passage: "ุฃูˆุถุญ ู„ูŠ ุงู„ุทุจูŠุจ ุฃู† ู†ุดุฃุฉ ุงู„ุฌู†ูŠู† ุชุจุฏุฃ ู…ู† ู…ู†ูŠ ุงู„ุฃุจุŒ ูˆู‡ูˆ ุณุงุฆู„ ู‚ู„ูŠู„ุŒ
        ุฃุณุงุณ ุงู„ุญูŠุงุฉ ุงู„ู…ุจูƒุฑุฉ."
  - source_sentence: >-
      query: ุงูุดู’ุชูŽู‡ูŽุฑูŽ ุงูŽู„ู’ุญูŽุงูƒูู…ู ุงูŽู„ู’ุฌูŽุฏููŠุฏู ุจูู€ ุงูŽู„ู’ุนูู„ููˆูู‘ ูˆูŽุงู„ุชู‘ูŽุฌูŽุจู‘ูุฑู
      ูููŠ ุชูŽุนูŽุงู…ูู„ูู‡ู ู…ูŽุนูŽ ุงูŽู„ุดู‘ูŽุนู’ุจู ูููŠ ู‚ูŽุตู’ุฑูู‡ู ุงูŽู„ู’ููŽุฎู’ู…ู ุฃูŽู…ู’ุณู ุจูŽุนู’ุฏูŽ
      ุงูู†ู’ุชูุฎูŽุงุจูู‡ู."
    sentences:
      - >-
        passage: "ุงูุดู’ุชูŽู‡ูŽุฑูŽ ุงูŽู„ู’ุญูŽุงูƒูู…ู ุงูŽู„ู’ุฌูŽุฏููŠุฏู ุจูู€ ุงูŽู„ู’ุณู‘ูู…ููˆู‘ู
        ูˆูŽุงู„ุชู‘ูŽุนูŽุงู„ููŠ ูููŠ ู…ูุนูŽุงู…ูŽู„ูŽุชูู‡ู ู„ูู„ุฑู‘ูŽุนููŠู‘ูŽุฉู ุฏูŽุงุฎูู„ูŽ ู‚ูŽุตู’ุฑูู‡ู
        ุงูŽู„ู’ุนูŽุธููŠู…ู ุจูุงู„ู’ุฃูŽู…ู’ุณู ุจูŽุนู’ุฏูŽ ุงูู†ู’ุชูุฎูŽุงุจูู‡ู."
      - >-
        passage: "ุจูŽุฏูŽุง ุงูŽู„ู’ุญูŽุงูƒูู…ู ุงูŽู„ู’ุฌูŽุฏููŠุฏู ุจูู€ ุงูŽู„ู’ู‚ูŽุงุนู ูˆูŽุงูŽู„ุชู‘ูŽูˆูŽุงุถูุนู
        ูููŠ ุชูŽุตูŽุฑู‘ูููŽุงุชูู‡ู ู…ูŽุนูŽ ุงูŽู„ู’ู…ููˆูŽุงุทูู†ููŠู†ูŽ ูููŠ ู‚ูŽุตู’ุฑูู‡ู ุงูŽู„ู’ุจูŽุณููŠุทู
        ุจูุงู„ู’ุฃูŽู…ู’ุณู ุจูŽุนู’ุฏูŽ ุงูู†ู’ุชูุฎูŽุงุจูู‡ู.
      - >-
        passage: "ูƒุงู†ุช ุฑุญู„ุฉ ุงู„ุฃุณุฑุฉ ุจุงู„ู‚ุทุงุฑ ููŠ ุงู„ุตุจุงุญ ุงู„ุจุงูƒุฑ ู†ุญูˆ ุงู„ุฌุจุงู„ ู…ุบุงู…ุฑุฉ
        ุดูŠู‚ุฉ ูˆู…ุจู‡ุฌุฉ."
  - source_sentence: 'query: what continent is ethiopia in'
    sentences:
      - >-
        passage: ุฅุฑูŠุชุฑูŠุง (ุชูู„ูุธ /หŒษ›rแตปหˆtreษช.ษ™/ ุฃูˆ /หŒษ›rแตปหˆtriหษ™/)ุŒ ุฑุณู…ูŠู‹ุง ุฏูˆู„ุฉ
        ุฅุฑูŠุชุฑูŠุงุŒ ู‡ูŠ ุฏูˆู„ุฉ ุชู‚ุน ููŠ ุงู„ู‚ุฑู† ุงู„ุฃูุฑูŠู‚ูŠ. ุนุงุตู…ุชู‡ุง ุฃุณู…ุฑุฉุŒ ูˆุชุญุฏู‡ุง ุงู„ุณูˆุฏุงู† ู…ู†
        ุงู„ุบุฑุจุŒ ูˆุฅุซูŠูˆุจูŠุง ู…ู† ุงู„ุฌู†ูˆุจุŒ ูˆุฌูŠุจูˆุชูŠ ู…ู† ุงู„ุฌู†ูˆุจ ุงู„ุดุฑู‚ูŠ. ุชุชู…ุชุน ุงู„ุฃุฌุฒุงุก
        ุงู„ุดู…ุงู„ูŠุฉ ุงู„ุดุฑู‚ูŠุฉ ูˆุงู„ุดุฑู‚ูŠุฉ ู…ู† ุฅุฑูŠุชุฑูŠุง ุจุณุงุญู„ ุทูˆูŠู„ ุนู„ู‰ ุทูˆู„ ุงู„ุจุญุฑ ุงู„ุฃุญู…ุฑ.
      - >-
        passage: ููŠ ุงู„ุขูˆู†ุฉ ุงู„ุฃุฎูŠุฑุฉุŒ ุงุฌุชุงุญ ุงู„ู…ู†ุทู‚ุฉ ุฃุณูˆุฃ ุฌูุงู ููŠ ุชุงุฑูŠุฎ ุดุฑู‚ ุฅูุฑูŠู‚ูŠุง
        ููŠ ุนุงู… 2011ุŒ ุญูŠุซ ูุดู„ุช ู…ูˆุณู… ุงู„ุฃู…ุทุงุฑ ููŠ ุงู„ุญุฏูˆุซ ู„ู…ุฏุฉ ุนุงู…ูŠู† ู…ุชุชุงู„ูŠูŠู†. ุชุนู…ู„
        ุงู„ุญูƒูˆู…ุฉ ุญุงู„ูŠู‹ุง ุนู„ู‰ ุชุทูˆูŠุฑ ุงู„ุณูŠุงุญุฉ ู„ุฅุซูŠูˆุจูŠุง ู…ู† ุฎู„ุงู„ ุนุฏุฏ ู…ู† ุงู„ู…ุจุงุฏุฑุงุช.
      - >-
        passage: ูŠูุธู‡ุฑ ุฎุฑูŠุทุฉ ู…ูˆู‚ุน ุฅุซูŠูˆุจูŠุง ุงู„ู…ุฑูู‚ุฉ ุฃู† ุฅุซูŠูˆุจูŠุง ุชู‚ุน ููŠ ุงู„ุฌุฒุก ุงู„ุดุฑู‚ูŠ
        ู…ู† ู‚ุงุฑุฉ ุฅูุฑูŠู‚ูŠุง. ูƒู…ุง ูŠูุธู‡ุฑ ุฎุฑูŠุทุฉ ุฅุซูŠูˆุจูŠุง ุฃู† ุงู„ุจู„ุงุฏ ุชู‚ุน ุนู„ู‰ ุงู„ู‚ุฑู†
        ุงู„ุฃูุฑูŠู‚ูŠ ูˆุชุญุฏู‡ุง ุฅุฑูŠุชุฑูŠุง ู…ู† ุงู„ุดู…ุงู„ุŒ ูˆุฌูŠุจูˆุชูŠ ูˆุงู„ุตูˆู…ุงู„ ู…ู† ุงู„ุดุฑู‚ุŒ ูˆูƒูŠู†ูŠุง ู…ู†
        ุงู„ุฌู†ูˆุจุŒ ูˆุงู„ุณูˆุฏุงู† ู…ู† ุงู„ุบุฑุจ.
      - >-
        passage: ุงู†ุชู‡ุช ุญุฑุจ ุญุฏูˆุฏ ู…ุน ุฅุฑูŠุชุฑูŠุง ููŠ ุฃูˆุงุฎุฑ ุงู„ุชุณุนูŠู†ูŠุงุช ุจู…ุนุงู‡ุฏุฉ ุณู„ุงู… ููŠ
        ุฏูŠุณู…ุจุฑ 2000. ุชู… ุชุฃุฌูŠู„ ุงู„ุชุฑุณูŠู… ุงู„ู†ู‡ุงุฆูŠ ู„ู„ุญุฏูˆุฏ ุญุงู„ูŠู‹ุง ุจุณุจุจ ุงุนุชุฑุงุถุงุช
        ุฅุซูŠูˆุจูŠุง ุนู„ู‰ ะฒั‹ะฒะพะด ู„ุฌู†ุฉ ุฏูˆู„ูŠุฉ ุชุชุทู„ุจ ู…ู†ู‡ุง ุงู„ุชุฎู„ูŠ ุนู† ุฃุฑุงุถูŠ ุชุนุชุจุฑ ุญุณุงุณุฉ
        ู„ุฅุซูŠูˆุจูŠุง.
  - source_sentence: 'query: ما هو الهدف من الحياة؟'
    sentences:
      - 'passage: ما هو معنى الحياة بالنسبة لك؟'
      - 'passage: ما هو حلم حياتك؟'
      - 'passage: ما هو الهدف من كل شيء إذا كان كل شيء ينتهي على أي حال؟'
      - 'passage: ما هو الغرض الوحيد من الحياة؟'
  - source_sentence: 'query: رجل يحرق نفسه في محاكمة بريفيك'
    sentences:
      - 'passage: رجل يحرق نفسه خارج محاكمة بريفيك'
      - 'passage: رجل قام برمي نفسه في الهواء'
      - 'passage: رجل يدفئ نفسه بجانب الموقد'
      - 'passage: رجل يحرق نفسه في المجمع التجاري الوطني في واشنطن'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
model-index:
  - name: SentenceTransformer based on intfloat/multilingual-e5-base
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: validation eval
          type: validation_eval
        metrics:
          - type: cosine_accuracy
            value: 0.967090904712677
            name: Cosine Accuracy

SentenceTransformer based on intfloat/multilingual-e5-base

This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-base on the multi_negative and triplets datasets. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Datasets:
    • multi_negative
    • triplets

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
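
The Pooling and Normalize modules listed above can be illustrated with a small standalone sketch. It uses plain Python lists and toy 2-dimensional token vectors instead of real 768-dimensional model outputs; the function names `mean_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length, which makes dot product equal cosine similarity."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one being padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final module normalizes every embedding, cosine similarity and dot product give the same ranking for this model's outputs.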

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("TawasulAI/Faheem-mE5_Base_new_data")
# Run inference
sentences = [
    'query: رجل يحرق نفسه في محاكمة بريفيك',
    'passage: رجل يحرق نفسه خارج محاكمة بريفيك',
    'passage: رجل قام برمي نفسه في الهواء',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.9671
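
As a rough illustration of what cosine_accuracy measures on triplets: it is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below assumes a zero margin and uses hand-made toy vectors, not the validation data; the helper names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_accuracy(triplets):
    """Share of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)

toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted as correct
    ([1.0, 0.0], [0.1, 0.9], [0.8, 0.2]),  # negative is closer: counted as wrong
]
toy_accuracy = cosine_accuracy(toy_triplets)
print(toy_accuracy)
```

Under this reading, the reported 0.9671 means roughly 96.7% of validation triplets are ordered correctly.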

Training Details

Training Datasets

multi_negative

  • Dataset: multi_negative
  • Size: 491,698 training samples
  • Columns: query, positive, negative_1, negative_2, and negative_3
  • Approximate statistics based on the first 1000 samples:
    column      type    min       mean          max
    query       string  8 tokens  24.34 tokens  512 tokens
    positive    string  7 tokens  53.93 tokens  512 tokens
    negative_1  string  6 tokens  52.46 tokens  512 tokens
    negative_2  string  7 tokens  51.96 tokens  512 tokens
    negative_3  string  7 tokens  52.59 tokens  512 tokens
  • Samples:
    query positive negative_1 negative_2 negative_3
    query: ู…ุง ู‡ูŠ ุจุนุถ ุงู„ุฃูู„ุงู… ู…ู† ุฌู…ูŠุน ุฃู†ุญุงุก ุงู„ุนุงู„ู… ุงู„ุชูŠ ุชุญุชูˆูŠ ุนู„ู‰ ู…ุดุงู‡ุฏ ุนุงุฑูŠุฉ (ุจุงุณุชุซู†ุงุก ุงู„ุฅุจุงุญูŠุฉ) ุŸ passage: ู…ุง ู‡ูŠ ุจุนุถ ุงู„ุฃูู„ุงู… ู…ู† ุฌู…ูŠุน ุฃู†ุญุงุก ุงู„ุนุงู„ู… ุงู„ุชูŠ ุชุญุชูˆูŠ ุนู„ู‰ ุจุนุถ ู…ู† ุฃูุถู„ ุงู„ู…ุดุงู‡ุฏ ุงู„ุนุงุฑูŠุฉ (ุจุงุณุชุซู†ุงุก ุงู„ุฅุจุงุญูŠุฉ) ุŸ passage: ู…ุง ู‡ูˆ ุงู„ููŠู„ู… ุงู„ุฐูŠ ูŠุญุชูˆูŠ ุนู„ู‰ ุฃูƒุซุฑ ุงู„ู…ุดุงู‡ุฏ ุงู„ุนุงุฑูŠุฉุŸ passage: ู‡ู„ ู…ู† ู…ุดุงู‡ุฏ ุฌู†ุณ ููŠ ุงู„ุฃูู„ุงู… (ุบูŠุฑ ุงู„ุฅุจุงุญูŠุฉ) ูƒุงู† ููŠู‡ุง ุงู„ู…ู…ุซู„ูŠู† ูŠู…ุงุฑุณูˆู† ุงู„ุฌู†ุณุŸ passage: ู…ุง ู‡ูŠ ุจุนุถ ู…ู† ุฃูุถู„ ุงู„ุฃูู„ุงู… ุงู„ุฌู†ุณูŠุฉ (ุฃูŠ ู„ุบุฉ) ุŸ
    query: ู…ุง ู‡ูˆ ุงู„ุฏู…ุงุบ ุงู„ุจูŠู†ูŠุŸ passage: Top 10 amazing movie makeup transformations. The diencephalon, also known as the interbrain or betweenbrain, is one of the major areas of the brain, along with the brainstem, cerebellum, and cerebrum. passage: These 10 animal facts will amaze you. The diencephalon, also known as the interbrain or betweenbrain, is one of the major areas of the brain, along with the brainstem, cerebellum, and cerebrum. This structure in the brain contains a number of smaller components of the brain which perform a variety of roles to keep the body functioning. passage: 1 The diencephalon is made up of four main components: the thalamus, the subthalamus, the hypothalamus, and the epithalamus. The hypothalamus is an integral part of the endocrine system, with one of the most important functions being to link the nervous system to the endocrine system via the pituitary gland. passage: Meronyms (parts of diencephalon): corpus mamillare; mamillary body; mammillary body (one of two small round structures on the undersurface of the brain that form the terminals of the anterior arches of the fornix) infundibulum (any of various funnel-shaped parts of the body (but especially the hypophyseal stalk))
    query: ู‡ู„ ูŠุนุชุจุฑ ุงู„ุชู‡ุงุจ ุงู„ู…ูุงุตู„ ุงู„ุฑูˆู…ุงุชูˆูŠุฏูŠ ู†ูุณ ุงู„ุชู‡ุงุจ ุงู„ู…ูุงุตู„ passage: Arthritis is an umbrella term used to describe pain, stiffness and inflammation of the joints. However, there are different kinds of arthritis, including rheumatoid arthritis (RA) and osteoarthritis (OA). Although RA and OA both affect the joints, they are very different forms of the same broader condition.Rheumatoid arthritis is an autoimmune condition, while osteoarthritis is a degenerative joint disease.lthough RA and OA both affect the joints, they are very different forms of the same broader condition. Rheumatoid arthritis is an autoimmune condition, while osteoarthritis is a degenerative joint disease. passage: Text A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining.This inflammation of the joint lining (called the synovium) can cause pain, stiffness, swelling, warmth, and redness.ext A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining. passage: Rheumatoid arthritis is a serious autoimmune disease that attacks the joints and other body parts. But RA can be tough to diagnose.Symptoms can mimic other illnesses, or they may flare, then fade, only to flare again somewhere else.Lab tests arenโ€™t perfectyou can test negative for RA factors and still have it.heumatoid arthritis is a serious autoimmune disease that attacks the joints and other body parts. But RA can be tough to diagnose. passage: Share +. Text A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining. This inflammation of the joint lining (called the synovium) can cause pain, stiffness, swelling, warmth, and redness.ext A A A. 
Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
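
To make these parameters concrete, here is a minimal sketch of how a Matryoshka-style loss can wrap an in-batch ranking loss: the same loss is evaluated on embeddings truncated to each listed dimension and the results are summed with the given weights. It is a simplified illustration, not the library's implementation. It assumes cosine similarity with a scale of 20, ignores the extra hard-negative columns, and uses toy 4-dimensional vectors with dims [4, 2] in place of [768, 256, 128, 64].

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def ranking_loss(anchors, positives, scale=20.0):
    """In-batch cross-entropy: anchor i should score positive i above all other positives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, candidate) for candidate in positives]
        log_sum_exp = math.log(sum(math.exp(x) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over embeddings truncated to each dimension."""
    return sum(
        weight * ranking_loss([a[:dim] for a in anchors], [p[:dim] for p in positives])
        for dim, weight in zip(dims, weights)
    )

# Toy batch of two pairs standing in for real embeddings.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.2, 0.0], [0.1, 0.9, 0.0, 0.2]]
aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
shuffled = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned < shuffled)
```

Correctly paired batches give a lower loss than mismatched ones at every truncation level, which is what pushes the leading dimensions to carry most of the signal.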
    

triplets

  • Dataset: triplets
  • Size: 371,861 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean          max
    anchor    string  18 tokens  53.76 tokens  150 tokens
    positive  string  9 tokens   55.48 tokens  159 tokens
    negative  string  5 tokens   51.11 tokens  166 tokens
  • Samples:
    anchor positive negative
    query: ู„ู‚ุฏ ู‚ุฏู… ุงู„ุจุงุญุซ ู…ูุณู’ุชูŽุทู’ุฑูŽูู‹ุง ู…ู† ุงู„ุฃุจุญุงุซ ููŠ ุงู„ู…ุคุชู…ุฑุŒ ุญูŠุซ ุนุฑุถ ู†ุชุงุฆุฌ ุบูŠุฑ ู…ุฃู„ูˆูุฉ ุฃุซุงุฑุช ุฌุฏู„ุงู‹ ูˆุงุณุนุงู‹ ุจูŠู† ุงู„ุญุถูˆุฑ ุงู„ุจุงุฑุญุฉ." passage: "ู„ู‚ุฏ ู‚ุฏู… ุงู„ุจุงุญุซ ุฌุฏูŠุฏู‹ุง ู…ู† ุงู„ุฏุฑุงุณุงุช ููŠ ุงู„ู…ู„ุชู‚ู‰ุŒ ุฅุฐ ุจูŠู† ู…ุนุทูŠุงุช ุบูŠุฑ ุชู‚ู„ูŠุฏูŠุฉ ุฃุซุงุฑุช ู†ู‚ุงุดู‹ุง ู…ุณุชููŠุถู‹ุง ุจูŠู† ุงู„ู…ุดุงุฑูƒูŠู† ุจุงู„ุฃู…ุณ." passage: "ู„ู‚ุฏ ู‚ุฏู… ุงู„ุจุงุญุซ ู‚ุฏูŠู…ู‹ุง ู…ู† ุงู„ุฃุจุญุงุซ ููŠ ุงู„ู†ุฏูˆุฉุŒ ุญูŠุซ ุนุฑุถ ู…ุนู„ูˆู…ุงุช ู…ุฃู„ูˆูุฉ ู„ู… ุชุซุฑ ุฃูŠ ุญูˆุงุฑ ุจูŠู† ุงู„ุญุงุถุฑูŠู† ู‚ุจู„ ูŠูˆู….
    query: ุจุนุฏ ูŠูˆู… ุญุงุฑุŒ ุงุบุชุณู„ุช ุจู…ุงุก ุฒูู„ุงู„ ูƒุงู† ูŠุฌุฑูŠ ููŠ ุงู„ุฌุฏูˆู„ ุงู„ุตุบูŠุฑ ุจุฌุงู†ุจ ุงู„ุญู‚ู„. passage: ุจุนุฏ ูŠูˆู… ู‚ุงุฆุธุŒ ุชุทู‡ุฑุช ุจู…ุงุก ููุฑูŽุงุช ูƒุงู† ูŠุชุฏูู‚ ููŠ ุงู„ู†ู‡ุฑ ุงู„ุถูŠู‚ ู‚ุฑุจ ุงู„ู…ุฒุฑุนุฉ. passage: ุจุนุฏ ูŠูˆู… ู…ุดู…ุณุŒ ุชู„ุทุฎุช ุจู…ุงุก ุฃูŽุฌูู†ู‘ ูƒุงู† ุฑุงูƒุฏุง ููŠ ุงู„ุจุฑูƒุฉ ู‚ุฑุจ ุงู„ู…ุฑุนู‰.
    query: ุฃูŽูˆู’ู‚ูŽุฏูŽ ุงู„ู…ูุฎูŽูŠู‘ูู…ููˆู†ูŽ ู†ูŽุงุฑูŽ ุงู„ู…ูŽุฎููŠู…ู ุจูุญูŽู…ูŽุงุณูŽุฉู ู‚ูุจูŽูŠู’ู„ูŽ ุงู„ู„ูŽูŠู’ู„ู ู„ูุทูŽู‡ู’ูŠู ุงู„ุนูŽุดูŽุงุกู." passage: "ุฃูŽุซูŽุงุฑูŽ ุงู„ู…ูุฎูŽูŠู‘ูู…ููˆู†ูŽ ู„ูŽู‡ูŽุจูŽ ุงู„ู…ูŽูˆู’ู‚ูุฏู ุจูุดูŽุบูŽูู ู‚ูŽุจู’ู„ูŽ ู…ูŽุฌููŠุกู ุงู„ู„ู‘ูŽูŠู’ู„ู ู„ูุฅูุนู’ุฏูŽุงุฏู ุงู„ู’ุนูŽุดูŽุงุก." passage: "ุฃูŽุทู’ููŽุฃูŽ ุงู„ู…ูุฎูŽูŠู‘ูู…ููˆู†ูŽ ู†ูŽุงุฑูŽ ุงู„ู…ูŽุฎููŠู…ู ุจูŽุนู’ุฏูŽ ุงู„ู’ุนูŽุดูŽุงุกู ู„ูู„ู’ุฎูŽู„ููˆุฏู ุฅูู„ูŽู‰ ุงู„ู†ู‘ูŽูˆู’ู…ู.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
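
Since training covered the dimensions 768, 256, 128, and 64, embeddings can plausibly be shortened to one of those sizes at inference time. The sketch below shows the usual recipe under that assumption: keep the first k values and re-normalize. The helper `truncate_and_normalize` and the 4-dimensional toy vector are illustrative stand-ins.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

full = [0.5, 0.5, 0.5, 0.5]  # toy stand-in for a normalized 768-dimensional embedding
small = truncate_and_normalize(full, 2)
print(len(small), small)
```

Sentence Transformers also accepts a `truncate_dim` argument when constructing `SentenceTransformer`, which performs the truncation inside `encode`; whether shorter embeddings are accurate enough for a given task should be checked on your own data.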
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 4e-05
  • weight_decay: 0.01
  • max_grad_norm: 2.0
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • optim: adamw_8bit
  • push_to_hub: True
  • hub_model_id: TawasulAI/Faheem-mE5_Base_NLI
  • hub_strategy: checkpoint

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 4e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 2.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_8bit
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: TawasulAI/Faheem-mE5_Base_NLI
  • hub_strategy: checkpoint
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss validation_eval_cosine_accuracy
None 0 - 0.9376
1.0 3374 22.5359 0.9636
2.0 6748 12.387 0.9669
3.0 10122 8.596 0.9671

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.7.0+cu128
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}