metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:863559
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: intfloat/multilingual-e5-base
widget:
- source_sentence: >-
query: ุฃุฎุจุฑูู ุงูุทุจูุจ ุฃู ุจุฏุงูุฉ ุชูููู ุงูุฌููู ูุงูุช ู
ู ููุทูููุฉู ุตุบูุฑุฉ ู
ู ู
ูู
ุงูุฃุจุ ููู ุจุฏุงูุฉ ุงูุฎูู ุงูุนุฌูุจุฉ."
sentences:
- >-
passage: "ูุงูุช ุงูุทูุงุฑุฉู ุฃู
ุฑุงู ูุงุจุฏู ู
ููู ูู ุงูุนุตูุฑู ุงููุฏูู
ุฉูุ ุญูุซู
ุชูุณุชูุฎุฏูู
ู ุงูู
ูุงูู ุงูู
ุจุงุฑูุฉู ูุชูุธููู ุงูุฒูุงุฑู ุญููู ูุฏูู
ููู
ุฅูู ุงูู
ุฏููุฉู."
- >-
passage: "ุฃุธูุฑุช ุงููุญูุตุงุช ุฃู ุงูู
ุดููุฉ ููุณุช ูู ูู
ูุฉ ูุจูุฑุฉ ู
ู ุงูุณุงุฆูุ ุจู ูู
ููุต ุญูููุชู.
- >-
passage: "ุฃูุถุญ ูู ุงูุทุจูุจ ุฃู ูุดุฃุฉ ุงูุฌููู ุชุจุฏุฃ ู
ู ู
ูู ุงูุฃุจุ ููู ุณุงุฆู ููููุ
ุฃุณุงุณ ุงูุญูุงุฉ ุงูู
ุจูุฑุฉ."
- source_sentence: >-
query: ุงูุดูุชูููุฑู ุงูููุญูุงููู
ู ุงูููุฌูุฏููุฏู ุจูู ุงูููุนูููููู ููุงูุชููุฌูุจููุฑู
ููู ุชูุนูุงู
ููููู ู
ูุนู ุงููุดููุนูุจู ููู ููุตูุฑููู ุงูููููุฎูู
ู ุฃูู
ูุณู ุจูุนูุฏู
ุงูููุชูุฎูุงุจููู."
sentences:
- >-
passage: "ุงูุดูุชูููุฑู ุงูููุญูุงููู
ู ุงูููุฌูุฏููุฏู ุจูู ุงูููุณููู
ูููู
ููุงูุชููุนูุงููู ููู ู
ูุนูุงู
ูููุชููู ูููุฑููุนููููุฉู ุฏูุงุฎููู ููุตูุฑููู
ุงูููุนูุธููู
ู ุจูุงููุฃูู
ูุณู ุจูุนูุฏู ุงูููุชูุฎูุงุจููู."
- >-
passage: "ุจูุฏูุง ุงูููุญูุงููู
ู ุงูููุฌูุฏููุฏู ุจูู ุงูููููุงุนู ููุงููุชููููุงุถูุนู
ููู ุชูุตูุฑููููุงุชููู ู
ูุนู ุงูููู
ูููุงุทูููููู ููู ููุตูุฑููู ุงูููุจูุณููุทู
ุจูุงููุฃูู
ูุณู ุจูุนูุฏู ุงูููุชูุฎูุงุจููู.
- >-
passage: "ูุงูุช ุฑุญูุฉ ุงูุฃุณุฑุฉ ุจุงููุทุงุฑ ูู ุงูุตุจุงุญ ุงูุจุงูุฑ ูุญู ุงูุฌุจุงู ู
ุบุงู
ุฑุฉ
ุดููุฉ ูู
ุจูุฌุฉ."
- source_sentence: 'query: what continent is ethiopia in'
sentences:
- >-
passage: ุฅุฑูุชุฑูุง (ุชูููุธ /หษrแตปหtreษช.ษ/ ุฃู /หษrแตปหtriหษ/)ุ ุฑุณู
ููุง ุฏููุฉ
ุฅุฑูุชุฑูุงุ ูู ุฏููุฉ ุชูุน ูู ุงููุฑู ุงูุฃูุฑููู. ุนุงุตู
ุชูุง ุฃุณู
ุฑุฉุ ูุชุญุฏูุง ุงูุณูุฏุงู ู
ู
ุงูุบุฑุจุ ูุฅุซููุจูุง ู
ู ุงูุฌููุจุ ูุฌูุจูุชู ู
ู ุงูุฌููุจ ุงูุดุฑูู. ุชุชู
ุชุน ุงูุฃุฌุฒุงุก
ุงูุดู
ุงููุฉ ุงูุดุฑููุฉ ูุงูุดุฑููุฉ ู
ู ุฅุฑูุชุฑูุง ุจุณุงุญู ุทููู ุนูู ุทูู ุงูุจุญุฑ ุงูุฃุญู
ุฑ.
- >-
passage: ูู ุงูุขููุฉ ุงูุฃุฎูุฑุฉุ ุงุฌุชุงุญ ุงูู
ูุทูุฉ ุฃุณูุฃ ุฌูุงู ูู ุชุงุฑูุฎ ุดุฑู ุฅูุฑูููุง
ูู ุนุงู
2011ุ ุญูุซ ูุดูุช ู
ูุณู
ุงูุฃู
ุทุงุฑ ูู ุงูุญุฏูุซ ูู
ุฏุฉ ุนุงู
ูู ู
ุชุชุงูููู. ุชุนู
ู
ุงูุญููู
ุฉ ุญุงูููุง ุนูู ุชุทููุฑ ุงูุณูุงุญุฉ ูุฅุซููุจูุง ู
ู ุฎูุงู ุนุฏุฏ ู
ู ุงูู
ุจุงุฏุฑุงุช.
- >-
passage: ููุธูุฑ ุฎุฑูุทุฉ ู
ููุน ุฅุซููุจูุง ุงูู
ุฑููุฉ ุฃู ุฅุซููุจูุง ุชูุน ูู ุงูุฌุฒุก ุงูุดุฑูู
ู
ู ูุงุฑุฉ ุฅูุฑูููุง. ูู
ุง ููุธูุฑ ุฎุฑูุทุฉ ุฅุซููุจูุง ุฃู ุงูุจูุงุฏ ุชูุน ุนูู ุงููุฑู
ุงูุฃูุฑููู ูุชุญุฏูุง ุฅุฑูุชุฑูุง ู
ู ุงูุดู
ุงูุ ูุฌูุจูุชู ูุงูุตูู
ุงู ู
ู ุงูุดุฑูุ ูููููุง ู
ู
ุงูุฌููุจุ ูุงูุณูุฏุงู ู
ู ุงูุบุฑุจ.
- >-
passage: ุงูุชูุช ุญุฑุจ ุญุฏูุฏ ู
ุน ุฅุฑูุชุฑูุง ูู ุฃูุงุฎุฑ ุงูุชุณุนูููุงุช ุจู
ุนุงูุฏุฉ ุณูุงู
ูู
ุฏูุณู
ุจุฑ 2000. ุชู
ุชุฃุฌูู ุงูุชุฑุณูู
ุงูููุงุฆู ููุญุฏูุฏ ุญุงูููุง ุจุณุจุจ ุงุนุชุฑุงุถุงุช
ุฅุซููุจูุง ุนูู ะฒัะฒะพะด ูุฌูุฉ ุฏูููุฉ ุชุชุทูุจ ู
ููุง ุงูุชุฎูู ุนู ุฃุฑุงุถู ุชุนุชุจุฑ ุญุณุงุณุฉ
ูุฅุซููุจูุง.
- source_sentence: 'query: ู
ุง ูู ุงููุฏู ู
ู ุงูุญูุงุฉุ'
sentences:
- 'passage: ู
ุง ูู ู
ุนูู ุงูุญูุงุฉ ุจุงููุณุจุฉ ููุ'
- 'passage: ู
ุง ูู ุญูู
ุญูุงุชูุ'
- 'passage: ู
ุง ูู ุงููุฏู ู
ู ูู ุดูุก ุฅุฐุง ูุงู ูู ุดูุก ููุชูู ุนูู ุฃู ุญุงูุ'
- 'passage: ู
ุง ูู ุงูุบุฑุถ ุงููุญูุฏ ู
ู ุงูุญูุงุฉุ'
- source_sentence: 'query: ุฑุฌู ูุญุฑู ููุณู ูู ู
ุญุงูู
ุฉ ุจุฑูููู'
sentences:
- 'passage: ุฑุฌู ูุญุฑู ููุณู ุฎุงุฑุฌ ู
ุญุงูู
ุฉ ุจุฑูููู'
- 'passage: ุฑุฌู ูุงู
ุจุฑู
ู ููุณู ูู ุงูููุงุก'
- 'passage: ุฑุฌู ูุฏูุฆ ููุณู ุจุฌุงูุจ ุงูู
ููุฏ'
- 'passage: ุฑุฌู ูุญุฑู ููุณู ูู ุงูู
ุฌู
ุน ุงูุชุฌุงุฑู ุงููุทูู ูู ูุงุดูุทู'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-base
results:
- task:
type: triplet
name: Triplet
dataset:
name: validation eval
type: validation_eval
metrics:
- type: cosine_accuracy
value: 0.967090904712677
name: Cosine Accuracy
SentenceTransformer based on intfloat/multilingual-e5-base
This is a sentence-transformers model finetuned from intfloat/multilingual-e5-base on the multi_negative and triplets datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: intfloat/multilingual-e5-base
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Datasets:
- multi_negative
- triplets
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
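The Pooling and Normalize modules listed above can be illustrated with a minimal sketch. It assumes mean pooling averages only the non-padding token vectors and that Normalize rescales the result to unit length. The toy vectors and the helper names `mean_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of width 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

In the real model the token vectors are 768-dimensional outputs of the XLMRobertaModel layer, but the arithmetic is the same.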
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("TawasulAI/Faheem-mE5_Base_new_data")
# Run inference
sentences = [
    'query: رجل يحرق نفسه في محاكمة بروكلين',
    'passage: رجل يحرق نفسه خارج محاكمة بروكلين',
    'passage: رجل قام برمي نفسه في الهواء',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
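The examples in this card prefix every input with `query: ` or `passage: `, following the convention of the base E5 model. The sketch below shows one way to apply those prefixes and rank passages for a query. The helpers `with_prefix` and `rank_passages` are hypothetical, and the toy unit vectors stand in for the output of `model.encode`, which is already normalized by the model's Normalize layer.

```python
def with_prefix(texts, prefix):
    """Prepend an E5-style role prefix ('query' or 'passage') to each text."""
    return [f"{prefix}: {text}" for text in texts]

def rank_passages(query_vec, passage_vecs):
    """Return (indices sorted by descending score, raw dot-product scores)."""
    scores = [sum(q * p for q, p in zip(query_vec, vec)) for vec in passage_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# With the real model one would encode prefixed texts, for example:
#   query_vec = model.encode(with_prefix(["..."], "query"))[0]
#   passage_vecs = model.encode(with_prefix(["...", "..."], "passage"))
# The toy unit vectors below are assumptions used only for illustration.
query_vec = [1.0, 0.0]
passage_vecs = [[0.6, 0.8], [1.0, 0.0], [0.0, 1.0]]

order, scores = rank_passages(query_vec, passage_vecs)
print(order, scores)
```

Because the embeddings are unit length, the dot product used here equals cosine similarity, the similarity function listed in the model description.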
Evaluation
Metrics
Triplet
- Dataset: validation_eval
- Evaluated with TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.9671 |
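As a rough illustration of what the cosine_accuracy figure measures, the sketch below counts a triplet as correct when the anchor is closer (by cosine similarity) to the positive than to the negative. This is a simplified reading of the metric, not the evaluator's actual code, and the toy vectors are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    hits = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return hits / len(triplets)

# Four toy triplets; the last one is deliberately ranked the wrong way round.
triplets = [
    ([1.0, 0.0], [1.0, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.2, 1.0], [1.0, 0.0]),
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.2]),
]
accuracy = triplet_cosine_accuracy(triplets)
print(accuracy)  # 0.75
```

Under this reading, a value of 0.9671 means roughly 96.7% of validation triplets place the positive closer to the anchor than the negative.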
Training Details
Training Datasets
multi_negative
- Dataset: multi_negative
- Size: 491,698 training samples
- Columns: query, positive, negative_1, negative_2, and negative_3
- Approximate statistics based on the first 1000 samples:
| | query | positive | negative_1 | negative_2 | negative_3 |
|---|---|---|---|---|---|
| type | string | string | string | string | string |
| details | min: 8 tokens, mean: 24.34 tokens, max: 512 tokens | min: 7 tokens, mean: 53.93 tokens, max: 512 tokens | min: 6 tokens, mean: 52.46 tokens, max: 512 tokens | min: 7 tokens, mean: 51.96 tokens, max: 512 tokens | min: 7 tokens, mean: 52.59 tokens, max: 512 tokens |
- Samples:
query positive negative_1 negative_2 negative_3 query: ู ุง ูู ุจุนุถ ุงูุฃููุงู ู ู ุฌู ูุน ุฃูุญุงุก ุงูุนุงูู ุงูุชู ุชุญุชูู ุนูู ู ุดุงูุฏ ุนุงุฑูุฉ (ุจุงุณุชุซูุงุก ุงูุฅุจุงุญูุฉ) ุpassage: ู ุง ูู ุจุนุถ ุงูุฃููุงู ู ู ุฌู ูุน ุฃูุญุงุก ุงูุนุงูู ุงูุชู ุชุญุชูู ุนูู ุจุนุถ ู ู ุฃูุถู ุงูู ุดุงูุฏ ุงูุนุงุฑูุฉ (ุจุงุณุชุซูุงุก ุงูุฅุจุงุญูุฉ) ุpassage: ู ุง ูู ุงููููู ุงูุฐู ูุญุชูู ุนูู ุฃูุซุฑ ุงูู ุดุงูุฏ ุงูุนุงุฑูุฉุpassage: ูู ู ู ู ุดุงูุฏ ุฌูุณ ูู ุงูุฃููุงู (ุบูุฑ ุงูุฅุจุงุญูุฉ) ูุงู ูููุง ุงูู ู ุซููู ูู ุงุฑุณูู ุงูุฌูุณุpassage: ู ุง ูู ุจุนุถ ู ู ุฃูุถู ุงูุฃููุงู ุงูุฌูุณูุฉ (ุฃู ูุบุฉ) ุquery: ู ุง ูู ุงูุฏู ุงุบ ุงูุจูููุpassage: Top 10 amazing movie makeup transformations. The diencephalon, also known as the interbrain or betweenbrain, is one of the major areas of the brain, along with the brainstem, cerebellum, and cerebrum.passage: These 10 animal facts will amaze you. The diencephalon, also known as the interbrain or betweenbrain, is one of the major areas of the brain, along with the brainstem, cerebellum, and cerebrum. This structure in the brain contains a number of smaller components of the brain which perform a variety of roles to keep the body functioning.passage: 1 The diencephalon is made up of four main components: the thalamus, the subthalamus, the hypothalamus, and the epithalamus. 
The hypothalamus is an integral part of the endocrine system, with one of the most important functions being to link the nervous system to the endocrine system via the pituitary gland.passage: Meronyms (parts of diencephalon): corpus mamillare; mamillary body; mammillary body (one of two small round structures on the undersurface of the brain that form the terminals of the anterior arches of the fornix) infundibulum (any of various funnel-shaped parts of the body (but especially the hypophyseal stalk))query: ูู ูุนุชุจุฑ ุงูุชูุงุจ ุงูู ูุงุตู ุงูุฑูู ุงุชููุฏู ููุณ ุงูุชูุงุจ ุงูู ูุงุตูpassage: Arthritis is an umbrella term used to describe pain, stiffness and inflammation of the joints. However, there are different kinds of arthritis, including rheumatoid arthritis (RA) and osteoarthritis (OA). Although RA and OA both affect the joints, they are very different forms of the same broader condition.Rheumatoid arthritis is an autoimmune condition, while osteoarthritis is a degenerative joint disease.lthough RA and OA both affect the joints, they are very different forms of the same broader condition. Rheumatoid arthritis is an autoimmune condition, while osteoarthritis is a degenerative joint disease.passage: Text A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining.This inflammation of the joint lining (called the synovium) can cause pain, stiffness, swelling, warmth, and redness.ext A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining.passage: Rheumatoid arthritis is a serious autoimmune disease that attacks the joints and other body parts. 
But RA can be tough to diagnose.Symptoms can mimic other illnesses, or they may flare, then fade, only to flare again somewhere else.Lab tests arenโt perfectyou can test negative for RA factors and still have it.heumatoid arthritis is a serious autoimmune disease that attacks the joints and other body parts. But RA can be tough to diagnose.passage: Share +. Text A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining. This inflammation of the joint lining (called the synovium) can cause pain, stiffness, swelling, warmth, and redness.ext A A A. Rheumatoid arthritis (RA) is an autoimmune disease where the body's immune system attacks normal joint tissues, causing inflammation of the joint lining. - Loss:
MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1],
    "n_dims_per_step": -1
}
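The loss configuration above can be read as a weighted sum of one ranking loss per embedding size. The sketch below is a simplified, pure-Python illustration of that idea: an in-batch softmax cross-entropy over scaled cosine similarities, applied to embeddings truncated to each dimension. The scale of 20 is assumed to be the library default, the extra hard-negative columns are omitted for brevity, and the toy vectors and function names are illustrative only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: positive i is the target class for anchor i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss on embeddings truncated to each dim."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        loss += weight * mnr_loss([a[:dim] for a in anchors],
                                  [p[:dim] for p in positives])
    return loss

# Toy 4-dimensional batch; dims [4, 2] play the role of [768, 256, 128, 64].
anchors = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
positives = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.6]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

Correctly paired batches give a much smaller loss than mismatched ones at every truncation size, which is what pushes the leading dimensions to carry most of the signal.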
triplets
- Dataset: triplets
- Size: 371,861 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 18 tokens, mean: 53.76 tokens, max: 150 tokens | min: 9 tokens, mean: 55.48 tokens, max: 159 tokens | min: 5 tokens, mean: 51.11 tokens, max: 166 tokens |
- Samples:
anchor positive negative query: ููุฏ ูุฏู ุงูุจุงุญุซ ู ูุณูุชูุทูุฑูููุง ู ู ุงูุฃุจุญุงุซ ูู ุงูู ุคุชู ุฑุ ุญูุซ ุนุฑุถ ูุชุงุฆุฌ ุบูุฑ ู ุฃูููุฉ ุฃุซุงุฑุช ุฌุฏูุงู ูุงุณุนุงู ุจูู ุงูุญุถูุฑ ุงูุจุงุฑุญุฉ."passage: "ููุฏ ูุฏู ุงูุจุงุญุซ ุฌุฏูุฏูุง ู ู ุงูุฏุฑุงุณุงุช ูู ุงูู ูุชููุ ุฅุฐ ุจูู ู ุนุทูุงุช ุบูุฑ ุชูููุฏูุฉ ุฃุซุงุฑุช ููุงุดูุง ู ุณุชููุถูุง ุจูู ุงูู ุดุงุฑููู ุจุงูุฃู ุณ."passage: "ููุฏ ูุฏู ุงูุจุงุญุซ ูุฏูู ูุง ู ู ุงูุฃุจุญุงุซ ูู ุงููุฏูุฉุ ุญูุซ ุนุฑุถ ู ุนููู ุงุช ู ุฃูููุฉ ูู ุชุซุฑ ุฃู ุญูุงุฑ ุจูู ุงูุญุงุถุฑูู ูุจู ููู .query: ุจุนุฏ ููู ุญุงุฑุ ุงุบุชุณูุช ุจู ุงุก ุฒููุงู ูุงู ูุฌุฑู ูู ุงูุฌุฏูู ุงูุตุบูุฑ ุจุฌุงูุจ ุงูุญูู.passage: ุจุนุฏ ููู ูุงุฆุธุ ุชุทูุฑุช ุจู ุงุก ููุฑูุงุช ูุงู ูุชุฏูู ูู ุงูููุฑ ุงูุถูู ูุฑุจ ุงูู ุฒุฑุนุฉ.passage: ุจุนุฏ ููู ู ุดู ุณุ ุชูุทุฎุช ุจู ุงุก ุฃูุฌููู ูุงู ุฑุงูุฏุง ูู ุงูุจุฑูุฉ ูุฑุจ ุงูู ุฑุนู.query: ุฃูููููุฏู ุงูู ูุฎููููู ูููู ููุงุฑู ุงูู ูุฎููู ู ุจูุญูู ูุงุณูุฉู ููุจููููู ุงููููููู ููุทููููู ุงูุนูุดูุงุกู."passage: "ุฃูุซูุงุฑู ุงูู ูุฎููููู ูููู ููููุจู ุงูู ูููููุฏู ุจูุดูุบููู ููุจููู ู ูุฌููุกู ุงูููููููู ููุฅูุนูุฏูุงุฏู ุงููุนูุดูุงุก."passage: "ุฃูุทูููุฃู ุงูู ูุฎููููู ูููู ููุงุฑู ุงูู ูุฎููู ู ุจูุนูุฏู ุงููุนูุดูุงุกู ููููุฎููููุฏู ุฅูููู ุงููููููู ู. - Loss:
MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1],
    "n_dims_per_step": -1
}
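Since both datasets were trained with matryoshka_dims of 768, 256, 128, and 64, the embeddings can plausibly be shortened at inference time by keeping the leading dimensions and re-normalizing. The sketch below shows that step on a toy vector. The helper `truncate_and_normalize` is hypothetical. Sentence Transformers also accepts a `truncate_dim` argument when constructing `SentenceTransformer`, which may be the more convenient route in practice.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head

# Toy 8-dimensional vector standing in for a 768-dimensional embedding.
full = [0.5, 0.5, 0.3, 0.1, 0.4, 0.2, 0.3, 0.3]
short = truncate_and_normalize(full, 4)
print(len(short), short)
```

Re-normalizing matters because a truncated slice of a unit vector is no longer unit length, so dot products would otherwise stop matching cosine similarity.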
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 4e-05
- weight_decay: 0.01
- max_grad_norm: 2.0
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- fp16: True
- optim: adamw_8bit
- push_to_hub: True
- hub_model_id: TawasulAI/Faheem-mE5_Base_NLI
- hub_strategy: checkpoint
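To make the learning_rate, lr_scheduler_type, and warmup_ratio settings concrete, the sketch below computes a learning rate per optimizer step under a linear warmup followed by cosine decay. It assumes 10,122 total steps (the final step count reported in this card's training logs) and a warmup length of ceil(total_steps × warmup_ratio). The function is an illustration of the schedule shape, not the trainer's own code.

```python
import math

def learning_rate(step, total_steps=10122, base_lr=4e-05, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Inspect the schedule at the start, end of warmup, midpoint, and final step.
for step in (0, 1013, 5061, 10122):
    print(step, learning_rate(step))
```

Under these assumptions the rate peaks at 4e-05 after about 1,013 steps and decays smoothly to zero by the last step.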
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 4e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 2.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_8bit
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: TawasulAI/Faheem-mE5_Base_NLI
- hub_strategy: checkpoint
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | validation_eval_cosine_accuracy |
|---|---|---|---|
| None | 0 | - | 0.9376 |
| 1.0 | 3374 | 22.5359 | 0.9636 |
| 2.0 | 6748 | 12.387 | 0.9669 |
| 3.0 | 10122 | 8.596 | 0.9671 |
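The step counts in the table are consistent with the dataset sizes and batch settings reported in this card. The back-of-the-envelope check below assumes a single training device and rounds the last partial batch up. It is not the trainer's exact computation, but it reproduces the logged numbers.

```python
import math

# Values taken from this card: dataset sizes and non-default hyperparameters.
train_samples = 491_698 + 371_861   # multi_negative + triplets
per_device_batch = 16
grad_accum = 16
epochs = 3

# Assumption: one device, so the effective batch is batch size x accumulation.
effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, total_steps)  # 256 3374 10122
```

The combined sample count of 863,559 also matches the dataset_size tag in the card metadata.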
Framework Versions
- Python: 3.12.3
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.7.0+cu128
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}