metadata
base_model: intfloat/multilingual-e5-small
library_name: sentence-transformers
metrics:
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1936
- loss:OnlineContrastiveLoss
widget:
- source_sentence: What are the symptoms of COVID-19?
sentences:
- How to identify COVID-19?
- What is the process for booking a dinner table?
- >-
It is not necessary to include specific fields in a financial report;
nevertheless, it is beneficial to add pertinent financial metrics to
help investors gauge the company's condition.
- source_sentence: How to apply for a scholarship?
sentences:
- Steps to apply for a scholarship
- Advantages of practicing meditation
- >-
When `ignore_metadata` is set to `True`, all metadata and attributes are
stripped from the file prior to processing.
- source_sentence: How to write a novel?
sentences:
- How to write a short story?
- Who wrote 'Macbeth'?
- How to reset a phone
- source_sentence: >-
You can wrap the project in `job.utils.data.JobLoader` and create a
collate function to collate the tasks into batches.
sentences:
- Steps to prepare a steak
- How many people live in Germany?
- >-
You can use `job.utils.data.JobLoader` to encapsulate the project and
define a collate function to group the tasks into batches.
- source_sentence: What is the time now?
sentences:
- How to cook a chicken?
- Current time
- Guide to starting a small business
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-small
results:
- task:
type: binary-classification
name: Binary Classification
dataset:
name: pair class dev
type: pair-class-dev
metrics:
- type: cosine_accuracy
value: 0.9212962962962963
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.8385236263275146
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.9403508771929825
name: Cosine F1
- type: cosine_f1_threshold
value: 0.8385236263275146
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.9370629370629371
name: Cosine Precision
- type: cosine_recall
value: 0.9436619718309859
name: Cosine Recall
- type: cosine_ap
value: 0.9872231100578164
name: Cosine Ap
- type: dot_accuracy
value: 0.9212962962962963
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 0.8385236263275146
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.9403508771929825
name: Dot F1
- type: dot_f1_threshold
value: 0.8385236263275146
name: Dot F1 Threshold
- type: dot_precision
value: 0.9370629370629371
name: Dot Precision
- type: dot_recall
value: 0.9436619718309859
name: Dot Recall
- type: dot_ap
value: 0.9872231100578164
name: Dot Ap
- type: manhattan_accuracy
value: 0.9166666666666666
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 8.658426284790039
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.9391891891891893
name: Manhattan F1
- type: manhattan_f1_threshold
value: 9.594137191772461
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.9025974025974026
name: Manhattan Precision
- type: manhattan_recall
value: 0.9788732394366197
name: Manhattan Recall
- type: manhattan_ap
value: 0.987218816132896
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.9212962962962963
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 0.568278431892395
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.9403508771929825
name: Euclidean F1
- type: euclidean_f1_threshold
value: 0.568278431892395
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.9370629370629371
name: Euclidean Precision
- type: euclidean_recall
value: 0.9436619718309859
name: Euclidean Recall
- type: euclidean_ap
value: 0.9872231100578164
name: Euclidean Ap
- type: max_accuracy
value: 0.9212962962962963
name: Max Accuracy
- type: max_accuracy_threshold
value: 8.658426284790039
name: Max Accuracy Threshold
- type: max_f1
value: 0.9403508771929825
name: Max F1
- type: max_f1_threshold
value: 9.594137191772461
name: Max F1 Threshold
- type: max_precision
value: 0.9370629370629371
name: Max Precision
- type: max_recall
value: 0.9788732394366197
name: Max Recall
- type: max_ap
value: 0.9872231100578164
name: Max Ap
- task:
type: binary-classification
name: Binary Classification
dataset:
name: pair class test
type: pair-class-test
metrics:
- type: cosine_accuracy
value: 0.9305555555555556
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.8569861650466919
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.9484536082474226
name: Cosine F1
- type: cosine_f1_threshold
value: 0.8531842827796936
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.9261744966442953
name: Cosine Precision
- type: cosine_recall
value: 0.971830985915493
name: Cosine Recall
- type: cosine_ap
value: 0.9898045699188958
name: Cosine Ap
- type: dot_accuracy
value: 0.9305555555555556
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 0.8569861650466919
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.9484536082474226
name: Dot F1
- type: dot_f1_threshold
value: 0.8531842231750488
name: Dot F1 Threshold
- type: dot_precision
value: 0.9261744966442953
name: Dot Precision
- type: dot_recall
value: 0.971830985915493
name: Dot Recall
- type: dot_ap
value: 0.9898045699188958
name: Dot Ap
- type: manhattan_accuracy
value: 0.9351851851851852
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 8.299823760986328
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.9517241379310345
name: Manhattan F1
- type: manhattan_f1_threshold
value: 8.299823760986328
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.9324324324324325
name: Manhattan Precision
- type: manhattan_recall
value: 0.971830985915493
name: Manhattan Recall
- type: manhattan_ap
value: 0.9895380844501982
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.9305555555555556
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 0.534814715385437
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.9484536082474226
name: Euclidean F1
- type: euclidean_f1_threshold
value: 0.5418605804443359
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.9261744966442953
name: Euclidean Precision
- type: euclidean_recall
value: 0.971830985915493
name: Euclidean Recall
- type: euclidean_ap
value: 0.9898045699188958
name: Euclidean Ap
- type: max_accuracy
value: 0.9351851851851852
name: Max Accuracy
- type: max_accuracy_threshold
value: 8.299823760986328
name: Max Accuracy Threshold
- type: max_f1
value: 0.9517241379310345
name: Max F1
- type: max_f1_threshold
value: 8.299823760986328
name: Max F1 Threshold
- type: max_precision
value: 0.9324324324324325
name: Max Precision
- type: max_recall
value: 0.971830985915493
name: Max Recall
- type: max_ap
value: 0.9898045699188958
name: Max Ap
SentenceTransformer based on intfloat/multilingual-e5-small
This is a sentence-transformers model finetuned from intfloat/multilingual-e5-small. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: intfloat/multilingual-e5-small
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
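The last two modules above (mean pooling, then normalization) can be illustrated with a small NumPy sketch. It is not the library's code: the function name `mean_pool_and_normalize` and the random inputs are made up for illustration, and it only assumes that pooling averages the non-padding token vectors and that `Normalize()` rescales each sentence vector to unit length.

```python
import numpy as np


def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average the non-padding token vectors, then scale each result to unit length.

    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # padding contributes zero
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)


# Hypothetical token embeddings for two sequences of length 5 (the first has 2 padding tokens).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 5, 384))
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)  # (2, 384)
```

Because every output vector has unit length, the dot product of two outputs equals their cosine similarity.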
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("srikarvar/fine_tuned_model_11")
# Run inference
sentences = [
'What is the time now?',
'Current time',
'Guide to starting a small business',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
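To turn similarity scores into a duplicate / not-duplicate decision, one option is to compare the cosine similarity of a pair against the threshold reported in the Evaluation section. The sketch below uses toy 2-dimensional vectors so that it runs without downloading the model; the helper name `is_paraphrase` is hypothetical, and the threshold (0.8385, the cosine accuracy threshold on pair-class-dev) was tuned on this model's dev set, so it may not transfer to other data.

```python
import numpy as np

# cosine_accuracy_threshold reported for pair-class-dev (see the Evaluation section).
COSINE_THRESHOLD = 0.8385


def is_paraphrase(embedding_a, embedding_b, threshold=COSINE_THRESHOLD):
    """Return (decision, cosine) for a single pair of embedding vectors."""
    a = np.asarray(embedding_a, dtype=float)
    b = np.asarray(embedding_b, dtype=float)
    cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cosine >= threshold, cosine


# Toy vectors standing in for rows of `embeddings` from the snippet above.
close_pair = is_paraphrase([1.0, 0.0], [0.9, 0.1])
far_pair = is_paraphrase([1.0, 0.0], [0.0, 1.0])
print(close_pair[0], far_pair[0])  # True False
```

With the real model, `embedding_a` and `embedding_b` would be two rows of the array returned by `model.encode(...)`.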
Evaluation
Metrics
Binary Classification
- Dataset: pair-class-dev
- Evaluated with BinaryClassificationEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.9213 |
| cosine_accuracy_threshold | 0.8385 |
| cosine_f1 | 0.9404 |
| cosine_f1_threshold | 0.8385 |
| cosine_precision | 0.9371 |
| cosine_recall | 0.9437 |
| cosine_ap | 0.9872 |
| dot_accuracy | 0.9213 |
| dot_accuracy_threshold | 0.8385 |
| dot_f1 | 0.9404 |
| dot_f1_threshold | 0.8385 |
| dot_precision | 0.9371 |
| dot_recall | 0.9437 |
| dot_ap | 0.9872 |
| manhattan_accuracy | 0.9167 |
| manhattan_accuracy_threshold | 8.6584 |
| manhattan_f1 | 0.9392 |
| manhattan_f1_threshold | 9.5941 |
| manhattan_precision | 0.9026 |
| manhattan_recall | 0.9789 |
| manhattan_ap | 0.9872 |
| euclidean_accuracy | 0.9213 |
| euclidean_accuracy_threshold | 0.5683 |
| euclidean_f1 | 0.9404 |
| euclidean_f1_threshold | 0.5683 |
| euclidean_precision | 0.9371 |
| euclidean_recall | 0.9437 |
| euclidean_ap | 0.9872 |
| max_accuracy | 0.9213 |
| max_accuracy_threshold | 8.6584 |
| max_f1 | 0.9404 |
| max_f1_threshold | 9.5941 |
| max_precision | 0.9371 |
| max_recall | 0.9789 |
| max_ap | 0.9872 |
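The F1 rows in the table follow from the precision and recall rows as their harmonic mean. A short check, using the unrounded dev-set values from the model metadata (the function name `f1_score` is only illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Unrounded pair-class-dev values taken from the model-index metadata.
cosine_f1 = f1_score(0.9370629370629371, 0.9436619718309859)
manhattan_f1 = f1_score(0.9025974025974026, 0.9788732394366197)

print(round(cosine_f1, 4))     # 0.9404
print(round(manhattan_f1, 4))  # 0.9392
```

The `max_*` rows appear to take the maximum of each metric across the four similarity functions independently, which would explain why `max_accuracy_threshold` (8.6584) is on the Manhattan scale and is not comparable to the cosine thresholds.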
Binary Classification
- Dataset: pair-class-test
- Evaluated with BinaryClassificationEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.9306 |
| cosine_accuracy_threshold | 0.857 |
| cosine_f1 | 0.9485 |
| cosine_f1_threshold | 0.8532 |
| cosine_precision | 0.9262 |
| cosine_recall | 0.9718 |
| cosine_ap | 0.9898 |
| dot_accuracy | 0.9306 |
| dot_accuracy_threshold | 0.857 |
| dot_f1 | 0.9485 |
| dot_f1_threshold | 0.8532 |
| dot_precision | 0.9262 |
| dot_recall | 0.9718 |
| dot_ap | 0.9898 |
| manhattan_accuracy | 0.9352 |
| manhattan_accuracy_threshold | 8.2998 |
| manhattan_f1 | 0.9517 |
| manhattan_f1_threshold | 8.2998 |
| manhattan_precision | 0.9324 |
| manhattan_recall | 0.9718 |
| manhattan_ap | 0.9895 |
| euclidean_accuracy | 0.9306 |
| euclidean_accuracy_threshold | 0.5348 |
| euclidean_f1 | 0.9485 |
| euclidean_f1_threshold | 0.5419 |
| euclidean_precision | 0.9262 |
| euclidean_recall | 0.9718 |
| euclidean_ap | 0.9898 |
| max_accuracy | 0.9352 |
| max_accuracy_threshold | 8.2998 |
| max_f1 | 0.9517 |
| max_f1_threshold | 8.2998 |
| max_precision | 0.9324 |
| max_recall | 0.9718 |
| max_ap | 0.9898 |
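Since the architecture ends with `Normalize()`, embeddings have unit length, so the dot product equals the cosine similarity (which is why the `dot_*` rows repeat the `cosine_*` rows) and the Euclidean distance is a fixed function of the cosine: d = sqrt(2 - 2·cos). The sketch below checks this against the reported thresholds; small differences in later decimals are expected, presumably from float32 arithmetic and from how thresholds are picked between neighbouring scores.

```python
import math


def euclidean_from_cosine(cosine):
    """Euclidean distance between two unit vectors with the given cosine similarity."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * cosine))


# cosine_accuracy_threshold values from the model-index metadata.
dev_distance = euclidean_from_cosine(0.8385236263275146)
test_distance = euclidean_from_cosine(0.8569861650466919)

print(round(dev_distance, 4))   # 0.5683, matches euclidean_accuracy_threshold on pair-class-dev
print(round(test_distance, 4))  # 0.5348, matches euclidean_accuracy_threshold on pair-class-test
```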
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,936 training samples
- Columns: label, sentence1, and sentence2
- Approximate statistics based on the first 1000 samples:
| | label | sentence1 | sentence2 |
|---|---|---|---|
| type | int | string | string |
| details | 0: ~35.30%; 1: ~64.70% | min: 6 tokens; mean: 16.19 tokens; max: 98 tokens | min: 4 tokens; mean: 15.75 tokens; max: 98 tokens |
- Samples:
| label | sentence1 | sentence2 |
|---|---|---|
| 1 | How do I apply for a credit card? | How do I get a credit card? |
| 1 | What is the function of a learning rate scheduler? | How does a learning rate scheduler optimize training? |
| 0 | What is the speed of a rocket? | What is the speed of a jet plane? |
- Loss: OnlineContrastiveLoss
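As a rough illustration of what an online contrastive loss does, the NumPy sketch below keeps only the hard pairs in a batch (positives that are farther apart than the closest negative, negatives that are closer than the farthest positive) and applies a contrastive penalty to those. This is a simplified sketch, not the Sentence Transformers implementation: the function name is hypothetical, cosine distance and a margin of 0.5 are assumed, and edge-case handling in the library may differ.

```python
import numpy as np


def online_contrastive_loss_sketch(emb_a, emb_b, labels, margin=0.5):
    """Contrastive loss over hard pairs only; label 1 = similar pair, 0 = dissimilar pair."""
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    labels = np.asarray(labels)

    # Cosine distance per pair: 0 for identical directions, 2 for opposite directions.
    cosine = (emb_a * emb_b).sum(axis=1) / (
        np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
    )
    distance = 1.0 - cosine

    positives = distance[labels == 1]
    negatives = distance[labels == 0]
    if positives.size == 0 or negatives.size == 0:
        raise ValueError("this sketch needs at least one positive and one negative pair")

    hard_positives = positives[positives > negatives.min()]  # similar pairs that are too far apart
    hard_negatives = negatives[negatives < positives.max()]  # dissimilar pairs that are too close

    positive_loss = (hard_positives ** 2).sum()
    negative_loss = (np.maximum(margin - hard_negatives, 0.0) ** 2).sum()
    return float(positive_loss + negative_loss)


# Toy batch: one easy and one hard pair for each label.
a = [[1, 0], [1, 0], [1, 0], [1, 0]]
b = [[1, 0], [0, 1], [-1, 0], [1, 0]]
y = [1, 1, 0, 0]
loss = online_contrastive_loss_sketch(a, b, y)
print(loss)  # 1.25
```

In the toy batch, the hard positive (distance 1) contributes 1.0 and the hard negative (distance 0) contributes 0.5² = 0.25, while the two easy pairs are ignored.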
Evaluation Dataset
Unnamed Dataset
- Size: 216 evaluation samples
- Columns: label, sentence1, and sentence2
- Approximate statistics based on the first 216 samples:
| | label | sentence1 | sentence2 |
|---|---|---|---|
| type | int | string | string |
| details | 0: ~34.26%; 1: ~65.74% | min: 6 tokens; mean: 15.87 tokens; max: 87 tokens | min: 4 tokens; mean: 15.61 tokens; max: 86 tokens |
- Samples:
| label | sentence1 | sentence2 |
|---|---|---|
| 0 | What is the freezing point of ethanol? | What is the boiling point of ethanol? |
| 0 | Healthy habits | Unhealthy habits |
| 0 | What is the difference between omnivores and herbivores? | What is the difference between omnivores, carnivores, and herbivores? |
- Loss: OnlineContrastiveLoss
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- gradient_accumulation_steps: 2
- num_train_epochs: 4
- warmup_ratio: 0.1
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 2
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | loss | pair-class-dev_max_ap | pair-class-test_max_ap |
|---|---|---|---|---|---|
| 0 | 0 | - | - | 0.8705 | - |
| 0.3279 | 10 | 1.3831 | - | - | - |
| 0.6557 | 20 | 0.749 | - | - | - |
| 0.9836 | 30 | 0.5578 | 0.2991 | 0.9862 | - |
| 1.3115 | 40 | 0.3577 | - | - | - |
| 1.6393 | 50 | 0.2594 | - | - | - |
| 1.9672 | 60 | 0.2119 | - | - | - |
| 2.0 | 61 | - | 0.2753 | 0.9898 | - |
| 2.2951 | 70 | 0.17 | - | - | - |
| 2.6230 | 80 | 0.1126 | - | - | - |
| 2.9508 | 90 | 0.0538 | - | - | - |
| 2.9836 | 91 | - | 0.3222 | 0.9864 | - |
| 3.2787 | 100 | 0.1423 | - | - | - |
| 3.6066 | 110 | 0.066 | - | - | - |
| 3.9344 | 120 | 0.0486 | 0.3237 | 0.9872 | 0.9898 |
- The bold row denotes the saved checkpoint.
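The fractional values in the Epoch column can be reproduced from the training set size and the batch settings listed under Training Hyperparameters. The sketch below assumes a single device and that each epoch yields ceil(1936 / 32) batches; the `no_duplicates` sampler could in principle produce a slightly different count, so treat this as an approximation that happens to match the log.

```python
import math

TRAIN_SAMPLES = 1936          # training set size
PER_DEVICE_BATCH_SIZE = 32    # per_device_train_batch_size
GRAD_ACCUM_STEPS = 2          # gradient_accumulation_steps

batches_per_epoch = math.ceil(TRAIN_SAMPLES / PER_DEVICE_BATCH_SIZE)  # 61
updates_per_epoch = batches_per_epoch / GRAD_ACCUM_STEPS              # 30.5 optimizer steps
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS       # 64 pairs per update


def epoch_at_step(step):
    """Fractional epoch reached after `step` optimizer updates."""
    return step / updates_per_epoch


for step in (10, 30, 61, 120):
    print(step, round(epoch_at_step(step), 4))
# 10 0.3279
# 30 0.9836
# 61 2.0
# 120 3.9344
```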
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.0
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.34.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}