---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:21470
- loss:MultipleNegativesRankingLoss
base_model: thenlper/gte-small
widget:
- source_sentence: >-
    This positive resistance model is a different way of analyzing feedback
    oscillator operation.
  sentences:
  - >-
    This positive resistance model is a different way of analyzing feedback
    oscillator operation.
  - >-
    This negative resistance model is an alternate way of analyzing feedback
    oscillator operation.
  - >-
    I am BE 8th sem. CSE student. Which path should I choose as a career or
    which course I should do to get a good job in future within my country?
- source_sentence: >-
    Danny Danny Kortchmar played guitar , Charles Larkey played bass and
    Gordon played drums producing with Lou Adler .
  sentences:
  - What is the main reason for all the problems within India?
  - >-
    Gordon played guitar , Danny Kortchmar played bass and Lou Adler played
    drums with Charles Larkey producing .
  - >-
    Danny Danny Kortchmar played guitar , Charles Larkey played bass and
    Gordon played drums producing with Lou Adler .
- source_sentence: The Ngage isn't still lacking in earbuds.
  sentences:
  - >-
    What is Queen's University's acceptance rate for international students
    on campus?
  - The Ngage is still lacking in earbuds.
  - The Ngage isn't still lacking in earbuds.
- source_sentence: Previously reported figures were consistently revised down.
  sentences:
  - Previously reported figures were consistently revised down.
  - >-
    What are the side effects for using Proactiv on the face? How are the
    side effects treated?
  - Previously reported numbers were infrequently revised down.
- source_sentence: What is the fastest way to get a PAN card within India?
  sentences:
  - >-
    He has also used the OpenMusic software (designed at IRCAM ) to create
    computer-generated music.
  - What is the fastest way to get a PAN card outside India?
  - What is the fastest way to get a PAN card within India?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on thenlper/gte-small
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoMSMARCO
      type: NanoMSMARCO
    metrics:
    - type: cosine_accuracy@1
      value: 0.28
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.48
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.52
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.58
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.28
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.15999999999999998
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.10400000000000001
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.057999999999999996
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.28
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.48
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.52
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.58
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.4281391945817123
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.3795238095238095
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.39018847344323304
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: NanoNQ
      type: NanoNQ
    metrics:
    - type: cosine_accuracy@1
      value: 0.32
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.6
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.66
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.74
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.32
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.132
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.07400000000000001
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.3
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.55
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.61
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.68
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.5108521344166539
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.4791904761904762
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.452598225251627
      name: Cosine Map@100
  - task:
      type: nano-beir
      name: Nano BEIR
    dataset:
      name: NanoBEIR mean
      type: NanoBEIR_mean
    metrics:
    - type: cosine_accuracy@1
      value: 0.30000000000000004
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.54
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5900000000000001
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6599999999999999
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.30000000000000004
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.18
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11800000000000001
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.066
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.29000000000000004
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.515
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.565
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.63
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.4694956644991831
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.4293571428571429
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.42139334934743
      name: Cosine Map@100
---
SentenceTransformer based on thenlper/gte-small
This is a sentence-transformers model finetuned from thenlper/gte-small. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: thenlper/gte-small
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
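As a rough illustration of what the Pooling and Normalize stages compute, here is a minimal pure-Python sketch on a made-up 4-token, 3-dimensional input (an assumption for readability; the real model mean-pools 384-dimensional token embeddings over non-padded positions):

```python
import math

# Toy "token embeddings" for one sentence: 4 tokens, 3 dimensions.
token_embeddings = [
    [1.0, 2.0, 0.0],
    [3.0, 0.0, 2.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]
attention_mask = [1, 1, 1, 0]  # last position is padding

# Mean pooling over non-padded tokens (pooling_mode_mean_tokens=True)
n = sum(attention_mask)
pooled = [
    sum(tok[d] for tok, m in zip(token_embeddings, attention_mask) if m) / n
    for d in range(3)
]

# L2 normalization (the final Normalize() module)
norm = math.sqrt(sum(x * x for x in pooled))
embedding = [x / norm for x in pooled]

print([round(x, 4) for x in embedding])  # a unit-length sentence embedding
```

Because of the final normalization step, dot products between embeddings are already cosine similarities.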
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/unified-negatives")
# Run inference
sentences = [
    'What is the fastest way to get a PAN card within India?',
    'What is the fastest way to get a PAN card within India?',
    'What is the fastest way to get a PAN card outside India?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 1.0000, 0.2943],
#         [1.0000, 1.0000, 0.2943],
#         [0.2943, 0.2943, 1.0000]])
```
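Since the model ends in a `Normalize()` module, the cosine similarity behind `model.similarity` reduces to a dot product of unit-length vectors. A minimal sketch with made-up 3-dimensional stand-ins for the embeddings (the values below are illustrative, not the model's actual 0.2943 scores):

```python
import math

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    # Plain dot product: valid because both inputs are unit-length.
    return sum(x * y for x, y in zip(a, b))

# Toy 3-dim "embeddings" standing in for the model's 384-dim output
e1 = l2_normalize([0.2, 0.9, 0.1])
e2 = l2_normalize([0.2, 0.9, 0.1])   # identical text -> identical embedding
e3 = l2_normalize([0.9, 0.1, 0.3])

sims = [[round(cosine(a, b), 4) for b in (e1, e2, e3)] for a in (e1, e2, e3)]
for row in sims:
    print(row)
```

The matrix is symmetric with a unit diagonal, mirroring the tensor printed above.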
Evaluation
Metrics
Information Retrieval
- Datasets: `NanoMSMARCO` and `NanoNQ`
- Evaluated with `InformationRetrievalEvaluator`
| Metric | NanoMSMARCO | NanoNQ |
|---|---|---|
| cosine_accuracy@1 | 0.28 | 0.32 |
| cosine_accuracy@3 | 0.48 | 0.6 |
| cosine_accuracy@5 | 0.52 | 0.66 |
| cosine_accuracy@10 | 0.58 | 0.74 |
| cosine_precision@1 | 0.28 | 0.32 |
| cosine_precision@3 | 0.16 | 0.2 |
| cosine_precision@5 | 0.104 | 0.132 |
| cosine_precision@10 | 0.058 | 0.074 |
| cosine_recall@1 | 0.28 | 0.3 |
| cosine_recall@3 | 0.48 | 0.55 |
| cosine_recall@5 | 0.52 | 0.61 |
| cosine_recall@10 | 0.58 | 0.68 |
| cosine_ndcg@10 | 0.4281 | 0.5109 |
| cosine_mrr@10 | 0.3795 | 0.4792 |
| cosine_map@100 | 0.3902 | 0.4526 |
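These metrics follow standard IR definitions; for queries with a single relevant document (which is why recall largely tracks accuracy above), MRR@10 and nDCG@10 depend only on the rank of that document. A sketch with hypothetical per-query ranks:

```python
import math

def mrr_at_10(rank):
    """Reciprocal rank of the first relevant hit (0 if outside the top-10)."""
    return 1.0 / rank if rank <= 10 else 0.0

def ndcg_at_10(rank):
    """nDCG@10 for a query with a single relevant document."""
    return 1.0 / math.log2(rank + 1) if rank <= 10 else 0.0

# Hypothetical per-query ranks of the relevant passage
ranks = [1, 3, 2, 11]  # the last query misses the top-10 entirely

mrr = sum(mrr_at_10(r) for r in ranks) / len(ranks)
ndcg = sum(ndcg_at_10(r) for r in ranks) / len(ranks)
print(round(mrr, 4), round(ndcg, 4))
```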
Nano BEIR
- Dataset: `NanoBEIR_mean`
- Evaluated with `NanoBEIREvaluator` with these parameters:
  `{"dataset_names": ["msmarco", "nq"], "dataset_id": "lightonai/NanoBEIR-en"}`
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.3 |
| cosine_accuracy@3 | 0.54 |
| cosine_accuracy@5 | 0.59 |
| cosine_accuracy@10 | 0.66 |
| cosine_precision@1 | 0.3 |
| cosine_precision@3 | 0.18 |
| cosine_precision@5 | 0.118 |
| cosine_precision@10 | 0.066 |
| cosine_recall@1 | 0.29 |
| cosine_recall@3 | 0.515 |
| cosine_recall@5 | 0.565 |
| cosine_recall@10 | 0.63 |
| cosine_ndcg@10 | 0.4695 |
| cosine_mrr@10 | 0.4294 |
| cosine_map@100 | 0.4214 |
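The NanoBEIR mean scores are simply the unweighted average of the two per-dataset results, which can be checked directly:

```python
# NanoBEIR_mean is the unweighted average of the per-dataset scores.
msmarco = {"ndcg@10": 0.4281391945817123, "mrr@10": 0.3795238095238095}
nq      = {"ndcg@10": 0.5108521344166539, "mrr@10": 0.4791904761904762}

mean = {k: (msmarco[k] + nq[k]) / 2 for k in msmarco}
print(round(mean["ndcg@10"], 4))  # 0.4695, matching the table above
print(round(mean["mrr@10"], 4))   # 0.4294
```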
Training Details
Training Dataset
Unnamed Dataset
- Size: 21,470 training samples
- Columns: `anchor`, `positive`, and `negative`
- Approximate statistics based on the first 1000 samples:
  |  | anchor | positive | negative |
  |---|---|---|---|
  | type | string | string | string |
  | details | min: 5 tokens, mean: 19.91 tokens, max: 101 tokens | min: 5 tokens, mean: 19.91 tokens, max: 101 tokens | min: 4 tokens, mean: 19.91 tokens, max: 101 tokens |
- Samples:

  | anchor | positive | negative |
  |---|---|---|
  | The pale coloration provides camouflage for the beetle on the light sand. | The pale coloration provides camouflage for the beetle on the light sand. | The pale coloration helps the beetle stand out on the light sand. |
  | It is found from Fennoscandinavia to the Pyrenees , Italy and Greece and from Britain to Russia and Ukraine . | It is found from Fennoscandinavia to the Pyrenees , Italy and Greece and from Britain to Russia and Ukraine . | It is located from Fennoscandinavia to the Pyrenees , Great Britain and Greece and from Italy to Russia and Ukraine . |
  | Is Swami Vivekananda's speech at parliament of world's religions, Chicago overrated in Chicago? | Is Swami Vivekananda's speech at parliament of world's religions, Chicago overrated in Chicago? | Is Swami Vivekananda's speech at parliament of world's religions, Chicago overrated outside Chicago? |

- Loss: `MultipleNegativesRankingLoss` with these parameters:
  `{"scale": 7.0, "similarity_fct": "cos_sim", "gather_across_devices": false}`
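MultipleNegativesRankingLoss scores each anchor against every candidate in the batch, scales the cosine similarities (here by 7.0), and applies cross-entropy with the anchor's own positive as the target. A self-contained sketch for a single anchor, with hypothetical similarity scores:

```python
import math

def mnrl(scores, target, scale=7.0):
    """Cross-entropy over scaled similarity scores for one anchor.
    scores[target] is the anchor's positive; all other entries are negatives."""
    logits = [scale * s for s in scores]
    log_z = math.log(sum(math.exp(l) for l in logits))
    return log_z - logits[target]

# Hypothetical cosine similarities of one anchor to [its positive, two negatives]
loss_good = mnrl([0.9, 0.2, 0.1], target=0)  # positive clearly ranked first
loss_bad  = mnrl([0.3, 0.8, 0.1], target=0)  # a negative outranks the positive
print(round(loss_good, 4), round(loss_bad, 4))
```

The loss is small when the positive dominates and large when a negative outscores it, which is exactly what pushes paraphrases together and hard negatives apart during training.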
Evaluation Dataset
Unnamed Dataset
- Size: 2,386 evaluation samples
- Columns: `anchor`, `positive`, and `negative`
- Approximate statistics based on the first 1000 samples:
  |  | anchor | positive | negative |
  |---|---|---|---|
  | type | string | string | string |
  | details | min: 6 tokens, mean: 19.42 tokens, max: 74 tokens | min: 6 tokens, mean: 19.42 tokens, max: 74 tokens | min: 6 tokens, mean: 19.41 tokens, max: 74 tokens |
- Samples:

  | anchor | positive | negative |
  |---|---|---|
  | He died at Fort Edward on August 18 , 1861 , and was buried at the Union Cemetery in Sandy Hill . | He died at Fort Edward on August 18 , 1861 , and was buried at the Union Cemetery in Sandy Hill . | He died at Sandy Hill on August 18 , 1861 , and was buried at the Union Cemetery in Fort Edward . |
  | It was this cooperation which led to the development of the satellite AIS system. | It was this cooperation which led to the development of the satellite AIS system. | It was this cooperation which led to the halting of development of the satellite AIS system. |
  | What is the best field of engineering on campus? | What is the best field of engineering on campus? | What is the best field of engineering off campus? |

- Loss: `MultipleNegativesRankingLoss` with these parameters:
  `{"scale": 7.0, "similarity_fct": "cos_sim", "gather_across_devices": false}`
Training Hyperparameters
Non-Default Hyperparameters
- `eval_strategy: steps`
- `per_device_train_batch_size: 128`
- `per_device_eval_batch_size: 128`
- `learning_rate: 1e-06`
- `weight_decay: 0.001`
- `max_steps: 3000`
- `warmup_ratio: 0.1`
- `fp16: True`
- `dataloader_drop_last: True`
- `dataloader_num_workers: 1`
- `dataloader_prefetch_factor: 1`
- `load_best_model_at_end: True`
- `optim: adamw_torch`
- `ddp_find_unused_parameters: False`
- `push_to_hub: True`
- `hub_model_id: redis/unified-negatives`
- `eval_on_start: True`
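For reference, the non-default values above could be assembled into a trainer config roughly as follows (a sketch only; `output_dir` is a placeholder not taken from this card):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # placeholder, not from the card
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    learning_rate=1e-6,
    weight_decay=0.001,
    max_steps=3000,
    warmup_ratio=0.1,
    fp16=True,
    dataloader_drop_last=True,
    dataloader_num_workers=1,
    dataloader_prefetch_factor=1,
    load_best_model_at_end=True,
    optim="adamw_torch",
    ddp_find_unused_parameters=False,
    push_to_hub=True,
    hub_model_id="redis/unified-negatives",
    eval_on_start=True,
)
```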
All Hyperparameters
Click to expand
- `overwrite_output_dir: False`
- `do_predict: False`
- `eval_strategy: steps`
- `prediction_loss_only: True`
- `per_device_train_batch_size: 128`
- `per_device_eval_batch_size: 128`
- `per_gpu_train_batch_size: None`
- `per_gpu_eval_batch_size: None`
- `gradient_accumulation_steps: 1`
- `eval_accumulation_steps: None`
- `torch_empty_cache_steps: None`
- `learning_rate: 1e-06`
- `weight_decay: 0.001`
- `adam_beta1: 0.9`
- `adam_beta2: 0.999`
- `adam_epsilon: 1e-08`
- `max_grad_norm: 1.0`
- `num_train_epochs: 3.0`
- `max_steps: 3000`
- `lr_scheduler_type: linear`
- `lr_scheduler_kwargs: {}`
- `warmup_ratio: 0.1`
- `warmup_steps: 0`
- `log_level: passive`
- `log_level_replica: warning`
- `log_on_each_node: True`
- `logging_nan_inf_filter: True`
- `save_safetensors: True`
- `save_on_each_node: False`
- `save_only_model: False`
- `restore_callback_states_from_checkpoint: False`
- `no_cuda: False`
- `use_cpu: False`
- `use_mps_device: False`
- `seed: 42`
- `data_seed: None`
- `jit_mode_eval: False`
- `bf16: False`
- `fp16: True`
- `fp16_opt_level: O1`
- `half_precision_backend: auto`
- `bf16_full_eval: False`
- `fp16_full_eval: False`
- `tf32: None`
- `local_rank: 0`
- `ddp_backend: None`
- `tpu_num_cores: None`
- `tpu_metrics_debug: False`
- `debug: []`
- `dataloader_drop_last: True`
- `dataloader_num_workers: 1`
- `dataloader_prefetch_factor: 1`
- `past_index: -1`
- `disable_tqdm: False`
- `remove_unused_columns: True`
- `label_names: None`
- `load_best_model_at_end: True`
- `ignore_data_skip: False`
- `fsdp: []`
- `fsdp_min_num_params: 0`
- `fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}`
- `fsdp_transformer_layer_cls_to_wrap: None`
- `accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}`
- `parallelism_config: None`
- `deepspeed: None`
- `label_smoothing_factor: 0.0`
- `optim: adamw_torch`
- `optim_args: None`
- `adafactor: False`
- `group_by_length: False`
- `length_column_name: length`
- `project: huggingface`
- `trackio_space_id: trackio`
- `ddp_find_unused_parameters: False`
- `ddp_bucket_cap_mb: None`
- `ddp_broadcast_buffers: False`
- `dataloader_pin_memory: True`
- `dataloader_persistent_workers: False`
- `skip_memory_metrics: True`
- `use_legacy_prediction_loop: False`
- `push_to_hub: True`
- `resume_from_checkpoint: None`
- `hub_model_id: redis/unified-negatives`
- `hub_strategy: every_save`
- `hub_private_repo: None`
- `hub_always_push: False`
- `hub_revision: None`
- `gradient_checkpointing: False`
- `gradient_checkpointing_kwargs: None`
- `include_inputs_for_metrics: False`
- `include_for_metrics: []`
- `eval_do_concat_batches: True`
- `fp16_backend: auto`
- `push_to_hub_model_id: None`
- `push_to_hub_organization: None`
- `mp_parameters:`
- `auto_find_batch_size: False`
- `full_determinism: False`
- `torchdynamo: None`
- `ray_scope: last`
- `ddp_timeout: 1800`
- `torch_compile: False`
- `torch_compile_backend: None`
- `torch_compile_mode: None`
- `include_tokens_per_second: False`
- `include_num_input_tokens_seen: no`
- `neftune_noise_alpha: None`
- `optim_target_modules: None`
- `batch_eval_metrics: False`
- `eval_on_start: True`
- `use_liger_kernel: False`
- `liger_kernel_config: None`
- `eval_use_gather_object: False`
- `average_tokens_across_devices: True`
- `prompts: None`
- `batch_sampler: batch_sampler`
- `multi_dataset_batch_sampler: proportional`
- `router_mapping: {}`
- `learning_rate_mapping: {}`
Training Logs
| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
|---|---|---|---|---|---|---|
| 0 | 0 | - | 3.6734 | 0.6259 | 0.6583 | 0.6421 |
| 1.4970 | 250 | 3.8677 | 3.3900 | 0.6334 | 0.6510 | 0.6422 |
| 2.9940 | 500 | 3.188 | 1.8654 | 0.5772 | 0.6252 | 0.6012 |
| 4.4910 | 750 | 1.4714 | 0.6890 | 0.4032 | 0.5437 | 0.4735 |
| 5.9880 | 1000 | 0.8535 | 0.5511 | 0.3617 | 0.5197 | 0.4407 |
| 7.4850 | 1250 | 0.7547 | 0.5268 | 0.3469 | 0.5346 | 0.4407 |
| 8.9820 | 1500 | 0.716 | 0.5123 | 0.3684 | 0.5223 | 0.4454 |
| 10.4790 | 1750 | 0.6939 | 0.5039 | 0.3846 | 0.5179 | 0.4512 |
| 11.9760 | 2000 | 0.6789 | 0.4986 | 0.4120 | 0.5280 | 0.4700 |
| 13.4731 | 2250 | 0.6681 | 0.4953 | 0.4148 | 0.5189 | 0.4669 |
| 14.9701 | 2500 | 0.662 | 0.4918 | 0.4224 | 0.5109 | 0.4666 |
| 16.4671 | 2750 | 0.6575 | 0.4905 | 0.4224 | 0.5109 | 0.4666 |
| 17.9641 | 3000 | 0.6555 | 0.4900 | 0.4281 | 0.5109 | 0.4695 |
Framework Versions
- Python: 3.10.18
- Sentence Transformers: 5.2.0
- Transformers: 4.57.3
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 2.21.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```