all-MiniLM-L6-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0
  • Model Size: 22.7M parameters (F32)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
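
The three modules above read as a pipeline: encode tokens with BERT, mean-pool the token embeddings, then L2-normalize the sentence vector. A minimal numpy sketch of the pooling and normalization stages (the token embeddings below are random stand-ins for BERT outputs, not real model activations):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # Average token embeddings, counting only non-padding positions.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

def l2_normalize(x: np.ndarray) -> np.ndarray:
    # Scale each vector to unit length, as the Normalize() module does.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Toy batch: 2 sequences, 4 tokens each, 384-dim token embeddings;
# the first sequence has one padding position.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 384))
mask = np.array([[1, 1, 1, 0], [1, 1, 1, 1]])

sentence_embeddings = l2_normalize(mean_pool(tokens, mask))
print(sentence_embeddings.shape)  # (2, 384)
```

Mean pooling with the attention mask matters: averaging over padding tokens would dilute short sentences.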

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("harcor/all-MiniLM-L6-v2-electrical")
# Run inference
sentences = [
    'How can I solve a rubic cube 3x3x3?',
    'How can one solve Rubik’s cube 3×3×3?',
    "How do you solve a Rubik's Cube?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
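
Because the final Normalize() module scales every embedding to unit length, cosine similarity reduces to a plain dot product, which is all a semantic-search ranking needs. A small numpy sketch with random stand-in vectors (no model download required; real usage would substitute `model.encode(...)` outputs):

```python
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    # Unit-length vectors: dot product then equals cosine similarity.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(42)
corpus = normalize(rng.normal(size=(5, 384)))  # stand-ins for encoded documents
query = normalize(rng.normal(size=(1, 384)))   # stand-in for an encoded query

scores = (query @ corpus.T)[0]   # cosine similarities, shape (5,)
ranking = np.argsort(-scores)    # document indices, best match first
```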

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.9808
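
cosine_accuracy is the fraction of evaluation triplets whose anchor is more similar (by cosine) to the positive than to the negative. A sketch of that computation in numpy (the vectors below are synthetic, not model outputs):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return (a * b).sum(axis=-1)

def triplet_cosine_accuracy(anchors, positives, negatives) -> float:
    # A triplet counts as correct when the anchor is closer to the positive.
    return float(np.mean(cosine_sim(anchors, positives) > cosine_sim(anchors, negatives)))

rng = np.random.default_rng(0)
a = rng.normal(size=(8, 384))
pos = a + 0.1 * rng.normal(size=(8, 384))  # positives: small perturbations
neg = rng.normal(size=(8, 384))            # negatives: unrelated vectors
acc = triplet_cosine_accuracy(a, pos, neg)
print(acc)  # close to 1.0 for these easy synthetic triplets
```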

Training Details

Training Dataset

Unnamed Dataset

  • Size: 100,029 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min: 6 tokens, mean: 14.15 tokens, max: 42 tokens
    • positive (string): min: 5 tokens, mean: 13.57 tokens, max: 44 tokens
    • negative (string): min: 4 tokens, mean: 14.63 tokens, max: 64 tokens
  • Samples:
    • anchor: Terminal Block, 230 A, 600 V, 3.43 in. H, 1.02 in. W, Gray
      positive: Terminal Block 230 A
      negative: Fuse Cap, 100 A
    • anchor: Terminal Block, 230 A, 600 V, 3.43 in. H, 1.02 in. W, Gray
      positive: Terminal Block 230 A terminal blocks
      negative: Fuse Cap, 100 A
    • anchor: Terminal Block, 230 A, 600 V, 3.43 in. H, 1.02 in. W, Gray
      positive: Terminal Block 230 A pass through terminal blocks
      negative: Fuse Cap, 100 A
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
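
With the Euclidean distance metric and a triplet_margin of 5, the loss penalizes any triplet whose negative is not at least 5 units farther from the anchor than the positive is. A minimal numpy sketch of that objective (the vectors are hand-picked illustrations, not real embeddings):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=5.0):
    # Euclidean distances from the anchor to the positive and negative.
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    # Zero loss once the negative is at least `margin` farther away.
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

a = np.zeros((1, 3))
p = np.zeros((1, 3))                 # positive coincides with the anchor
far = np.array([[10.0, 0.0, 0.0]])   # beyond the margin: no loss
near = np.array([[2.0, 0.0, 0.0]])   # inside the margin: loss = 5 - 2 = 3
print(triplet_loss(a, p, far), triplet_loss(a, p, near))  # 0.0 3.0
```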
    

Evaluation Dataset

Unnamed Dataset

  • Size: 100,000 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min: 6 tokens, mean: 13.85 tokens, max: 42 tokens
    • positive (string): min: 6 tokens, mean: 13.65 tokens, max: 44 tokens
    • negative (string): min: 4 tokens, mean: 14.76 tokens, max: 64 tokens
  • Samples:
    • anchor: Why in India do we not have one on one political debate as in USA?
      positive: Why cant we have a public debate between politicians in India like the one in US?
      negative: Can people on Quora stop India Pakistan debate? We are sick and tired seeing this everyday in bulk?
    • anchor: What is OnePlus One?
      positive: How is oneplus one?
      negative: Why is OnePlus One so good?
    • anchor: Does our mind control our emotions?
      positive: How do smart and successful people control their emotions?
      negative: How can I control my positive emotions for the people whom I love but they don't care about me?
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
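
These values map directly onto SentenceTransformerTrainingArguments. A hedged sketch of how a comparable fine-tuning run could be set up (the tiny inline dataset and `output_dir` are placeholders for illustration, not taken from this card):

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import TripletLoss, TripletDistanceMetric

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder dataset with the anchor/positive/negative columns described above.
train_dataset = Dataset.from_dict({
    "anchor": ["Terminal Block, 230 A, 600 V, 3.43 in. H, 1.02 in. W, Gray"],
    "positive": ["Terminal Block 230 A"],
    "negative": ["Fuse Cap, 100 A"],
})

# Triplet loss configured as in the Training Dataset section.
loss = TripletLoss(
    model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)

args = SentenceTransformerTrainingArguments(
    output_dir="out",  # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```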

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss all-MiniLM-L6-v2-electrical_cosine_accuracy
-1 -1 - 0.9808
0.0160 100 4.7269 -
0.0320 200 4.686 -
0.0480 300 4.6204 -
0.0640 400 4.5606 -
0.0800 500 4.5176 -
0.0960 600 4.4588 -
0.1120 700 4.417 -
0.1280 800 4.4144 -
0.1440 900 4.4004 -
0.1599 1000 4.3835 -
0.1759 1100 4.379 -
0.1919 1200 4.3828 -
0.2079 1300 4.3581 -
0.2239 1400 4.3502 -
0.2399 1500 4.3155 -
0.2559 1600 4.3204 -
0.2719 1700 4.3403 -
0.2879 1800 4.3195 -
0.3039 1900 4.2989 -
0.3199 2000 4.2871 -
0.3359 2100 4.2939 -
0.3519 2200 4.2906 -
0.3679 2300 4.2729 -
0.3839 2400 4.2765 -
0.3999 2500 4.2642 -
0.4159 2600 4.2629 -
0.4319 2700 4.276 -
0.4479 2800 4.276 -
0.4639 2900 4.2204 -
0.4798 3000 4.2556 -
0.4958 3100 4.2484 -
0.5118 3200 4.2004 -
0.5278 3300 4.2181 -
0.5438 3400 4.2097 -
0.5598 3500 4.2107 -
0.5758 3600 4.1949 -
0.5918 3700 4.2378 -
0.6078 3800 4.2098 -
0.6238 3900 4.196 -
0.6398 4000 4.1635 -
0.6558 4100 4.1946 -
0.6718 4200 4.1993 -
0.6878 4300 4.1971 -
0.7038 4400 4.2104 -
0.7198 4500 4.2174 -
0.7358 4600 4.1854 -
0.7518 4700 4.1834 -
0.7678 4800 4.1829 -
0.7837 4900 4.1831 -
0.7997 5000 4.1927 -
0.8157 5100 4.1746 -
0.8317 5200 4.1477 -
0.8477 5300 4.1748 -
0.8637 5400 4.1713 -
0.8797 5500 4.1313 -
0.8957 5600 4.1529 -
0.9117 5700 4.2078 -
0.9277 5800 4.1546 -
0.9437 5900 4.1684 -
0.9597 6000 4.1594 -
0.9757 6100 4.1426 -
0.9917 6200 4.1299 -

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.7.1+cu118
  • Accelerate: 1.8.1
  • Datasets: 3.6.0
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}