SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity (see the sketch below)
  • Model Size: 22.7M parameters (F32, safetensors)
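
Because the architecture ends in a Normalize() module (shown below), every embedding has unit L2 norm, so cosine similarity reduces to a plain dot product. A minimal illustrative sketch with hypothetical 384-dimensional vectors, not taken from the model card:

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Standard cosine similarity: dot product over the product of norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two hypothetical unit-normalized 384-dimensional embeddings.
rng = np.random.default_rng(0)
a = rng.normal(size=384); a /= np.linalg.norm(a)
b = rng.normal(size=384); b /= np.linalg.norm(b)

# For unit vectors, cosine similarity equals the plain dot product.
assert abs(cosine_similarity(a, b) - float(np.dot(a, b))) < 1e-9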

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
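
The Pooling module above is configured for mean pooling (pooling_mode_mean_tokens: True): token embeddings are averaged over the non-padding tokens before normalization. A minimal mask-aware sketch in plain PyTorch, illustrative rather than the library's internal code:

import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 384); attention_mask: (batch, seq_len) of 0/1.
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)                   # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)                        # number of real tokens
    return summed / counts                                          # per-sentence mean

tokens = torch.randn(2, 5, 384)
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
print(mean_pool(tokens, mask).shape)  # torch.Size([2, 384])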

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Syldehayem/all-MiniLM-L6-v2_embedder")
# Run inference
sentences = [
    '**Award Winning** CGI 3D Animated Short: "Monsters In The Dark" - by Apollonia Thomaier | TheCGBros',
    'Gajamukta - Bengali Full Movie | Moon Moon Sen | Abhishek Chatterjee | Soumitra Chatterjee',
    'Nayantara | নয়নতারা | Family Movie | Full HD | Saswata Chatterjee, Soumitra, Mamata Shankar',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
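
The same embeddings can drive a small semantic-search step: encode a query, score it against the corpus, and take the best match. A hedged follow-up to the snippet above; the query string is a made-up example:

# Hypothetical query; encode as a batch of one so shapes stay 2-D.
query_embedding = model.encode(["award winning animated short film"])
scores = model.similarity(query_embedding, embeddings)  # shape [1, 3]
best = scores.argmax().item()
print(sentences[best], float(scores[0, best]))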

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,712 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
                  sentence_0   sentence_1   sentence_2
    type          string       string       string
    min tokens    4            3            3
    mean tokens   19.73        20.14        20.23
    max tokens    69           49           66
  • Samples (sentence_0 | sentence_1 | sentence_2):
    D.A.D. (Sci-Fi Short Film) | Dad just got an upgrade | Preservation Clip
    WATCH Unknown Caller Short Film LINK BELOW #shorts | CGI VFX Short Spot : "Chalet" by - Counterfeit FX
    Pratibha প্রতিভা Bengali Romantic Movie
  • Loss: TripletLoss (see the sketch below) with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
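
With TripletDistanceMetric.EUCLIDEAN and triplet_margin: 5, the quantity being minimized is max(d(anchor, positive) - d(anchor, negative) + 5, 0). A minimal PyTorch sketch of that formula, not the library's internal implementation:

import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 5.0):
    # Euclidean distance variant, matching TripletDistanceMetric.EUCLIDEAN above.
    d_pos = F.pairwise_distance(anchor, positive, p=2)
    d_neg = F.pairwise_distance(anchor, negative, p=2)
    # Hinge: penalize whenever the negative is not at least `margin` farther away.
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = (torch.randn(4, 384) for _ in range(3))
print(triplet_loss(a, p, n))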
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 100
  • multi_dataset_batch_sampler: round_robin
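
A hedged sketch of how a comparable run could be configured with SentenceTransformerTrainer using these non-default values; the dataset below is a one-row placeholder (the real 9,712-sample training set is unnamed above), and output_dir is hypothetical:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import TripletLoss, TripletDistanceMetric

# Start from the same base model this card was finetuned from.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# One-row placeholder standing in for the unnamed triplet dataset;
# columns are consumed in order as (anchor, positive, negative).
train_dataset = Dataset.from_dict({
    "sentence_0": ["example anchor video title"],
    "sentence_1": ["a related video title"],
    "sentence_2": ["an unrelated video title"],
})

# Loss configuration matching the parameters listed above.
loss = TripletLoss(
    model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)

# Non-default hyperparameters from this card; output_dir is a made-up path.
args = SentenceTransformerTrainingArguments(
    output_dir="all-MiniLM-L6-v2_embedder",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=100,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()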

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 100
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.8237 500 5.0006
1.6474 1000 4.9915
2.4712 1500 4.96
3.2949 2000 4.9266
4.1186 2500 4.8689
4.9423 3000 4.8158
5.7661 3500 4.7408
6.5898 4000 4.702
7.4135 4500 4.6564
8.2372 5000 4.63
9.0610 5500 4.6119
9.8847 6000 4.5983
0.8237 500 4.6071
1.6474 1000 4.6401
2.4712 1500 4.6525
3.2949 2000 4.6101
4.1186 2500 4.5926
4.9423 3000 4.5827
5.7661 3500 4.5096
6.5898 4000 4.5171
7.4135 4500 4.507
8.2372 5000 4.4738
9.0610 5500 4.4973
9.8847 6000 4.4485
0.8237 500 4.4222
1.6474 1000 4.3984
2.4712 1500 4.4144
3.2949 2000 4.4117
4.1186 2500 4.4042
4.9423 3000 4.4136
5.7661 3500 4.4055
6.5898 4000 4.4267
7.4135 4500 4.4548
8.2372 5000 4.4443
9.0610 5500 4.4649
9.8847 6000 4.4463
10.7084 6500 4.4771
11.5321 7000 4.4691
12.3558 7500 4.4817
13.1796 8000 4.4505
14.0033 8500 4.4355
14.8270 9000 4.4145
15.6507 9500 4.4128
16.4745 10000 4.3874
17.2982 10500 4.4057
18.1219 11000 4.3841
18.9456 11500 4.3836
19.7694 12000 4.3554
20.5931 12500 4.3445
21.4168 13000 4.3351
22.2405 13500 4.3602
23.0643 14000 4.3366
23.8880 14500 4.3302
24.7117 15000 4.3531
25.5354 15500 4.3002
26.3591 16000 4.3499
27.1829 16500 4.3049
28.0066 17000 4.3039
28.8303 17500 4.3045
29.6540 18000 4.2969
30.4778 18500 4.2831
31.3015 19000 4.2999
32.1252 19500 4.3037
32.9489 20000 4.2768
33.7727 20500 4.2928
34.5964 21000 4.2697
35.4201 21500 4.2985
36.2438 22000 4.2799
37.0675 22500 4.286
37.8913 23000 4.2671
38.7150 23500 4.2775
39.5387 24000 4.2872
40.3624 24500 4.2687
41.1862 25000 4.2555
42.0099 25500 4.2661
42.8336 26000 4.2737
43.6573 26500 4.2476
44.4811 27000 4.2347
45.3048 27500 4.2381
46.1285 28000 4.2533
46.9522 28500 4.2295
47.7759 29000 4.2346
48.5997 29500 4.2411
49.4234 30000 4.2347
50.2471 30500 4.232
51.0708 31000 4.2409
51.8946 31500 4.2219
52.7183 32000 4.2284
53.5420 32500 4.2396
54.3657 33000 4.2199
55.1895 33500 4.2198
56.0132 34000 4.1958
56.8369 34500 4.2034
57.6606 35000 4.1931
58.4843 35500 4.2292
59.3081 36000 4.197
60.1318 36500 4.2365
60.9555 37000 4.1939
61.7792 37500 4.2045
62.6030 38000 4.2037
63.4267 38500 4.2007
64.2504 39000 4.2025
65.0741 39500 4.1846
65.8979 40000 4.1812
66.7216 40500 4.2022
67.5453 41000 4.1955
68.3690 41500 4.1834
69.1928 42000 4.1838
70.0165 42500 4.1908
70.8402 43000 4.1821
71.6639 43500 4.1636
72.4876 44000 4.1868
73.3114 44500 4.1737
74.1351 45000 4.1802
74.9588 45500 4.1744
75.7825 46000 4.1688
76.6063 46500 4.1664
77.4300 47000 4.1627
78.2537 47500 4.1561
79.0774 48000 4.1699
79.9012 48500 4.1679
80.7249 49000 4.1579
81.5486 49500 4.1502
82.3723 50000 4.1613
83.1960 50500 4.1342
84.0198 51000 4.1659
84.8435 51500 4.1484
85.6672 52000 4.1563
86.4909 52500 4.1551
87.3147 53000 4.1519
88.1384 53500 4.1486
88.9621 54000 4.1532
89.7858 54500 4.1506
90.6096 55000 4.1397
91.4333 55500 4.1589
92.2570 56000 4.1213
93.0807 56500 4.1466
93.9044 57000 4.1496
94.7282 57500 4.1416
95.5519 58000 4.1427
96.3756 58500 4.133
97.1993 59000 4.1505
98.0231 59500 4.1342
98.8468 60000 4.133
99.6705 60500 4.151

Framework Versions

  • Python: 3.12.9
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu126
  • Accelerate: 1.6.0
  • Datasets: 3.5.1
  • Tokenizers: 0.21.1
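
To approximate this environment, the listed versions can be pinned as below (PyTorch 2.7.0 with the cu126 build is installed separately from the PyTorch index):

pip install sentence-transformers==4.1.0 transformers==4.51.3 accelerate==1.6.0 datasets==3.5.1 tokenizers==0.21.1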

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}