metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:9712
  - loss:TripletLoss
base_model: Syldehayem/all-MiniLM-L12-v2_embedder_train
widget:
  - source_sentence: CGI 3D Animated Short "The Scarf" - by Team The Scarf
    sentences:
      - >-
        CGI 3D Short: "Lenovo Legion: Turning Point" - by Audis Huang &
        Moonshine Animation | TheCGBros
      - 'CGI Animated Trailers : "Dropzone" - by RealtimeUK'
      - 'CGI 3D Animated Short: "SOLVIVAL" - by Pixelhunters | TheCGBros'
  - source_sentence: >-
      CGI Animated Short Film HD "Terazia's Zoo " by Alison Dulou & Estelle
      Lefebvre | CGMeetup
    sentences:
      - A comedian puppet decides to branch out on his own / You're The Puppet
      - Horror Short Film Series “The Outer Darkness” Part 1 | ALTER
      - ERNIE | Omeleto
  - source_sentence: >-
      Kenneth Branagh in the thriller "Schneider's 2nd Stage" - Short film by
      Phil Stoole
    sentences:
      - 'CGI 3D Animated Short Film: "Fish in LOVE" by ISArt Digital |  @CGMeetup'
      - Cookies By The Fire Short Horror Film | Screamfest | Merry Christmas
      - >-
        CGI 3D Animated Spot: "Mantse Palm Wine" - by Arnold Bannerman |
        TheCGBros
  - source_sentence: The Portrait
    sentences:
      - >-
        A teenage girl must quickly adapt to a radically different urban
        environment | Barrio Frontera
      - Queen of Meatloaf | Short film tease
      - 'CGI 3D Tutorial : "Using Zapplink in Zbrush" - by 3dmotive'
  - source_sentence: Horror Short Film "Nice to Finally Meet You" | ALTER | Online Premiere
    sentences:
      - 'Mondays: The Spielberg Challenge Winner!'
      - 'The Curse of Pandora''s Box Returns to #UniversalHHN 2021'
      - SONS OF APRIL | Omeleto
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on Syldehayem/all-MiniLM-L12-v2_embedder_train

This is a sentence-transformers model finetuned from Syldehayem/all-MiniLM-L12-v2_embedder_train. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

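Model Description

  • Model Type: Sentence Transformer
  • Base model: Syldehayem/all-MiniLM-L12-v2_embedder_train
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions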

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
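
The Pooling and Normalize() modules listed above can be illustrated with a small standard-library sketch. It is not the library's implementation: the function names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions made for illustration, and the real model works on 384-dimensional token embeddings.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input: 3 token positions, 4 dimensions (the real model uses 384).
# The last position is padding and is excluded by the mask.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # mean over the two real tokens
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
```

Because the last step rescales every sentence vector to length 1, the dot product of two embeddings equals their cosine similarity.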

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Syldehayem/all-MiniLM-L12-v2_embedder_train")
# Run inference
sentences = [
    'Horror Short Film "Nice to Finally Meet You" | ALTER | Online Premiere',
    "The Curse of Pandora's Box Returns to #UniversalHHN 2021",
    'Mondays: The Spielberg Challenge Winner!',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
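
A similarity matrix like the one above is typically used to rank candidates against a query. The following sketch shows that ranking step with plain Python, assuming cosine similarity is the scoring function. The helper names `cosine_similarity` and `rank_candidates` and the 3-dimensional vectors are hypothetical stand-ins for the model's 384-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_candidates(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical 3-dimensional stand-ins for real embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.8, 0.6, 0.0], [-1.0, 0.0, 0.0]]

ranking = rank_candidates(query, candidates)
best_index, best_score = ranking[0]
```

In practice the vectors would come from `model.encode(...)`, and the index of the top-ranked candidate maps back to the original sentence list.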

Training Details

Training Dataset

Unnamed Dataset

  • Size: 9,712 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
                sentence_0      sentence_1      sentence_2
    type        string          string          string
    min         3 tokens        3 tokens        4 tokens
    mean        19.7 tokens     19.91 tokens    20.27 tokens
    max         49 tokens       49 tokens       50 tokens
  • Samples:
    sentence_0 sentence_1 sentence_2
    মেয়ে যখন মায়ের মতন Bidhilipi #Shorts
    A Sci-Fi Short Film: "Voltok" - by Jonathan Vleeschower TheCGBros CGI MoCap Demo : "Finger Mocap Without Any Post Animation" by the MocapLab
    LEAKY PIPES Taking care of a baby at 15 "Fifteen" - Short film by Sameh Alaa
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
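
For one triplet, the loss with these parameters is max(d(anchor, positive) - d(anchor, negative) + 5, 0), where d is the Euclidean distance. The sketch below works through the arithmetic under one assumption: that the Normalize() module from the architecture listing is applied during training, so every embedding has unit length and every distance lies between 0 and 2. The function names and the 2-dimensional vectors are illustrative only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)

# Unit-length toy vectors (2-D stand-ins for normalized embeddings).
anchor = [1.0, 0.0]
same = [1.0, 0.0]        # identical to the anchor, distance 0
opposite = [-1.0, 0.0]   # diametrically opposite, distance 2

best_case = triplet_loss(anchor, same, opposite)    # 0 - 2 + 5
no_separation = triplet_loss(anchor, same, same)    # 0 - 0 + 5
worst_case = triplet_loss(anchor, opposite, same)   # 2 - 0 + 5
```

Under that assumption the per-triplet loss stays between 3 and 7 and cannot reach zero with a margin of 5. A value near 5 corresponds to positives and negatives sitting at roughly the same distance from the anchor, and values below 5 indicate that positives are, on average, closer than negatives.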
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 50
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 50
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.8237 500 5.0075
1.6474 1000 4.9816
2.4712 1500 5.013
3.2949 2000 4.981
4.1186 2500 4.9981
4.9423 3000 4.9727
5.7661 3500 4.9698
6.5898 4000 4.9839
7.4135 4500 5.0001
8.2372 5000 4.9996
9.0610 5500 4.9993
9.8847 6000 4.9999
10.7084 6500 5.0015
11.5321 7000 4.9934
12.3558 7500 4.9903
13.1796 8000 4.9875
14.0033 8500 5.0018
14.8270 9000 5.0088
15.6507 9500 4.9643
16.4745 10000 4.9447
17.2982 10500 4.8911
18.1219 11000 4.8719
18.9456 11500 4.8671
19.7694 12000 4.8268
20.5931 12500 4.8195
21.4168 13000 4.7726
22.2405 13500 4.7479
23.0643 14000 4.7465
23.8880 14500 4.7776
24.7117 15000 4.7366
25.5354 15500 4.7076
26.3591 16000 4.74
27.1829 16500 4.7118
28.0066 17000 4.6797
28.8303 17500 4.7144
29.6540 18000 4.662
30.4778 18500 4.6849
31.3015 19000 4.6608
32.1252 19500 4.6844
32.9489 20000 4.6561
33.7727 20500 4.6513
34.5964 21000 4.6418
35.4201 21500 4.635
36.2438 22000 4.6418
37.0675 22500 4.62
37.8913 23000 4.615
38.7150 23500 4.6189
39.5387 24000 4.6113
40.3624 24500 4.6054
41.1862 25000 4.5824
42.0099 25500 4.5907
42.8336 26000 4.5949
43.6573 26500 4.5769
44.4811 27000 4.5758
45.3048 27500 4.5613
46.1285 28000 4.5816
46.9522 28500 4.5538
47.7759 29000 4.5645
48.5997 29500 4.5653
49.4234 30000 4.5494
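
The Epoch column can be reproduced from the dataset size and batch size reported in this card. The sketch below assumes training ran on a single device; the value of gradient_accumulation_steps (1) comes from the hyperparameter list. The helper `epoch_at` is a hypothetical name.

```python
import math

train_samples = 9712   # dataset size from the card
batch_size = 16        # per_device_train_batch_size
num_epochs = 50        # num_train_epochs

# Assumption: one device. 9712 / 16 divides evenly, so there is no partial batch.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

first_logged = epoch_at(500)     # matches the first row of the log
last_logged = epoch_at(30000)    # matches the last row of the log
```

With 607 steps per epoch, the full run is 30,350 steps, so the last logged row at step 30,000 falls shortly before the end of training.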

Framework Versions

  • Python: 3.12.9
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.7.0+cu126
  • Accelerate: 1.6.0
  • Datasets: 3.5.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}