tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:269337
  - loss:CoSENTLoss
base_model: intfloat/multilingual-e5-large
widget:
  - source_sentence: motion-activated security light with adjustable settings
    sentences:
      - >-
        LED Black Motion Sensor 2-Light Bullet Flood Light- 3000K Adjustable
        Dual Head Outdoor Security Light, Dusk to Dawn, Waterproof, Hardwired
        Spotlight for Yard, Patio, Garage, Landscape
      - Waterpik Cordless Advanced Water Flosser
      - >-
        Tabi Ballet Flats Shoes for Women Rounde Toe Wide Width Split Toe Low
        Heel Comfortable Flats Shoes
  - source_sentence: microdevice for line smoothing
    sentences:
      - SkinMedica TNS Advanced+ Serum
      - >-
        Waterproof Beach Bag for Women with Phone Pouch, Large Tote Bag for
        Pool, Travel and Vacation
      - Fisher-Price 4-in-1 Step 'n Play Piano
  - source_sentence: hair strengthening serum
    sentences:
      - >-
        Yaheetech Adjustable Dumbbell Set Free Weight Dumbbells
        40lbs/52.5lbs/90lbs Fast Adjust Dumbbells Dumbbell Weight Set, with Tray
        for Men/Women Strength Training Equipment
      - DeLonghi Dedica Arte Espresso Machine
      - Opalescence Go Teeth Whitening Trays
  - source_sentence: slime making kit with glue and additives
    sentences:
      - Faber-Castell Polychromos Color Pencils Set of 120
      - >-
        Keter Delivery Box for Porch with Lockable Secure Storage Compartment to
        Keep Packages Safe, One Size, Brown
      - Stillman & Birn Zeta Series Sketchbook
  - source_sentence: antioxidant serum for skin protection
    sentences:
      - Louisville Ladder 16-foot Fiberglass Extension Ladder
      - Crayola Light Up Tracing Pad
      - Logitech MX Master 3S Wireless Mouse
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on intfloat/multilingual-e5-large

This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-large. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-large
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
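As a rough illustration of what the Pooling (mean over tokens) and Normalize modules listed above compute, here is a minimal plain-Python sketch. The function names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are assumptions made for illustration; the real modules operate on batched tensors of 1024-dimensional token embeddings.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, counting only positions where the mask is 1."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                sums[i] += value
    return [s / max(count, 1) for s in sums]

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: three token vectors, the last one is padding (mask = 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # average of the two unmasked vectors
embedding = l2_normalize(pooled)      # unit-length sentence embedding
```

Because the final vectors have unit length, the cosine similarity between two embeddings equals their dot product.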

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("IshTale/MultiEccomerceEmbeddingModel")
# Run inference
sentences = [
    'antioxidant serum for skin protection',
    'Louisville Ladder 16-foot Fiberglass Extension Ladder',
    'Logitech MX Master 3S Wireless Mouse',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.4999, 0.4880],
#         [0.4999, 1.0000, 0.6445],
#         [0.4880, 0.6445, 1.0000]])
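For a search-style use case, the similarity scores can be turned into a ranking of candidate product titles. The sketch below is a minimal example under stated assumptions: `rank_candidates` is a hypothetical helper, and the scores are copied from the first row of the similarity matrix printed above (the query against the two product titles) rather than computed here.

```python
def rank_candidates(candidates, scores):
    """Return (title, score) pairs sorted from most to least similar."""
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

candidates = [
    "Louisville Ladder 16-foot Fiberglass Extension Ladder",
    "Logitech MX Master 3S Wireless Mouse",
]
# Query: 'antioxidant serum for skin protection'; scores taken from the output above.
scores = [0.4999, 0.4880]

ranked = rank_candidates(candidates, scores)
for title, score in ranked:
    print(f"{score:.4f}  {title}")
# 0.4999  Louisville Ladder 16-foot Fiberglass Extension Ladder
# 0.4880  Logitech MX Master 3S Wireless Mouse
```

In practice the scores would come from `model.similarity(query_embedding, candidate_embeddings)`, with the query and candidates encoded separately.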

Training Details

Training Dataset

Unnamed Dataset

  • Size: 269,337 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 5 tokens, mean 11.2 tokens, max 23 tokens
    • sentence_1 (string): min 3 tokens, mean 24.29 tokens, max 66 tokens
    • label (float): min -1.0, mean 0.05, max 0.99
  • Samples:
    sentence_0 | sentence_1 | label
    motorized Nerf blaster with dinosaur theme | B. Toys by Battat Wooden Activity Cube | -0.07861651138439901
    smart mirror with adjustable lighting | Pfaff Passport 2.0 Sewing Machine | -0.835469516572358
    black tea with orange rind and spices | Valrhona Cocoa Powder | -0.13135949520666002
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
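To make the role of the `scale` parameter concrete, here is a minimal plain-Python sketch of the CoSENT objective as it is commonly described: every pair of training examples whose labels are ordered contributes a term that grows when the lower-labeled example has the higher cosine similarity. The function name `cosent_loss` and the toy inputs are assumptions for illustration; the library version works on batched tensors.

```python
import math

def cosent_loss(cos_sims, labels, scale=20.0):
    """log(1 + sum over (i, j) with labels[i] < labels[j] of exp(scale * (s_i - s_j)))."""
    terms = [0.0]  # the constant 1 inside the log, written as exp(0)
    for s_i, y_i in zip(cos_sims, labels):
        for s_j, y_j in zip(cos_sims, labels):
            if y_i < y_j:
                # Penalize the lower-labeled pair scoring above the higher-labeled one.
                terms.append(scale * (s_i - s_j))
    # Numerically stable log-sum-exp.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Similarities that agree with the label order give a loss close to zero.
well_ordered = cosent_loss([0.9, 0.1], [1.0, 0.0])
# Similarities that contradict the label order are penalized heavily.
mis_ordered = cosent_loss([0.1, 0.9], [1.0, 0.0])
```

If all 32 examples in a batch had distinct labels and identical similarities, this expression would evaluate to log(1 + 496) ≈ 6.21, which gives a rough reference point for reading loss values under those assumptions.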
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
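The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` listed above implies a learning rate that decays linearly from its initial value to zero over training. The sketch below illustrates that shape; `linear_lr` is a hypothetical helper, and the total of 8417 optimizer steps is an assumption (269,337 samples at batch size 32 on a single device with no gradient accumulation).

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 8417  # assumed: ceil(269337 / 32), single device

lr_start = linear_lr(0, TOTAL_STEPS)            # full learning rate at step 0
lr_end = linear_lr(TOTAL_STEPS, TOTAL_STEPS)    # decayed to zero at the last step
lr_mid = linear_lr(TOTAL_STEPS // 2, TOTAL_STEPS)
```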

Training Logs

Epoch Step Training Loss
0.0594 500 5.6346
0.1188 1000 5.5107
0.1782 1500 5.4706
0.2376 2000 5.4402
0.2970 2500 5.4039
0.3564 3000 5.4252
0.4158 3500 5.3693
0.4752 4000 5.3776
0.5346 4500 5.3672
0.5940 5000 5.4059
0.6534 5500 5.336
0.7128 6000 5.3467
0.7722 6500 5.3086
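The Epoch column can be cross-checked against the dataset size and batch size reported earlier. The arithmetic below is a sketch that assumes a single device and no gradient accumulation (the card lists `gradient_accumulation_steps: 1`, but the device count is not stated); `epoch_at` is a hypothetical helper.

```python
import math

DATASET_SIZE = 269_337   # training samples, from the card
BATCH_SIZE = 32          # per_device_train_batch_size, from the card

# Number of optimizer steps in one epoch (the last batch is partial, not dropped).
steps_per_epoch = math.ceil(DATASET_SIZE / BATCH_SIZE)

def epoch_at(step):
    """Fraction of the epoch completed after a given number of steps."""
    return round(step / steps_per_epoch, 4)

first_logged = epoch_at(500)
last_logged = epoch_at(6500)
```

Under these assumptions one epoch is 8417 steps, and the computed fractions for steps 500 and 6500 match the logged values 0.0594 and 0.7722.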

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.1
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}