---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sparse-encoder
- sparse
- splade
- e-commerce
- product-search
- information-retrieval
- multi-domain
- dataset_size:99712
- loss:SpladeLoss
- loss:SparseMultipleNegativesRankingLoss
- loss:FlopsLoss
base_model: distilbert/distilbert-base-uncased
datasets:
- tasksource/esci
- wayfair/wands
widget:
- text: '[KIDS TOYLAND] Wooden Dessert Play Set for Kids, Pretend Play Food Sets for Birthday Party ,Great for 3, 4, 5, and 6 Year Olds Girls and Boys Wooden Pretend Play Food Desserts Set,Wood Dessert Tower and Cakes,Educational Play Food Toys for 2 years old kids Birthday Gift

Packing Includ:
cake stand *1 chocolates and cakes*12

Pretend Play Wooden Food Set Features:
This high-quality wooden toy is designed for kids three and up, can be used as educational toys for shape matching, counting and concepts of reconstruction.

1. size: 9.17*9.17*2.2 inch, this beautifully decorated multi shaped c'
- text: mathematical compass
- text: '[NYX PROFESSIONAL MAKEUP] NYX PROFESSIONAL MAKEUP Lip Lingerie Matte Liquid Lipstick - Beauty Mark, Chocolate Brown'
- text: '[Aladdin] Mrs. Frisby and the Rats of NIMH'
- text: '[Office Chairs] ginata salon beauty drafting chair'
pipeline_tag: feature-extraction
library_name: sentence-transformers
---

# SPLADE Multi-Domain E-Commerce Search

A SPLADE sparse encoder fine-tuned on multiple e-commerce datasets (Amazon ESCI + Wayfair WANDS + Home Depot) for better cross-domain generalization. Compared with a single-domain model, it trades a small drop in in-domain performance (-4% on ESCI) for gains on the other two domains (+3% on WANDS, +7% on Home Depot).

## Benchmark Results

### Cross-Domain Performance (vs Single-Domain Model)

| Dataset | Single-Domain | **Multi-Domain** | Change |
|---------|---------------|------------------|--------|
| ESCI (in-domain) | 0.389 | 0.372 | -4% |
| WANDS (Wayfair) | 0.355 | **0.366** | +3% |
| Home Depot | 0.384 | **0.410** | +7% |

### vs BM25 Baseline

| Dataset | BM25 | **This Model** | Improvement |
|---------|------|----------------|-------------|
| ESCI | 0.305 | 0.372 | +22% |
| WANDS | 0.329 | 0.366 | +11% |
| Home Depot | 0.349 | 0.410 | +17% |

## Model Description

This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model fine-tuned from [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences and paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

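To make the sparse retrieval use case more concrete, the sketch below ranks documents against a query by taking a dot product over sparse vectors. It is a toy illustration under stated assumptions, not this model's or the library's implementation: the `SparseVec` type, the `sparse_dot` and `rank` helpers, and all token ids and weights are hypothetical. In practice the vectors would come from the encoder.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: token id -> weight, storing only non-zero entries.
SparseVec = Dict[int, float]


def sparse_dot(a: SparseVec, b: SparseVec) -> float:
    """Dot product restricted to the token ids both vectors share."""
    # Iterate over the smaller vector to keep the number of lookups low.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[token] for token, weight in a.items() if token in b)


def rank(query: SparseVec, docs: Dict[str, SparseVec]) -> List[Tuple[str, float]]:
    """Score every document against the query, highest score first."""
    scored = [(doc_id, sparse_dot(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up token ids and weights, chosen only for illustration.
query = {101: 1.5, 2057: 0.5}
docs = {
    "earbuds": {101: 2.0, 2057: 1.0, 3000: 0.3},
    "sweatpants": {2057: 0.5, 4000: 2.0},
}

ranking = rank(query, docs)
print(ranking)
```

Because only overlapping non-zero entries contribute to the score, sparser vectors mean fewer multiplications per query-document pair, which is what makes this kind of representation a fit for inverted-index style retrieval.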
## Model Details

### Model Description

- **Model Type:** SPLADE Sparse Encoder
- **Base model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 30522 dimensions
- **Similarity Function:** Dot Product

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)

### Full Model Architecture

```
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'DistilBertForMaskedLM'})
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("sparse_encoder_model_id")
# Run inference
sentences = [
    'mpow',
    '[Mpow] Wireless Earbuds Active Noise Cancelling, Mpow X3 ANC Bluetooth Earphones w/4 Mics Noise Cancelling, Stereo Earbuds w/Deep Bass, 30Hrs ANC Earbuds w/USB-C Charge, Smart Touch Control, IPX8 Waterproof',
    '[Jerzees] Jerzees Dri-Power Poly Pocketed Open-Bottom Sweatpants, Large - Black 100% Polyester Pre-shrunk Jersey',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 69.1663,  66.0022,  51.6937],
#         [ 66.0022, 238.3157,  60.5486],
#         [ 51.6937,  60.5486, 174.3004]])
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 99,712 training samples
* Columns: `anchor` and `positive`
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details |        |          |

* Samples:

  | anchor | positive |
  |:-------|:---------|
  | bird feeder pole station | [EXCMARK] EXCMARK 2 Pack Shepherd Hook 32 inch 1/2 inch Thick Use at Weddings, Hanging Solar Lights, Lanterns, Bird Feeders, Metal Hanger Hook (Bronze, 32 inch) Create the garden of your dreams with our Shepherds Hooks! These amazing hooks with the perfect balance of tradition and versatility are the perfect accessory to any outdoor space! A super easy and convenient way to tackle any outdoor gardening party or event! It will make any hanging object stand out with ultimate beauty. Hang your decorative lights, bird feeders, lanterns, and more! Each hook includes 2 extenders for three height options. The hooks can measure up to 32” |
  | chrome bath lighting | Progress Lighting Archie Collection 2-Light Chrome Bath Light Archie is a standout in any room and provides a fun and fashionable way to light your home. The authentic, prismatic style glass shade diffuses light to provide functional and stylish illumination. This fixture can be installed with the glass facing up or down to suit your preference.California residents: see Proposition 65 informationChrome finishClear prismatic glass17 in. W x 8-3/4 in. HUses (2) 100-Watt medium base bulbs (not included)Fixture can be installed facing upwards or downwards |
  | sex toys kinky for female | [Knaughty Knickers] Knaughty Knickers Daddys Little Lil Fuck Toy Fucktoy DDLG BDSM Owned Boyshort Black 95% combed and ringspun cotton/5% spandex --- Low rise shortie boyshort style panty --- Satin trim fold over elastic waistband --- Custom embelished on quality Bella product --- Super soft and comfortable --- Funny or rude underwear |

* Loss: [SpladeLoss](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:

  ```json
  {
      "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score', gather_across_devices=False)",
      "document_regularizer_weight": 3e-05,
      "query_regularizer_weight": 5e-05
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 32
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates
- `router_mapping`: {'anchor': 'query', 'positive': 'document'}

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {'anchor': 'query', 'positive': 'document'}
- `learning_rate_mapping`: {}

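
As a rough sanity check, the hyperparameters above can be turned into a step count and a learning-rate schedule. The sketch below is an approximation under explicit assumptions: a single device, every sample used exactly once per epoch (the `no_duplicates` batch sampler may change the real batch count), warmup steps rounded up from `warmup_ratio`, and a linear schedule that decays to zero. The helper `lr_at` is hypothetical and is not part of any library.

```python
import math

# Values taken from the hyperparameter lists above.
num_samples = 99_712
batch_size = 32          # per_device_train_batch_size
grad_accum = 1           # gradient_accumulation_steps
num_epochs = 1           # num_train_epochs
warmup_ratio = 0.1
peak_lr = 2e-05          # learning_rate

# Assumption: one device, so the effective batch size is batch_size * grad_accum.
steps_per_epoch = math.ceil(num_samples / (batch_size * grad_accum))
total_steps = steps_per_epoch * num_epochs

# Assumption: warmup steps are the ratio of total steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
print(round(3100 / steps_per_epoch, 4))  # epoch fraction at step 3100
print(lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions one epoch is 3116 steps, which agrees with the training log reporting step 3100 at epoch 0.9949.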
### Training Logs

| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.0321 | 100  | 329.7303      |
| 0.0642 | 200  | 1.9189        |
| 0.0963 | 300  | 0.4059        |
| 0.1284 | 400  | 0.3173        |
| 0.1605 | 500  | 0.2776        |
| 0.1926 | 600  | 0.2812        |
| 0.2246 | 700  | 0.2648        |
| 0.2567 | 800  | 0.2821        |
| 0.2888 | 900  | 0.254         |
| 0.3209 | 1000 | 0.2789        |
| 0.3530 | 1100 | 0.2163        |
| 0.3851 | 1200 | 0.2375        |
| 0.4172 | 1300 | 0.2165        |
| 0.4493 | 1400 | 0.2254        |
| 0.4814 | 1500 | 0.2105        |
| 0.5135 | 1600 | 0.2147        |
| 0.5456 | 1700 | 0.2468        |
| 0.5777 | 1800 | 0.2438        |
| 0.6098 | 1900 | 0.209         |
| 0.6418 | 2000 | 0.2327        |
| 0.6739 | 2100 | 0.2475        |
| 0.7060 | 2200 | 0.227         |
| 0.7381 | 2300 | 0.1992        |
| 0.7702 | 2400 | 0.2258        |
| 0.8023 | 2500 | 0.1676        |
| 0.8344 | 2600 | 0.2081        |
| 0.8665 | 2700 | 0.1966        |
| 0.8986 | 2800 | 0.218         |
| 0.9307 | 2900 | 0.1998        |
| 0.9628 | 3000 | 0.2157        |
| 0.9949 | 3100 | 0.2011        |

### Framework Versions

- Python: 3.11.10
- Sentence Transformers: 5.2.0
- Transformers: 4.57.3
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 4.4.1
- Tokenizers: 0.22.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### SpladeLoss

```bibtex
@misc{formal2022distillationhardnegativesampling,
    title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
    author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
    year={2022},
    eprint={2205.04733},
    archivePrefix={arXiv},
    primaryClass={cs.IR},
    url={https://arxiv.org/abs/2205.04733},
}
```

#### SparseMultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

#### FlopsLoss

```bibtex
@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}
```