SPLADE Multi-Domain E-Commerce Search
A SPLADE sparse encoder fine-tuned on three e-commerce datasets (Amazon ESCI, Wayfair WANDS, and Home Depot) for better cross-domain generalization. Compared with a single-domain model, it gives up about 4% on ESCI in exchange for gains of 3% on WANDS and 7% on Home Depot.
Benchmark Results
Cross-Domain Performance (vs Single-Domain Model)
| Dataset | Single-Domain | Multi-Domain | Improvement |
|---|---|---|---|
| ESCI (in-domain) | 0.389 | 0.372 | -4% |
| WANDS (Wayfair) | 0.355 | 0.366 | +3% |
| Home Depot | 0.384 | 0.410 | +7% |
vs BM25 Baseline
| Dataset | BM25 | This Model | Improvement |
|---|---|---|---|
| ESCI | 0.305 | 0.372 | +22% |
| WANDS | 0.329 | 0.366 | +11% |
| Home Depot | 0.349 | 0.410 | +17% |
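The Improvement columns in both tables appear to be relative changes over the baseline score, rounded to the nearest percent. A minimal sketch that recomputes them from the reported scores (the helper name `relative_change` is illustrative and not part of the model's tooling):

```python
def relative_change(baseline: float, score: float) -> int:
    """Relative change of `score` over `baseline`, as a rounded percentage."""
    return round((score - baseline) / baseline * 100)

# (baseline, multi-domain score) pairs copied from the tables above
vs_single_domain = {
    "ESCI": (0.389, 0.372),
    "WANDS": (0.355, 0.366),
    "Home Depot": (0.384, 0.410),
}
vs_bm25 = {
    "ESCI": (0.305, 0.372),
    "WANDS": (0.329, 0.366),
    "Home Depot": (0.349, 0.410),
}

for name, table in [("vs single-domain", vs_single_domain), ("vs BM25", vs_bm25)]:
    for dataset, (baseline, score) in table.items():
        print(f"{name:>16} | {dataset:<10} | {relative_change(baseline, score):+d}%")
```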
Model Description
This is a SPLADE sparse encoder fine-tuned from distilbert/distilbert-base-uncased using the sentence-transformers library. It maps sentences and paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
Model Details
Model Description
- Model Type: SPLADE Sparse Encoder
- Base model: distilbert/distilbert-base-uncased
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 30522 dimensions
- Similarity Function: Dot Product
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Sparse Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sparse Encoders on Hugging Face
Full Model Architecture
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'DistilBertForMaskedLM'})
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
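To make the pooling step concrete, here is a minimal pure-Python sketch of max pooling with a ReLU activation as SPLADE is commonly described: each vocabulary dimension takes the maximum of log(1 + ReLU(logit)) over the non-padding token positions. The function name and toy inputs are assumptions for illustration; the actual module operates on batched tensors with 30522 vocabulary entries.

```python
import math

def splade_max_pool(logits, attention_mask):
    """Collapse per-token MLM logits into one sparse vector.

    logits: list of per-token rows, each row holding one logit per vocabulary entry.
    attention_mask: 1 for real tokens, 0 for padding.
    """
    vocab_size = len(logits[0])
    pooled = [0.0] * vocab_size
    for row, keep in zip(logits, attention_mask):
        if not keep:
            continue  # padding tokens contribute nothing
        for j, logit in enumerate(row):
            # ReLU removes negative logits; log1p dampens large activations
            weight = math.log1p(max(logit, 0.0))
            pooled[j] = max(pooled[j], weight)
    return pooled

# Toy input: 3 token positions (the last one is padding), vocabulary of 4 entries
toy_logits = [
    [2.0, -1.0, 0.0, 0.5],
    [0.5, -3.0, 0.0, 4.0],
    [9.0, 9.0, 9.0, 9.0],
]
vector = splade_max_pool(toy_logits, attention_mask=[1, 1, 0])
print([round(w, 4) for w in vector])
```

Dimensions whose logits are never positive stay at exactly zero, which is where the sparsity of the output comes from.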
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("thierrydamiba/splade-ecommerce-multidomain")
# Run inference
sentences = [
    'mpow',
    '[Mpow] Wireless Earbuds Active Noise Cancelling, Mpow X3 ANC Bluetooth Earphones w/4 Mics Noise Cancelling, Stereo Earbuds w/Deep Bass, 30Hrs ANC Earbuds w/USB-C Charge, Smart Touch Control, IPX8 Waterproof',
    '[Jerzees] Jerzees Dri-Power Poly Pocketed Open-Bottom Sweatpants, Large - Black 100% Polyester Pre-shrunk Jersey',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 30522]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 69.1663, 66.0022, 51.6937],
# [ 66.0022, 238.3157, 60.5486],
# [ 51.6937, 60.5486, 174.3004]])
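Because most of the 30522 dimensions are zero, dot-product scoring only needs to visit the terms that are active in the query. The sketch below illustrates that idea with a small inverted index in pure Python. The term weights are hypothetical and are not actual output of this model, and the helper names are illustrative rather than part of Sentence Transformers.

```python
from collections import defaultdict

def build_inverted_index(doc_vectors):
    """Map each active term to the (doc_id, weight) pairs that contain it."""
    index = defaultdict(list)
    for doc_id, vector in doc_vectors.items():
        for term, weight in vector.items():
            index[term].append((doc_id, weight))
    return index

def search(index, query_vector, top_k=10):
    """Score documents by dot product, touching only terms active in the query."""
    scores = defaultdict(float)
    for term, q_weight in query_vector.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]

# Hypothetical term weights, not actual model output
docs = {
    "earbuds": {"mpow": 2.0, "earbuds": 1.5, "wireless": 1.0},
    "sweatpants": {"jerzees": 2.0, "sweatpants": 1.5, "black": 0.5},
}
query = {"mpow": 1.5, "earbuds": 0.5}

index = build_inverted_index(docs)
print(search(index, query))
```

Documents that share no active term with the query are never scored at all, which is the property that lets sparse vectors be served from a standard inverted-index engine.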
Training Details
Training Dataset
Unnamed Dataset
- Size: 99,712 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | min: 3 tokens; mean: 6.2 tokens; max: 22 tokens | min: 4 tokens; mean: 99.84 tokens; max: 494 tokens |
- Samples:
| anchor | positive |
|---|---|
| bird feeder pole station | [EXCMARK] EXCMARK 2 Pack Shepherd Hook 32 inch 1/2 inch Thick Use at Weddings, Hanging Solar Lights, Lanterns, Bird Feeders, Metal Hanger Hook (Bronze, 32 inch)Create the garden of your dreams with our Shepherds Hooks! These amazing hooks with the perfect balance of tradition and versatility are the perfect accessory to any outdoor space! A super easy and convenient way to tackle any outdoor gardening party or event! It will make any hanging object stand out with ultimate beauty. Hang your decorative lights, bird feeders, lanterns, and more! Each hook includes 2 extenders for three height options. The hooks can measure up to 32” |
| chrome bath lighting | Progress Lighting Archie Collection 2-Light Chrome Bath Light Archie is a standout in any room and provides a fun and fashionable way to light your home. The authentic, prismatic style glass shade diffuses light to provide functional and stylish illumination. This fixture can be installed with the glass facing up or down to suit your preference.California residents: see Proposition 65 informationChrome finishClear prismatic glass17 in. W x 8-3/4 in. HUses (2) 100-Watt medium base bulbs (not included)Fixture can be installed facing upwards or downwards |
| sex toys kinky for female | [Knaughty Knickers] Knaughty Knickers Daddys Little Lil Fuck Toy Fucktoy DDLG BDSM Owned Boyshort Black 95% combed and ringspun cotton/5% spandex --- Low rise shortie boyshort style panty --- Satin trim fold over elastic waistband --- Custom embelished on quality Bella product --- Super soft and comfortable --- Funny or rude underwear |
- Loss: SpladeLoss with these parameters:
{
    "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score', gather_across_devices=False)",
    "document_regularizer_weight": 3e-05,
    "query_regularizer_weight": 5e-05
}
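The loss parameters above combine an in-batch ranking loss with two sparsity regularizers. The sketch below shows, in pure Python, how such an objective is commonly assembled: a cross-entropy over dot-product scores where the other documents in the batch act as negatives, plus a FLOPS-style penalty (sum over dimensions of the squared mean activation) weighted separately for queries and documents. It is an illustration under those assumptions, not the library's implementation, and the function names and toy batch are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_batch_ranking_loss(queries, documents, scale=1.0):
    """Cross-entropy where document i is the positive for query i and
    every other document in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [scale * dot(q, d) for d in documents]
        peak = max(scores)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[i]
    return total / len(queries)

def flops_penalty(vectors):
    """Sum over dimensions of the squared mean absolute activation."""
    n = len(vectors)
    dims = len(vectors[0])
    return sum((sum(abs(v[j]) for v in vectors) / n) ** 2 for j in range(dims))

def splade_style_loss(queries, documents, query_weight=5e-05, document_weight=3e-05):
    """Ranking loss plus weighted sparsity penalties (weights taken from the card)."""
    return (
        in_batch_ranking_loss(queries, documents)
        + query_weight * flops_penalty(queries)
        + document_weight * flops_penalty(documents)
    )

# Toy batch: 2 queries, 2 documents, 3 vocabulary dimensions
queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
documents = [[2.0, 0.0, 1.0], [0.0, 2.0, 1.0]]
print(splade_style_loss(queries, documents))
```

The penalty grows fastest for dimensions that are active across many inputs in the batch, so it pushes the model toward vectors with few, evenly spread active terms.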
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
- router_mapping: {'anchor': 'query', 'positive': 'document'}
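To see what learning_rate: 2e-05 and warmup_ratio: 0.1 imply over the run, the sketch below evaluates a linear warmup followed by linear decay to zero. It assumes the linear scheduler listed in the full hyperparameters, a warmup length rounded up from the ratio, and a total of 3116 optimizer steps (99,712 samples at batch size 32 on a single device). The function name is illustrative.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Assumption: one epoch over 99,712 samples at batch size 32 on a single device
total_steps = 99_712 // 32
for step in (0, 156, 312, 1714, 3116):
    print(step, f"{linear_schedule_lr(step, total_steps):.2e}")
```

Under these assumptions the rate peaks at 2e-05 around step 312 and reaches zero at the end of the epoch.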
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {'anchor': 'query', 'positive': 'document'}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 0.0321 | 100 | 329.7303 |
| 0.0642 | 200 | 1.9189 |
| 0.0963 | 300 | 0.4059 |
| 0.1284 | 400 | 0.3173 |
| 0.1605 | 500 | 0.2776 |
| 0.1926 | 600 | 0.2812 |
| 0.2246 | 700 | 0.2648 |
| 0.2567 | 800 | 0.2821 |
| 0.2888 | 900 | 0.254 |
| 0.3209 | 1000 | 0.2789 |
| 0.3530 | 1100 | 0.2163 |
| 0.3851 | 1200 | 0.2375 |
| 0.4172 | 1300 | 0.2165 |
| 0.4493 | 1400 | 0.2254 |
| 0.4814 | 1500 | 0.2105 |
| 0.5135 | 1600 | 0.2147 |
| 0.5456 | 1700 | 0.2468 |
| 0.5777 | 1800 | 0.2438 |
| 0.6098 | 1900 | 0.209 |
| 0.6418 | 2000 | 0.2327 |
| 0.6739 | 2100 | 0.2475 |
| 0.7060 | 2200 | 0.227 |
| 0.7381 | 2300 | 0.1992 |
| 0.7702 | 2400 | 0.2258 |
| 0.8023 | 2500 | 0.1676 |
| 0.8344 | 2600 | 0.2081 |
| 0.8665 | 2700 | 0.1966 |
| 0.8986 | 2800 | 0.218 |
| 0.9307 | 2900 | 0.1998 |
| 0.9628 | 3000 | 0.2157 |
| 0.9949 | 3100 | 0.2011 |
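The Epoch column in the table above is consistent with the step count divided by the number of batches in one pass over the data. A small worked check, assuming a single device and using the sample count, batch size, and gradient_accumulation_steps: 1 reported earlier in this card (the helper name is illustrative):

```python
samples = 99_712
batch_size = 32
# Assumes a single device and no gradient accumulation
steps_per_epoch = samples // batch_size

def epoch_at(step: int) -> float:
    """Fraction of the epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

for step in (100, 1000, 3100):
    print(step, epoch_at(step))
```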
Framework Versions
- Python: 3.11.10
- Sentence Transformers: 5.2.0
- Transformers: 4.57.3
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 4.4.1
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
SpladeLoss
@misc{formal2022distillationhardnegativesampling,
title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
year={2022},
eprint={2205.04733},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2205.04733},
}
SparseMultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
FlopsLoss
@article{paria2020minimizing,
title={Minimizing flops to learn efficient sparse representations},
author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
journal={arXiv preprint arXiv:2004.05665},
year={2020}
}
Model tree for thierrydamiba/splade-ecommerce-multidomain
Base model
distilbert/distilbert-base-uncased