rasyosef committed
Commit 5e1cebb (verified) · Parent(s): c5eff42

Update README.md

Files changed (1): README.md +77 -43

README.md CHANGED
@@ -9,25 +9,31 @@ tags:
  - loss:SpladeLoss
  - loss:SparseMarginMSELoss
  - loss:FlopsLoss
- base_model: yosefw/SPLADE-BERT-Small-BS128
+ base_model:
+ - prajjwal1/bert-small
  widget:
- - text: Donate to the Breast Cancer Research Foundation Now BCRF is the largest nonprofit
-     funder of breast cancer research worldwide. Over the years, it has raised more
-     than half a billion dollars in support of research that has made a major impact
-     on how we view and treat breast cancer.
- - text: Macular degeneration—Loss of central vision, blurred vision (especially while
-     reading), distorted vision (like seeing wavy lines), and colors appearing faded.
-     The most common cause of blindness in people over age 60. Eye infection, inflammation,
-     or injury.
+ - text: >-
+     Donate to the Breast Cancer Research Foundation Now BCRF is the largest
+     nonprofit funder of breast cancer research worldwide. Over the years, it has
+     raised more than half a billion dollars in support of research that has made
+     a major impact on how we view and treat breast cancer.
+ - text: >-
+     Macular degeneration—Loss of central vision, blurred vision (especially
+     while reading), distorted vision (like seeing wavy lines), and colors
+     appearing faded. The most common cause of blindness in people over age 60.
+     Eye infection, inflammation, or injury.
  - text: how do i find the tongue weight of a trailer?
- - text: Feathers (1-3) Pidgey are docile Pokémon, and generally prefer to flee from
-     their enemies rather than fight them. Pidgey's small size permits it to hide easily
-     in long grass, where it is typically found foraging for small insects. It is known
-     to flush out potential prey from long grass by flapping its wings rapidly.
- - text: 10 hilariously insightful foreign words. One of the most obvious differences
-     between cognac and whiskey is that cognac makers use grapes, and whiskey makers
-     use grains. Although both processes use fermentation to create the liquors, cognac
-     makers use a double distillation process.
+ - text: >-
+     Feathers (1-3) Pidgey are docile Pokémon, and generally prefer to flee from
+     their enemies rather than fight them. Pidgey's small size permits it to hide
+     easily in long grass, where it is typically found foraging for small
+     insects. It is known to flush out potential prey from long grass by flapping
+     its wings rapidly.
+ - text: >-
+     10 hilariously insightful foreign words. One of the most obvious differences
+     between cognac and whiskey is that cognac makers use grapes, and whiskey
+     makers use grains. Although both processes use fermentation to create the
+     liquors, cognac makers use a double distillation process.
  pipeline_tag: feature-extraction
  library_name: sentence-transformers
  metrics:
@@ -117,38 +123,34 @@ model-index:
      - type: corpus_sparsity_ratio
        value: 0.9944841805248121
        name: Corpus Sparsity Ratio
+ license: mit
+ datasets:
+ - microsoft/ms_marco
+ language:
+ - en
  ---

- # SPLADE Sparse Encoder
-
- This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model finetuned from [yosefw/SPLADE-BERT-Small-BS128](https://huggingface.co/yosefw/SPLADE-BERT-Small-BS128) using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
- ## Model Details
-
- ### Model Description
- - **Model Type:** SPLADE Sparse Encoder
- - **Base model:** [yosefw/SPLADE-BERT-Small-BS128](https://huggingface.co/yosefw/SPLADE-BERT-Small-BS128) <!-- at revision 27575d2504e7400b5ed11f94d0e162e3e7c01af6 -->
- - **Maximum Sequence Length:** 512 tokens
- - **Output Dimensionality:** 30522 dimensions
- - **Similarity Function:** Dot Product
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->
-
- ### Model Sources
-
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)
-
- ### Full Model Architecture
-
- ```
- SparseEncoder(
-   (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
-   (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
- )
- ```
+ # SPLADE-BERT-Small-Distil
+
+ This is a SPLADE sparse retrieval model based on BERT-Small (29M params) that was trained by distilling the [ms-marco-MiniLM-L6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2) cross-encoder on the MSMARCO dataset.
+
+ This tiny SPLADE model is `2x` smaller than Naver's official `splade-v3-distilbert` while retaining `91%` of its performance on the MSMARCO benchmark. It is small enough to be used without a GPU on a dataset of a few thousand documents.
+
+ - `Collection:` https://huggingface.co/collections/rasyosef/splade-tiny-msmarco-687c548c0691d95babf65b70
+ - `Distillation Dataset:` https://huggingface.co/datasets/yosefw/msmarco-train-distil-v2
+ - `Code:` https://github.com/rasyosef/splade-tiny-msmarco
+
+ ## Performance
+
+ The SPLADE models were evaluated on 55 thousand queries and 8.84 million documents from the [MSMARCO](https://huggingface.co/datasets/microsoft/ms_marco) dataset.
+
+ | Model | Size (# Params) | MRR@10 (MS MARCO dev) |
+ |:---|:----|:----|
+ | `BM25` | - | 18.0 |
+ | `rasyosef/splade-tiny` | 4.4M | 30.9 |
+ | `rasyosef/splade-mini` | 11.2M | 33.2 |
+ | `rasyosef/splade-small` | 28.8M | 35.2 |
+ | `naver/splade-v3-distilbert` | 67.0M | 38.7 |

  ## Usage

@@ -165,7 +167,7 @@ Then you can load this model and run inference.
  from sentence_transformers import SparseEncoder

  # Download from the 🤗 Hub
- model = SparseEncoder("yosefw/SPLADE-BERT-Small-BS128-distil")
+ model = SparseEncoder("rasyosef/splade-small")
  # Run inference
  queries = [
      "is cognac whisky",
@@ -210,6 +212,37 @@ You can finetune this model on your own dataset.
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
  -->

+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SPLADE Sparse Encoder
+ - **Base model:** [prajjwal1/bert-small](https://huggingface.co/prajjwal1/bert-small) <!-- at revision 27575d2504e7400b5ed11f94d0e162e3e7c01af6 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 30522 dimensions
+ - **Similarity Function:** Dot Product
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)
+
+ ### Full Model Architecture
+
+ ```
+ SparseEncoder(
+   (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertForMaskedLM'})
+   (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
+ )
+ ```
+
+ ## More
+ <details><summary>Click to expand</summary>
+
  ## Evaluation

  ### Metrics
@@ -506,4 +539,5 @@ You can finetune this model on your own dataset.
  ## Model Card Contact

  *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
+ -->
+ </details>
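
As a quick sanity check of the usage snippet changed in the diff above, here is a minimal end-to-end sketch. The model id `rasyosef/splade-small` and the query `"is cognac whisky"` are taken from the diff itself; the document strings are illustrative stand-ins, and the `encode_query`/`encode_document`/`similarity` calls assume the standard sentence-transformers `SparseEncoder` API.

```python
from sentence_transformers import SparseEncoder

# Model id as it appears in the updated README snippet above.
model = SparseEncoder("rasyosef/splade-small")

# Query taken from the README example; documents are illustrative stand-ins.
queries = ["is cognac whisky"]
documents = [
    "Cognac makers use grapes, and whiskey makers use grains.",
    "Tongue weight is the downward force a trailer's coupler puts on the hitch ball.",
]

# Queries and documents are encoded separately; SPLADE produces
# 30522-dimensional sparse vectors (one weight per BERT vocabulary token).
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)

# Relevance is scored with dot-product similarity, per the model card.
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)  # shape [1, 2]; the cognac document should score higher
```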