Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
dense
Generated from Trainer
dataset_size:111470
loss:MultipleNegativesRankingLoss
Eval Results (legacy)
text-embeddings-inference
How to use redis/model-b-structured with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub
model = SentenceTransformer("redis/model-b-structured")

# One query followed by three candidate passages
sentences = [
    "when was the first elephant brought to america",
    "Old Bet The first elephant brought to the United States was in 1796, aboard the America which set sail from Calcutta for New York on December 3, 1795.[4] However, it is not certain that this was Old Bet.[2] The first references to Old Bet start in 1804 in Boston as part of a menagerie.[1] In 1808, while residing in Somers, New York, Hachaliah Bailey purchased the menagerie elephant for $1,000 and named it \"Old Bet\".[5][6]",
    "Cronus Rhea secretly gave birth to Zeus in Crete, and handed Cronus a stone wrapped in swaddling clothes, also known as the Omphalos Stone, which he promptly swallowed, thinking that it was his son.",
    "Renal artery One or two accessory renal arteries are frequently found, especially on the left side since they usually arise from the aorta, and may come off above (more common) or below the main artery. Instead of entering the kidney at the hilus, they usually pierce the upper or lower part of the organ.",
]
embeddings = model.encode(sentences)

# Pairwise similarity between every pair of embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```
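As a rough illustration of what `model.similarity` returns in the snippet above: assuming the model uses cosine similarity (the sentence-transformers default; this page does not state which function is configured), the result is a square matrix of pairwise cosines. The sketch below reproduces that computation on toy vectors in plain Python. The name `cosine_similarity_matrix` and the toy embeddings are hypothetical and only serve the illustration.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return the pairwise cosine similarity matrix for a list of vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    matrix = []
    for i, a in enumerate(vectors):
        row = []
        for j, b in enumerate(vectors):
            dot = sum(x * y for x, y in zip(a, b))
            row.append(dot / (norms[i] * norms[j]))
        matrix.append(row)
    return matrix


# Toy 3-dimensional "embeddings" (hypothetical values, not model output).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]

sims = cosine_similarity_matrix(toy_embeddings)
for row in sims:
    print([round(value, 4) for value in row])
```

For N input sentences the matrix is N x N and symmetric, with ones on the diagonal, which is why the snippet above reports a shape of `[4, 4]` for four sentences.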
Add new SentenceTransformer model
README.md
CHANGED
@@ -9,32 +9,38 @@ tags:
 - loss:MultipleNegativesRankingLoss
 base_model: prajjwal1/bert-small
 widget:
+- source_sentence: How would it effect our economy if we ban Chinese products in our
+    country.?
   sentences:
+  - How would it effect our economy if we ban Chinese products in our country.?
+  - Which cities in India is suitable for part time teaching job where one can prepare
+    for civil services exam?
+  - What is the font used in the desktop version of Instagram for comments?
+- source_sentence: Why do we need Java programming?
   sentences:
+  - What is Java? What do I need it for?
+  - Which is the best website to prepare for the Infosys written test?
+  - Can I still get funding to study in UCL (Computer Vision Masters), if I graduated
+    with a BSc in Computer Science with a 2:1 and I have 3 years or web developent
+    expirience?
+- source_sentence: What is capital of china?
   sentences:
+  - How many businesses does Donald Trump own?
+  - What is capital of china?
+  - Where is the capital of China?
+- source_sentence: My Xiaomi Redmi 2 all of a sudden got heated up and then turned
+    off. Now its not charging. What should I do?
   sentences:
+  - My Xiaomi Redmi 2 all of a sudden got heated up and then turned off. Now its not
+    charging. I should Whatdo?
+  - How does the $9 computer work?
+  - My Xiaomi Redmi 2 all of a sudden got heated up and then turned off. Now its not
+    charging. What should I do?
+- source_sentence: How can I get a job in Dubai if I am living in U.S?
   sentences:
+  - What is the myth behind Mona Lisa smile?
+  - Which is the best series to watch after FRIENDS ?
+  - How can one get a job in Dubai?
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 ---
@@ -85,12 +91,12 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 
 # Download from the 🤗 Hub
+model = SentenceTransformer("redis/model-b-structured")
 # Run inference
 sentences = [
+    'How can I get a job in Dubai if I am living in U.S?',
+    'How can one get a job in Dubai?',
+    'Which is the best series to watch after FRIENDS ?',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -99,9 +105,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
+# tensor([[ 1.0000,  0.8904, -0.0302],
+#         [ 0.8904,  1.0000,  0.0224],
+#         [-0.0302,  0.0224,  1.0000]])
 ```
 
 <!--
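To make the printed similarity matrix easier to read, the sketch below copies the three example sentences and the scores shown in the hunk above into plain Python lists and picks, for each sentence, the other sentence with the highest score. The helper `nearest_neighbor` is a hypothetical name for illustration; nothing here calls the model.

```python
# Sentences and scores copied from the README example above.
sentences = [
    "How can I get a job in Dubai if I am living in U.S?",
    "How can one get a job in Dubai?",
    "Which is the best series to watch after FRIENDS ?",
]
similarities = [
    [1.0000, 0.8904, -0.0302],
    [0.8904, 1.0000, 0.0224],
    [-0.0302, 0.0224, 1.0000],
]


def nearest_neighbor(scores, index):
    """Index of the highest-scoring sentence other than `index` itself."""
    candidates = [j for j in range(len(scores)) if j != index]
    return max(candidates, key=lambda j: scores[index][j])


neighbors = [nearest_neighbor(similarities, i) for i in range(len(sentences))]
for i, j in enumerate(neighbors):
    print(f"{sentences[i]!r} -> {sentences[j]!r} ({similarities[i][j]:.4f})")
```

The two Dubai questions select each other with a score of 0.8904, while the unrelated FRIENDS question has no close neighbor (its best score is 0.0224).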
@@ -147,18 +153,18 @@ You can finetune this model on your own dataset.
 #### Unnamed Dataset
 
 * Size: 100,000 training samples
+* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
 * Approximate statistics based on the first 1000 samples:
+  |         | anchor | positive | negative |
+  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string | string | string |
+  | details | <ul><li>min: 6 tokens</li><li>mean: 15.51 tokens</li><li>max: 97 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.34 tokens</li><li>max: 97 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.63 tokens</li><li>max: 128 tokens</li></ul> |
 * Samples:
+  | anchor | positive | negative |
+  |:---------------------------------------------------------|:-------------------------------------------------------|:----------------------------------------------------------------------------------|
+  | <code>How do you trace a fake phone number?</code> | <code>How do you trace a fake phone number?</code> | <code>How do I trace an internet connection phone number from middle East?</code> |
+  | <code>How do I draw cartoon monsters?</code> | <code>How do I draw cartoon monsters?</code> | <code>How do cartoon monsters draw I?</code> |
+  | <code>Do you believe in an afterlife? If so, why?</code> | <code>Do you believe that there's an afterlife?</code> | <code>Do you believe not in an afterlife ? If so , why ?</code> |
 * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
   ```json
   {
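The loss named in the hunk above, `MultipleNegativesRankingLoss`, can be read as a cross-entropy over in-batch candidates: each anchor should score its own positive higher than every other positive and every negative in the batch. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function names are hypothetical, and `scale=20.0` with cosine similarity is an assumption based on the library defaults, since the parameter block is cut off in this diff.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must rank positive i first.

    Candidates for every anchor are all positives followed by all negatives,
    so the correct candidate for anchor i sits at index i.
    `scale` is an assumed default, not a value read from this model card.
    """
    candidates = positives + negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy embeddings (hypothetical): positives match their anchors exactly.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

good = mnr_loss(anchors, positives, negatives)
# Swapping the positives pairs each anchor with the wrong candidate.
bad = mnr_loss(anchors, positives[::-1], negatives)
print(good, bad)
```

In this toy setup the correctly paired batch gives a loss near zero, and the mispaired batch gives a loss near the scale value, which is the behavior the ranking objective is meant to enforce.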
@@ -171,10 +177,20 @@ You can finetune this model on your own dataset.
 ### Training Hyperparameters
 #### Non-Default Hyperparameters
 
+- `per_device_train_batch_size`: 256
+- `per_device_eval_batch_size`: 256
+- `learning_rate`: 2e-05
+- `weight_decay`: 0.001
+- `max_steps`: 1170
+- `warmup_ratio`: 0.1
 - `fp16`: True
+- `dataloader_drop_last`: True
+- `dataloader_num_workers`: 1
+- `dataloader_prefetch_factor`: 1
+- `optim`: adamw_torch
+- `ddp_find_unused_parameters`: False
+- `push_to_hub`: True
+- `hub_model_id`: redis/model-b-structured
 
 #### All Hyperparameters
 <details><summary>Click to expand</summary>
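The non-default values above fit together arithmetically. The sketch below works through the numbers, assuming training ran on a single device (the diff does not say how many were used): with 100,000 training samples, a batch size of 256, and `dataloader_drop_last` set to True, `max_steps` of 1170 corresponds to three full passes over the data.

```python
# Values copied from the model card diff above.
train_samples = 100_000            # "Size: 100,000 training samples"
per_device_train_batch_size = 256
max_steps = 1170

# Assumption (not stated in the diff): a single training device.
num_devices = 1

effective_batch = per_device_train_batch_size * num_devices

# dataloader_drop_last=True discards the final partial batch, hence floor division.
steps_per_epoch = train_samples // effective_batch
samples_dropped_per_epoch = train_samples - steps_per_epoch * effective_batch
epochs_covered = max_steps / steps_per_epoch

print(steps_per_epoch, samples_dropped_per_epoch, epochs_covered)
# prints: 390 160 3.0
```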
@@ -183,24 +199,24 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: no
 - `prediction_loss_only`: True
+- `per_device_train_batch_size`: 256
+- `per_device_eval_batch_size`: 256
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
 - `eval_accumulation_steps`: None
 - `torch_empty_cache_steps`: None
+- `learning_rate`: 2e-05
+- `weight_decay`: 0.001
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 3.0
+- `max_steps`: 1170
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
+- `warmup_ratio`: 0.1
 - `warmup_steps`: 0
 - `log_level`: passive
 - `log_level_replica`: warning
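The learning-rate settings in the hunk above (`learning_rate` 2e-05, `lr_scheduler_type` linear, `warmup_ratio` 0.1, `max_steps` 1170) describe the usual linear warmup followed by linear decay. The sketch below shows that shape under stated assumptions: `linear_lr` is a hypothetical helper, the warmup length of 117 steps is taken as 10% of 1170, and the curve has not been checked against this particular training run.

```python
def linear_lr(step, base_lr=2e-05, warmup_steps=117, total_steps=1170):
    """Learning rate at `step` for linear warmup followed by linear decay.

    warmup_steps=117 is 10% of the 1170 total steps (warmup_ratio 0.1).
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


for step in (0, 58, 117, 600, 1170):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 2e-05 at step 117 and reaches zero at step 1170.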
@@ -228,9 +244,9 @@ You can finetune this model on your own dataset.
 - `tpu_num_cores`: None
 - `tpu_metrics_debug`: False
 - `debug`: []
+- `dataloader_drop_last`: True
+- `dataloader_num_workers`: 1
+- `dataloader_prefetch_factor`: 1
 - `past_index`: -1
 - `disable_tqdm`: False
 - `remove_unused_columns`: True
@@ -245,23 +261,23 @@ You can finetune this model on your own dataset.
 - `parallelism_config`: None
 - `deepspeed`: None
 - `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
 - `optim_args`: None
 - `adafactor`: False
 - `group_by_length`: False
 - `length_column_name`: length
 - `project`: huggingface
 - `trackio_space_id`: trackio
+- `ddp_find_unused_parameters`: False
 - `ddp_bucket_cap_mb`: None
 - `ddp_broadcast_buffers`: False
 - `dataloader_pin_memory`: True
 - `dataloader_persistent_workers`: False
 - `skip_memory_metrics`: True
 - `use_legacy_prediction_loop`: False
+- `push_to_hub`: True
 - `resume_from_checkpoint`: None
+- `hub_model_id`: redis/model-b-structured
 - `hub_strategy`: every_save
 - `hub_private_repo`: None
 - `hub_always_push`: False
@@ -295,7 +311,7 @@ You can finetune this model on your own dataset.
 - `average_tokens_across_devices`: True
 - `prompts`: None
 - `batch_sampler`: batch_sampler
+- `multi_dataset_batch_sampler`: proportional
 - `router_mapping`: {}
 - `learning_rate_mapping`: {}
 
@@ -304,15 +320,17 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | Training Loss |
 |:------:|:----:|:-------------:|
+| 0.2564 | 100  | 1.0792        |
+| 0.5128 | 200  | 0.2584        |
+| 0.7692 | 300  | 0.1967        |
+| 1.0256 | 400  | 0.1808        |
+| 1.2821 | 500  | 0.1528        |
+| 1.5385 | 600  | 0.1471        |
+| 1.7949 | 700  | 0.1416        |
+| 2.0513 | 800  | 0.1363        |
+| 2.3077 | 900  | 0.1259        |
+| 2.5641 | 1000 | 0.1219        |
+| 2.8205 | 1100 | 0.1212        |
 
 
 ### Framework Versions
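The Epoch column in the training log can be cross-checked against the Step column. The sketch below copies the logged rows into a list and assumes 390 optimizer steps per epoch (100,000 samples at batch size 256 with the last partial batch dropped, on one device; the device count is an assumption). It then checks that each logged epoch equals step / 390 to four decimals and reports how far the training loss fell between the first and last logged steps.

```python
# (epoch, step, training loss) rows copied from the Training Logs table above.
training_logs = [
    (0.2564, 100, 1.0792),
    (0.5128, 200, 0.2584),
    (0.7692, 300, 0.1967),
    (1.0256, 400, 0.1808),
    (1.2821, 500, 0.1528),
    (1.5385, 600, 0.1471),
    (1.7949, 700, 0.1416),
    (2.0513, 800, 0.1363),
    (2.3077, 900, 0.1259),
    (2.5641, 1000, 0.1219),
    (2.8205, 1100, 0.1212),
]

# Assumption: 100_000 // 256 = 390 steps per epoch on a single device.
STEPS_PER_EPOCH = 390

# Each logged epoch should match step / STEPS_PER_EPOCH up to rounding.
epoch_matches = [
    abs(step / STEPS_PER_EPOCH - epoch) < 1e-4
    for epoch, step, _ in training_logs
]

first_loss = training_logs[0][2]
last_loss = training_logs[-1][2]
relative_drop = (first_loss - last_loss) / first_loss

print("epochs consistent:", all(epoch_matches))
print("relative loss reduction:", round(relative_drop, 3))
```

Most of the reduction happens within the first 200 steps (1.0792 to 0.2584); the remaining steps lower the loss more gradually.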