LamaDiab committed
Commit 94557ef · verified · 1 Parent(s): 3ea8f48

Updating model weights

Files changed (1)
  1. README.md +31 -43
README.md CHANGED
@@ -7,7 +7,6 @@ tags:
 - generated_from_trainer
 - dataset_size:554030
 - loss:MultipleNegativesSymmetricRankingLoss
-base_model: rebego/stsb-all-MiniLM-L6-v2
 widget:
 - source_sentence: pacman smoked turkey
   sentences:
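The `loss:MultipleNegativesSymmetricRankingLoss` tag above is untouched by this commit. For context, a minimal sketch of how that loss is wired up in sentence-transformers; the base model id and the two example pairs are illustrative placeholders, not values taken from this repo:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesSymmetricRankingLoss

# Placeholder base model and a two-row stand-in for the 554,030-pair dataset.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
train_dataset = Dataset.from_dict({
    "anchor": ["pacman smoked turkey", "organic whole milk"],
    "positive": ["smoked turkey deli slices", "whole milk, one gallon"],
})

# The symmetric variant scores in-batch negatives in both directions:
# anchor -> positive and positive -> anchor.
loss = MultipleNegativesSymmetricRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```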
@@ -43,7 +42,7 @@ library_name: sentence-transformers
 metrics:
 - cosine_accuracy
 model-index:
-- name: SentenceTransformer based on rebego/stsb-all-MiniLM-L6-v2
+- name: SentenceTransformer
   results:
   - task:
       type: triplet
@@ -53,22 +52,19 @@ model-index:
       type: unknown
     metrics:
     - type: cosine_accuracy
-      value: 0.9596002101898193
-      name: Cosine Accuracy
-    - type: cosine_accuracy
-      value: 0.8801550269126892
+      value: 0.9600210189819336
       name: Cosine Accuracy
 ---
 
-# SentenceTransformer based on rebego/stsb-all-MiniLM-L6-v2
+# SentenceTransformer
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [rebego/stsb-all-MiniLM-L6-v2](https://huggingface.co/rebego/stsb-all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
 ### Model Description
 - **Model Type:** Sentence Transformer
-- **Base model:** [rebego/stsb-all-MiniLM-L6-v2](https://huggingface.co/rebego/stsb-all-MiniLM-L6-v2) <!-- at revision db58f9a2537bc2b56ee784347b8eaa44cb383d70 -->
+<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
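Given that the card's headline claim is a 384-dimensional embedding space scored with cosine similarity, the updated weights can be sanity-checked in a few lines; the repo id below is a placeholder for wherever this model is actually published:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LamaDiab/placeholder-model-id")  # placeholder id

embeddings = model.encode(["pacman smoked turkey", "smoked turkey deli slices"])
print(embeddings.shape)  # expected: (2, 384)

# similarity() applies the card's similarity function (cosine), so the
# result should be symmetric with ones on the diagonal, like the updated
# example tensor in the next hunk.
print(model.similarity(embeddings, embeddings))
```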
@@ -120,9 +116,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
-# tensor([[1.0000, 0.3390, 0.3114],
-#         [0.3390, 1.0000, 0.7184],
-#         [0.3114, 0.7184, 1.0000]])
+# tensor([[1.0000, 0.3351, 0.3300],
+#         [0.3351, 1.0000, 0.7113],
+#         [0.3300, 0.7113, 1.0000]])
 ```
 
 <!--
@@ -157,17 +153,9 @@ You can finetune this model on your own dataset.
 
 * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
 
-| Metric              | Value      |
-|:--------------------|:-----------|
-| **cosine_accuracy** | **0.9596** |
-
-#### Triplet
-
-* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
-
-| Metric              | Value      |
-|:--------------------|:-----------|
-| **cosine_accuracy** | **0.8802** |
+| Metric              | Value    |
+|:--------------------|:---------|
+| **cosine_accuracy** | **0.96** |
 
 <!--
 ## Bias, Risks and Limitations
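The table above can be reproduced locally with the same evaluator. A hedged sketch follows; the repo id and the triplets are invented for illustration, and the card's 0.96 comes from the author's held-out triplet set rather than from toy rows like these:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("LamaDiab/placeholder-model-id")  # placeholder id

# cosine_accuracy is the fraction of triplets where the anchor embedding
# is closer (by cosine) to the positive than to the negative.
evaluator = TripletEvaluator(
    anchors=["pacman smoked turkey"],
    positives=["smoked turkey deli slices"],
    negatives=["chocolate chip cookies"],
    name="dev",
)
print(evaluator(model))  # e.g. {'dev_cosine_accuracy': 1.0} on sentence-transformers >= 3.x
```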
@@ -243,6 +231,7 @@ You can finetune this model on your own dataset.
 - `per_device_eval_batch_size`: 256
 - `learning_rate`: 2e-05
 - `weight_decay`: 0.001
+- `num_train_epochs`: 6
 - `warmup_steps`: 2596
 - `fp16`: True
 - `dataloader_num_workers`: 1
@@ -273,7 +262,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
-- `num_train_epochs`: 3
+- `num_train_epochs`: 6
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
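For anyone reproducing the run, the values in these two hunks map directly onto `SentenceTransformerTrainingArguments`; a sketch under the assumption that the card's list is complete, with `output_dir` as a placeholder:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",            # placeholder
    num_train_epochs=6,              # raised from 3 in this commit
    per_device_eval_batch_size=256,
    learning_rate=2e-5,
    weight_decay=0.001,
    warmup_steps=2596,
    fp16=True,
    dataloader_num_workers=1,
    max_grad_norm=1.0,
    lr_scheduler_type="linear",
)
```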
@@ -377,25 +366,24 @@ You can finetune this model on your own dataset.
 </details>
 
 ### Training Logs
-| Epoch  | Step | Training Loss | Validation Loss | cosine_accuracy |
-|:------:|:----:|:-------------:|:---------------:|:---------------:|
-| -1     | -1   | -             | -               | 0.8802          |
-| 0.0005 | 1    | 4.0583        | -               | -               |
-| 0.2309 | 500  | -             | 1.4611          | 0.9421          |
-| 0.4619 | 1000 | -             | 1.3612          | 0.9457          |
-| 0.6928 | 1500 | -             | 1.2883          | 0.9529          |
-| 0.9238 | 2000 | -             | 1.2684          | 0.9522          |
-| 1.0    | 2165 | 2.6124        | -               | -               |
-| 1.1547 | 2500 | -             | 1.2560          | 0.9541          |
-| 1.3857 | 3000 | -             | 1.1885          | 0.9562          |
-| 1.6166 | 3500 | -             | 1.1879          | 0.9557          |
-| 1.8476 | 4000 | -             | 1.1555          | 0.9580          |
-| 2.0    | 4330 | 1.986         | -               | -               |
-| 2.0785 | 4500 | -             | 1.1547          | 0.9582          |
-| 2.3095 | 5000 | -             | 1.1456          | 0.9584          |
-| 2.5404 | 5500 | -             | 1.1358          | 0.9585          |
-| 2.7714 | 6000 | -             | 1.1279          | 0.9596          |
-| 3.0    | 6495 | 1.8005        | -               | -               |
+| Epoch  | Step  | Training Loss | Validation Loss | cosine_accuracy |
+|:------:|:-----:|:-------------:|:---------------:|:---------------:|
+| 3.0023 | 6500  | -             | 1.1430          | 0.9588          |
+| 3.2333 | 7000  | -             | 1.1254          | 0.9590          |
+| 3.4642 | 7500  | -             | 1.1334          | 0.9603          |
+| 3.6952 | 8000  | -             | 1.1090          | 0.9599          |
+| 3.9261 | 8500  | -             | 1.1000          | 0.9602          |
+| 4.0    | 8660  | 1.7181        | -               | -               |
+| 4.1570 | 9000  | -             | 1.1028          | 0.9587          |
+| 4.3880 | 9500  | -             | 1.1046          | 0.9592          |
+| 4.6189 | 10000 | -             | 1.0984          | 0.9596          |
+| 4.8499 | 10500 | -             | 1.0925          | 0.9598          |
+| 5.0    | 10825 | 1.6411        | -               | -               |
+| 5.0808 | 11000 | -             | 1.0932          | 0.9600          |
+| 5.3118 | 11500 | -             | 1.0890          | 0.9596          |
+| 5.5427 | 12000 | -             | 1.0831          | 0.9600          |
+| 5.7737 | 12500 | -             | 1.0858          | 0.9600          |
+| 6.0    | 12990 | 1.6083        | -               | -               |
 
 
 ### Framework Versions