LamaDiab committed
Commit 94557ef · verified · 1 Parent(s): 3ea8f48

Updating model weights

Files changed (1)
  1. README.md +31 -43
README.md CHANGED
@@ -7,7 +7,6 @@ tags:
 - generated_from_trainer
 - dataset_size:554030
 - loss:MultipleNegativesSymmetricRankingLoss
-base_model: rebego/stsb-all-MiniLM-L6-v2
 widget:
 - source_sentence: pacman smoked turkey
   sentences:
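The `loss:MultipleNegativesSymmetricRankingLoss` tag above is untouched by this commit. For context, a minimal sketch of how that loss is wired up in sentence-transformers; the base model id and the two example pairs are illustrative placeholders, not values taken from this repo:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesSymmetricRankingLoss

# Placeholder base model and a two-row stand-in for the 554,030-pair dataset.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
train_dataset = Dataset.from_dict({
    "anchor": ["pacman smoked turkey", "organic whole milk"],
    "positive": ["smoked turkey deli slices", "whole milk, one gallon"],
})

# The symmetric variant scores in-batch negatives in both directions:
# anchor -> positive and positive -> anchor.
loss = MultipleNegativesSymmetricRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```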
@@ -43,7 +42,7 @@ library_name: sentence-transformers
 metrics:
 - cosine_accuracy
 model-index:
-- name: SentenceTransformer based on rebego/stsb-all-MiniLM-L6-v2
+- name: SentenceTransformer
   results:
   - task:
       type: triplet
@@ -53,22 +52,19 @@ model-index:
       type: unknown
     metrics:
     - type: cosine_accuracy
-      value: 0.9596002101898193
-      name: Cosine Accuracy
-    - type: cosine_accuracy
-      value: 0.8801550269126892
+      value: 0.9600210189819336
       name: Cosine Accuracy
 ---
 
-# SentenceTransformer based on rebego/stsb-all-MiniLM-L6-v2
+# SentenceTransformer
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [rebego/stsb-all-MiniLM-L6-v2](https://huggingface.co/rebego/stsb-all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
 ### Model Description
 - **Model Type:** Sentence Transformer
-- **Base model:** [rebego/stsb-all-MiniLM-L6-v2](https://huggingface.co/rebego/stsb-all-MiniLM-L6-v2) <!-- at revision db58f9a2537bc2b56ee784347b8eaa44cb383d70 -->
+<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
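Given that the card's headline claim is a 384-dimensional embedding space scored with cosine similarity, the updated weights can be sanity-checked in a few lines; the repo id below is a placeholder for wherever this model is actually published:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LamaDiab/placeholder-model-id")  # placeholder id

embeddings = model.encode(["pacman smoked turkey", "smoked turkey deli slices"])
print(embeddings.shape)  # expected: (2, 384)

# similarity() applies the card's similarity function (cosine), so the
# result should be symmetric with ones on the diagonal, like the updated
# example tensor in the next hunk.
print(model.similarity(embeddings, embeddings))
```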
@@ -120,9 +116,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
-# tensor([[1.0000, 0.3390, 0.3114],
-#         [0.3390, 1.0000, 0.7184],
-#         [0.3114, 0.7184, 1.0000]])
+# tensor([[1.0000, 0.3351, 0.3300],
+#         [0.3351, 1.0000, 0.7113],
+#         [0.3300, 0.7113, 1.0000]])
 ```
 
 <!--
@@ -157,17 +153,9 @@ You can finetune this model on your own dataset.
 
 * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
 
-| Metric              | Value      |
-|:--------------------|:-----------|
-| **cosine_accuracy** | **0.9596** |
-
-#### Triplet
-
-* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
-
-| Metric              | Value      |
-|:--------------------|:-----------|
-| **cosine_accuracy** | **0.8802** |
+| Metric              | Value    |
+|:--------------------|:---------|
+| **cosine_accuracy** | **0.96** |
 
 <!--
 ## Bias, Risks and Limitations
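The table above can be reproduced locally with the same evaluator. A hedged sketch follows; the repo id and the triplets are invented for illustration, and the card's 0.96 comes from the author's held-out triplet set rather than from toy rows like these:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("LamaDiab/placeholder-model-id")  # placeholder id

# cosine_accuracy is the fraction of triplets where the anchor embedding
# is closer (by cosine) to the positive than to the negative.
evaluator = TripletEvaluator(
    anchors=["pacman smoked turkey"],
    positives=["smoked turkey deli slices"],
    negatives=["chocolate chip cookies"],
    name="dev",
)
print(evaluator(model))  # e.g. {'dev_cosine_accuracy': 1.0} on sentence-transformers >= 3.x
```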
@@ -243,6 +231,7 @@ You can finetune this model on your own dataset.
 - `per_device_eval_batch_size`: 256
 - `learning_rate`: 2e-05
 - `weight_decay`: 0.001
+- `num_train_epochs`: 6
 - `warmup_steps`: 2596
 - `fp16`: True
 - `dataloader_num_workers`: 1
@@ -273,7 +262,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
-- `num_train_epochs`: 3
+- `num_train_epochs`: 6
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
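For anyone reproducing the run, the values in these two hunks map directly onto `SentenceTransformerTrainingArguments`; a sketch under the assumption that the card's list is complete, with `output_dir` as a placeholder:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",            # placeholder
    num_train_epochs=6,              # raised from 3 in this commit
    per_device_eval_batch_size=256,
    learning_rate=2e-5,
    weight_decay=0.001,
    warmup_steps=2596,
    fp16=True,
    dataloader_num_workers=1,
    max_grad_norm=1.0,
    lr_scheduler_type="linear",
)
```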
@@ -377,25 +366,24 @@ You can finetune this model on your own dataset.
 </details>
 
 ### Training Logs
-| Epoch  | Step | Training Loss | Validation Loss | cosine_accuracy |
-|:------:|:----:|:-------------:|:---------------:|:---------------:|
-| -1     | -1   | -             | -               | 0.8802          |
-| 0.0005 | 1    | 4.0583        | -               | -               |
-| 0.2309 | 500  | -             | 1.4611          | 0.9421          |
-| 0.4619 | 1000 | -             | 1.3612          | 0.9457          |
-| 0.6928 | 1500 | -             | 1.2883          | 0.9529          |
-| 0.9238 | 2000 | -             | 1.2684          | 0.9522          |
-| 1.0    | 2165 | 2.6124        | -               | -               |
-| 1.1547 | 2500 | -             | 1.2560          | 0.9541          |
-| 1.3857 | 3000 | -             | 1.1885          | 0.9562          |
-| 1.6166 | 3500 | -             | 1.1879          | 0.9557          |
-| 1.8476 | 4000 | -             | 1.1555          | 0.9580          |
-| 2.0    | 4330 | 1.986         | -               | -               |
-| 2.0785 | 4500 | -             | 1.1547          | 0.9582          |
-| 2.3095 | 5000 | -             | 1.1456          | 0.9584          |
-| 2.5404 | 5500 | -             | 1.1358          | 0.9585          |
-| 2.7714 | 6000 | -             | 1.1279          | 0.9596          |
-| 3.0    | 6495 | 1.8005        | -               | -               |
+| Epoch  | Step  | Training Loss | Validation Loss | cosine_accuracy |
+|:------:|:-----:|:-------------:|:---------------:|:---------------:|
+| 3.0023 | 6500  | -             | 1.1430          | 0.9588          |
+| 3.2333 | 7000  | -             | 1.1254          | 0.9590          |
+| 3.4642 | 7500  | -             | 1.1334          | 0.9603          |
+| 3.6952 | 8000  | -             | 1.1090          | 0.9599          |
+| 3.9261 | 8500  | -             | 1.1000          | 0.9602          |
+| 4.0    | 8660  | 1.7181        | -               | -               |
+| 4.1570 | 9000  | -             | 1.1028          | 0.9587          |
+| 4.3880 | 9500  | -             | 1.1046          | 0.9592          |
+| 4.6189 | 10000 | -             | 1.0984          | 0.9596          |
+| 4.8499 | 10500 | -             | 1.0925          | 0.9598          |
+| 5.0    | 10825 | 1.6411        | -               | -               |
+| 5.0808 | 11000 | -             | 1.0932          | 0.9600          |
+| 5.3118 | 11500 | -             | 1.0890          | 0.9596          |
+| 5.5427 | 12000 | -             | 1.0831          | 0.9600          |
+| 5.7737 | 12500 | -             | 1.0858          | 0.9600          |
+| 6.0    | 12990 | 1.6083        | -               | -               |
 
 
 ### Framework Versions