Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct

@@ -1,32 +1,28 @@
 ---
 license: apache-2.0
 datasets:
 - Locutusque/inst_mix_v2_top_100k
-language:
-- en
 pipeline_tag: text-generation
 widget:
-- text: >-
-    <|USER|> Design a Neo4j database and Cypher function snippet to Display
-    Extreme Dental hygiene: Using Mouthwash for Analysis for Beginners.
-    Implement if/else or switch/case statements to handle different conditions
-    related to the Consent. Provide detailed comments explaining your control
-    flow and the reasoning behind each decision. <|ASSISTANT|>
 - text: '<|USER|> Write me a story about a magical place. <|ASSISTANT|> '
-- text: >-
-    <|USER|> Write me an essay about the life of George Washington
-    <|ASSISTANT|>
 - text: '<|USER|> Solve the following equation 2x + 10 = 20 <|ASSISTANT|> '
-- text: >-
-    <|USER|> Craft me a list of some nice places to visit around the world.
-    <|ASSISTANT|>
-- text: >-
-    <|USER|> How to manage a lazy employee: Address the employee verbally. Don't
-    allow an employee's laziness or lack of enthusiasm to become a recurring
-    issue. Tell the employee you're hoping to speak with them about workplace
-    expectations and performance, and schedule a time to sit down together.
-    Question: To manage a lazy employee, it is suggested to talk to the
-    employee. True, False, or Neither? <|ASSISTANT|>
 inference:
   parameters:
     temperature: 0.5
@@ -35,8 +31,109 @@ inference:
     top_k: 30
     max_new_tokens: 250
     repetition_penalty: 1.15
-tags:
-- merge
 ---
 # LocutusqueXFelladrin-TinyMistral248M-Instruct
 This model was created by merging Locutusque/TinyMistral-248M-Instruct and Felladrin/TinyMistral-248M-SFT-v4 using mergekit. After the merge, the resulting model was further trained on ~20,000 examples from the Locutusque/inst_mix_v2_top_100k dataset at a low learning rate to further normalize the weights. The following is the YAML config used for the merge:
@@ -56,4 +153,17 @@ dtype: float16
 The resulting model combines the strengths of both parents: Locutusque/TinyMistral-248M-Instruct's coding capabilities and reasoning skills, and Felladrin/TinyMistral-248M-SFT-v4's low hallucination rate and instruction-following capabilities. It performs strongly for its size.
 ## Evaluation
-Found in the Open LLM Leaderboard.

 ---
+language:
+- en
 license: apache-2.0
+tags:
+- merge
 datasets:
 - Locutusque/inst_mix_v2_top_100k
 pipeline_tag: text-generation
 widget:
+- text: '<|USER|> Design a Neo4j database and Cypher function snippet to Display Extreme
+    Dental hygiene: Using Mouthwash for Analysis for Beginners. Implement if/else
+    or switch/case statements to handle different conditions related to the Consent.
+    Provide detailed comments explaining your control flow and the reasoning behind
+    each decision. <|ASSISTANT|> '
 - text: '<|USER|> Write me a story about a magical place. <|ASSISTANT|> '
+- text: '<|USER|> Write me an essay about the life of George Washington <|ASSISTANT|> '
 - text: '<|USER|> Solve the following equation 2x + 10 = 20 <|ASSISTANT|> '
+- text: '<|USER|> Craft me a list of some nice places to visit around the world. <|ASSISTANT|> '
+- text: '<|USER|> How to manage a lazy employee: Address the employee verbally. Don''t
+    allow an employee''s laziness or lack of enthusiasm to become a recurring issue.
+    Tell the employee you''re hoping to speak with them about workplace expectations
+    and performance, and schedule a time to sit down together. Question: To manage
+    a lazy employee, it is suggested to talk to the employee. True, False, or Neither?
+    <|ASSISTANT|> '
 inference:
   parameters:
     temperature: 0.5
     top_k: 30
     max_new_tokens: 250
     repetition_penalty: 1.15
+model-index:
+- name: LocutusqueXFelladrin-TinyMistral248M-Instruct
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 24.74
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 27.79
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 26.12
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 40.12
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 49.09
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 0.0
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct
+      name: Open LLM Leaderboard
 ---
 # LocutusqueXFelladrin-TinyMistral248M-Instruct
 This model was created by merging Locutusque/TinyMistral-248M-Instruct and Felladrin/TinyMistral-248M-SFT-v4 using mergekit. After the merge, the resulting model was further trained on ~20,000 examples from the Locutusque/inst_mix_v2_top_100k dataset at a low learning rate to further normalize the weights. The following is the YAML config used for the merge:
 The resulting model combines the strengths of both parents: Locutusque/TinyMistral-248M-Instruct's coding capabilities and reasoning skills, and Felladrin/TinyMistral-248M-SFT-v4's low hallucination rate and instruction-following capabilities. It performs strongly for its size.
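The widget prompts and the `inference.parameters` block in the front matter suggest how the model is meant to be queried. The following is a minimal sketch of that usage, not the author's own code: the `<|USER|>` / `<|ASSISTANT|>` template and the four sampling values are copied from the card, while `build_prompt`, `generate`, `GENERATION_KWARGS`, and the `do_sample=True` flag are assumptions made for illustration. Any parameters hidden by the diff hunk boundary are deliberately left out.

```python
# Prompt tags as they appear in the widget examples of the model card.
USER_TAG = "<|USER|>"
ASSISTANT_TAG = "<|ASSISTANT|>"

# Repository id as it appears in the leaderboard URLs of the card.
MODEL_ID = "Locutusque/LocutusqueXFelladrin-TinyMistral248M-Instruct"

# Values visible under `inference.parameters` in the front matter.
GENERATION_KWARGS = {
    "temperature": 0.5,
    "top_k": 30,
    "max_new_tokens": 250,
    "repetition_penalty": 1.15,
}


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the <|USER|> ... <|ASSISTANT|> template (hypothetical helper)."""
    return f"{USER_TAG} {instruction.strip()} {ASSISTANT_TAG} "


def generate(instruction: str) -> str:
    """Load the model and sample one completion (hypothetical helper)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    # Assumption: sampling is enabled; temperature and top_k have no effect otherwise.
    output_ids = model.generate(**inputs, do_sample=True, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Only prints the formatted prompt; call generate(...) to download and run the model.
    print(build_prompt("Write me a story about a magical place."))
```

Keeping the prompt builder separate from model loading makes it easy to check that a prompt matches the widget examples before downloading any weights.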
 ## Evaluation
+Found in the Open LLM Leaderboard.
+### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__LocutusqueXFelladrin-TinyMistral248M-Instruct).
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |27.98|
+|AI2 Reasoning Challenge (25-Shot)|24.74|
+|HellaSwag (10-Shot)              |27.79|
+|MMLU (5-Shot)                    |26.12|
+|TruthfulQA (0-shot)              |40.12|
+|Winogrande (5-shot)              |49.09|
+|GSM8k (5-shot)                   | 0.00|
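
The `Avg.` row appears to be the unweighted mean of the six benchmark scores. That is an assumption about how the leaderboard aggregates, but the arithmetic is consistent with the table. A short sketch that recomputes it from the values above (the `scores` dictionary and variable names are illustrative):

```python
# Scores copied from the evaluation table in the model card.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 24.74,
    "HellaSwag (10-Shot)": 27.79,
    "MMLU (5-Shot)": 26.12,
    "TruthfulQA (0-shot)": 40.12,
    "Winogrande (5-shot)": 49.09,
    "GSM8k (5-shot)": 0.00,
}

# Assumption: the reported average is a plain arithmetic mean over all six tasks.
average = sum(scores.values()) / len(scores)

print(f"Avg. = {average:.2f}")  # prints "Avg. = 27.98"
```

Note that the GSM8k score of 0.00 counts toward the mean like any other task, which pulls the average below every score except ARC, HellaSwag, MMLU, and GSM8k itself.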