---
tags:
- merge
---

![](https://iili.io/JUgMLXS.png)

# Chaifighter Latte 14B

Finally here, Chaifighter Latte is the successor to the Chaifighter 20B models. Like its predecessors, it is Mistral-based, but it is now dramatically reduced in size. Chaifighter Latte is formulated for creative, rich, verbose writing without sacrificing intelligence, awareness, or context-following ability. It retains the great taste of the original, and despite being significantly lighter at 14 billion parameters, it performs even better. Try it for yourself!

## Prompt Template: Alpaca

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
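
If you're scripting against the model directly instead of using a frontend, here's a minimal sketch of wrapping user text in the template above (the constant and helper names are mine, purely illustrative):

```python
# Minimal sketch: wrap user text in the Alpaca template from this card.
# ALPACA_TEMPLATE and build_prompt are illustrative names, not part of the model.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n### Response:\n"
)

def build_prompt(user_text: str) -> str:
    return ALPACA_TEMPLATE.format(prompt=user_text)
```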

## Recommended Settings: Universal-Light

Here are some setting ranges that tend to work for me. They aren't strict values, and there's a bit of leeway to them. Feel free to experiment a bit!

* Temperature: **1.0** *to* **1.25** (adjust to taste, but keep it low; Chaifighter is creative enough on its own)
* Min-P: **0.1** (increasing it might help if the output goes cuckoo, but I suggest keeping it there)
* Repetition Penalty: **1.05** *to* **1.1** (high values aren't needed and usually degrade output)
* Rep. Penalty Range: **256** *or* **512**
* *(all other samplers disabled)*
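
For reference, here's a rough sketch of these settings in a raw `transformers` call. Treat it as illustrative, not tested: the repo id is assumed from this card's title, `min_p` requires a fairly recent `transformers` release, and the repetition penalty *range* is a frontend-side knob that plain `generate()` doesn't expose.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "matchaaaaa/Chaifighter-Latte-14B"  # assumed repo id, check before using

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16, device_map="auto")

# build_prompt is the Alpaca helper sketched earlier in this card.
inputs = tokenizer(
    build_prompt("Write the opening scene of a cozy mystery."),
    return_tensors="pt",
).to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.1,          # 1.0 to 1.25, per the list above
    min_p=0.1,
    repetition_penalty=1.05,  # 1.05 to 1.1, per the list above
    max_new_tokens=512,
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```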

## The Deets

### Mergekit

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). This model was merged using the passthrough merge method.

### Models Merged

* [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
* [Sao10K/Fimbulvetr-11B-v2](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2)
* [Sao10K/Frostwind-v2.1-m7](https://huggingface.co/Sao10K/Frostwind-v2.1-m7)
* [Gryphe/MythoMist-7b](https://huggingface.co/Gryphe/MythoMist-7b)

### The Sauce

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: SanjiWatsuki/Kunoichi-7B
        layer_range: [16, 24]
merge_method: passthrough
dtype: float32
name: Kuno-splice
---
slices:
  - sources:
      - model: Sao10K/Fimbulvetr-11B-v2
        layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: Fimbul-splice
---
models:
  - model: Kuno-splice
    parameters:
      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
  - model: Fimbul-splice
    parameters:
      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
merge_method: dare_linear # according to some paper, "DARE is all you need"
base_model: Kuno-splice
dtype: float32
name: Kuno-Fimbul-splice
---
models:
  - model: Sao10K/Frostwind-v2.1-m7
  - model: Gryphe/MythoMist-7b
    parameters:
      weight: 0.37
      density: 0.8
merge_method: dare_ties
base_model: Sao10K/Frostwind-v2.1-m7
dtype: float32
name: Frosty-Mytho
---
slices:
  - sources:
      - model: Sao10K/Fimbulvetr-11B-v2
        layer_range: [32, 40]
merge_method: passthrough
dtype: float32
name: Fimbul-splice-2
---
slices:
  - sources:
      - model: Frosty-Mytho
        layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: Frosty-Mytho-splice
---
models:
  - model: Fimbul-splice-2
    parameters:
      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
  - model: Frosty-Mytho-splice
    parameters:
      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
merge_method: dare_linear # according to some paper, "DARE is all you need"
base_model: Fimbul-splice-2
dtype: float32
name: Fimbul-Frosty-Mytho-splice
---
slices:
  - sources: # kunoichi
      - model: SanjiWatsuki/Kunoichi-7B
        layer_range: [0, 16]
  - sources: # kunoichi gradient fimbul splice
      - model: Kuno-Fimbul-splice
        layer_range: [0, 8]
  - sources: # fimbulvetr
      - model: Sao10K/Fimbulvetr-11B-v2
        layer_range: [16, 32]
  # insert splice here
  - sources: # fimbulvetr gradient fwmm splice
      - model: Fimbul-Frosty-Mytho-splice
        layer_range: [0, 8]
  - sources: # frostwind + mythomist
      - model: Frosty-Mytho
        layer_range: [16, 32]
merge_method: passthrough
dtype: float32
name: Chaifighter-Latte-14B
```
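
Two notes on this config. First, it's a multi-document YAML: each `---`-separated stage defines a named intermediate model that later stages refer to, so as far as I know you can't feed the whole thing to a single `mergekit-yaml` invocation; you'd run the stages individually in the order written (or use mergekit's multi-stage tooling, if your version ships one).

Second, as a sanity check on the "14B" in the name, here's the layer math of the final passthrough stack (my arithmetic, not part of the config):

```python
# Layer counts of the slices stacked by the final passthrough stage.
slices = {
    "SanjiWatsuki/Kunoichi-7B [0, 16]": 16,
    "Kuno-Fimbul-splice [0, 8]": 8,
    "Sao10K/Fimbulvetr-11B-v2 [16, 32]": 16,
    "Fimbul-Frosty-Mytho-splice [0, 8]": 8,
    "Frosty-Mytho [16, 32]": 16,
}
print(sum(slices.values()))  # 64 layers, double a stock 32-layer Mistral 7B -> roughly 14B params
```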

### The Thought Process

So, I wanted the first layers to be Kunoichi. Kunoichi was chosen for its strong context- and instruction-following abilities, as well as being a really smart model overall. Plus, it's no slouch at RP. I think this is partly what gave previous Chaifighter models the awareness that many people liked. To best harness its stellar prompt processing performance, I put Kunoichi at the head of the stack.

Next, I applied a gradient merge that I call a "splice". Splicing models like this addresses what I believe has significantly hurt the earlier Chaifighter models and many other frankenmerges: layer dissimilarity. Splicing the end of one stack from model A with the beginning of another stack from model B in theory helps smooth over those differences and bring everything together.
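
The weight lists in the `dare_linear` stages above are what implement the splice: across the spliced layers, one model's contribution ramps down exactly as the other's ramps up. A quick check of that property (my own illustration, not part of the merge config):

```python
# The two weight gradients from the dare_linear splice stages above.
kuno =   [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0]
fimbul = [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1]

# At every point the weights are complementary, so each blended layer
# gets a full-strength mix of the two models rather than an attenuated one.
assert all(abs(a + b - 1.0) < 1e-9 for a, b in zip(kuno, fimbul))
```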

The second model I introduced is Fimbulvetr-v2. This should be no surprise, as it's also a well-established ingredient of the Chaifighter recipe. Boasting incredibly strong coherence, it is the glue that can hold a story together, even with multiple characters and over longer contexts. I felt the best place for Fimbulvetr was right after Kunoichi.

Then comes another splice, this time easing Fimbulvetr into the final Frostwind/MythoMist block.

Lastly, I picked Frostwind and MythoMist for the final layers of the merge. I wanted to introduce MythoMist because I felt it was what gave Chaifighter its flavorful writing. I paired it with Frostwind, which is a very creative writer as well, and I felt that the two (with more emphasis on Frostwind for consistency) produced high-quality outputs up to my standards.

Thanks for looking at my model, and have a fantastic day! :)