Naphula committed (verified)
Commit: 2e004f1 · 1 parent: 34772c2

Update README.md

Files changed (1):
  1. README.md (+208 −2)

README.md CHANGED
@@ -77,7 +77,7 @@ widget:
 
 ![Goetia](https://cdn-uploads.huggingface.co/production/uploads/68e840caa318194c44ec2a04/DHbuh4efzjCGpxDUciZ_-.jpeg)
 
-The "Della Edition" meant to test bridging 2501 and 2503 models. See [this page](https://huggingface.co/Naphula/Goetia-24B-v1.3/discussions/1) for more info.
+The "Della Edition" is meant to test bridging the 2501 and 2503 models. See [this post](https://huggingface.co/Naphula/Goetia-24B-v1.3/discussions/1) and [this other post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) for more info.
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
@@ -85,4 +85,210 @@ This is a merge of pre-trained language models created using [mergekit](https://
 ### Merge Methods
 
 This model was merged using the following merge method:
-- [DELLA](https://arxiv.org/abs/2406.11617)
+- [DELLA](https://arxiv.org/abs/2406.11617)
+
+```yaml
+architecture: MistralForCausalLM
+models:
+  - model: B:\24B\!models--anthracite-core--Mistral-Small-3.2-24B-Instruct-2506-Text-Only
+  - model: B:\24B\!models--TheDrummer--Cydonia-24B-v4.3
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!models--ReadyArt--4.2.0-Broken-Tutu-24b
+    parameters:
+      density: 0.8
+      weight: 0.05
+      epsilon: 0.1
+  - model: B:\24B\!models--zerofata--MS3.2-PaintedFantasy-v2-24B
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!models--TheDrummer--Magidonia-24B-v4.3
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!models--TheDrummer--Precog-24B-v1
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!models--zerofata--MS3.2-PaintedFantasy-v3-24B
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!models--ReadyArt--Broken-Tutu-24B-Transgression-v2.0
+    parameters:
+      density: 0.8
+      weight: 0.05
+      epsilon: 0.1
+  - model: B:\24B\!models--trashpanda-org--MS3.2-24B-Mullein-v2
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  # - model: B:\24B\!models--LatitudeGames--Hearthfire-24B
+  #   parameters:
+  #     density: 0.8
+  #     weight: 0.1
+  #     epsilon: 0.1
+  - model: B:\24B\!models--TheDrummer--Cydonia-24B-v4.2.0
+    parameters:
+      density: 0.8
+      weight: 0.1
+      epsilon: 0.1
+  - model: B:\24B\!models--TheDrummer--Magidonia-24B-v4.2.0
+    parameters:
+      density: 0.8
+      weight: 0.1
+      epsilon: 0.1
+  - model: B:\24B\!models--ConicCat--Mistral-Small-3.2-AntiRep-24B
+    parameters:
+      density: 0.8
+      weight: 0.15
+      epsilon: 0.1
+  - model: B:\24B\!models--Undi95--MistralThinker-v1.1
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--CrucibleLab--M3.2-24B-Loki-V2
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--Darkhn--M3.2-24B-Animus-V7.1
+    parameters:
+      density: 0.8
+      weight: 0.1
+      epsilon: 0.1
+  - model: B:\24B\Morax-24B-v1
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--FlareRebellion--WeirdCompound-v1.7-24b
+    parameters:
+      density: 0.8
+      weight: 0.1
+      epsilon: 0.1
+  # - model: B:\24B\!models--aixonlab--Eurydice-24b-v3.5
+  #   parameters:
+  #     density: 0.8
+  #     weight: 0.08
+  #     epsilon: 0.1
+  - model: B:\24B\!models--allura-forge--ms32-final-TEXTONLY
+    parameters:
+      density: 0.8
+      weight: 0.15
+      epsilon: 0.1
+  - model: B:\24B\!models--Delta-Vector--Rei-24B-KTO
+    parameters:
+      density: 0.8
+      weight: 0.15
+      epsilon: 0.1
+  - model: B:\24B\!models--Doctor-Shotgun--MS3.2-24B-Magnum-Diamond
+    parameters:
+      density: 0.8
+      weight: 0.15
+      epsilon: 0.1
+  - model: B:\24B\!models--ReadyArt--MS3.2-The-Omega-Directive-24B-Unslop-v2.1
+    parameters:
+      density: 0.8
+      weight: 0.15
+      epsilon: 0.1
+  # - model: B:\24B\!models--Gryphe--Codex-24B-Small-3.2
+  #   parameters:
+  #     density: 0.8
+  #     weight: 0.1
+  #     epsilon: 0.1
+  # - model: B:\24B\!models--CrucibleLab--M3.2-24B-Loki-V1.3
+  #   parameters:
+  #     density: 0.8
+  #     weight: 0.15
+  #     epsilon: 0.1
+  - model: B:\24B\!models--arcee-ai--Arcee-Blitz
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--ArliAI--Mistral-Small-24B-ArliAI-RPMax-v1.4
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  # - model: B:\24B\!models--PocketDoc--Dans-PersonalityEngine-V1.3.0-24b
+  #   parameters:
+  #     density: 0.8
+  #     weight: 0.1
+  #     epsilon: 0.1
+  - model: B:\24B\!models--ReadyArt--Dark-Nexus-24B-v2.0
+    parameters:
+      density: 0.8
+      weight: 0.2
+      epsilon: 0.1
+  - model: B:\24B\!models--Darkhn--M3.2-24B-Animus-V5.1-Pro
+    parameters:
+      density: 0.8
+      weight: 0.15
+      epsilon: 0.1
+  - model: B:\24B\!models--dphn--Dolphin-Mistral-24B-Venice-Edition
+    parameters:
+      density: 0.8
+      weight: 0.01
+      epsilon: 0.1
+  - model: B:\24B\!models--TroyDoesAI--BlackSheep-24B
+    parameters:
+      density: 0.8
+      weight: 0.01
+      epsilon: 0.1
+  - model: B:\24B\!models--TheDrummer--Cydonia-24B-v2
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--PocketDoc--Dans-DangerousWinds-V1.1.1-24b
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--trashpanda-org--MS-24B-Instruct-Mullein-v0
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+  - model: B:\24B\!models--OddTheGreat--Circuitry_24B_V.3
+    parameters:
+      density: 0.8
+      weight: 0.1
+      epsilon: 0.1
+  - model: B:\24B\!models--spacewars123--Space-Wars-24B-v1.00a
+    parameters:
+      density: 0.8
+      weight: 0.02
+      epsilon: 0.1
+# Total Donors: 33
+# Total Weights: 3.3
+# Seed: 420
+merge_method: della
+base_model: B:\24B\!models--anthracite-core--Mistral-Small-3.2-24B-Instruct-2506-Text-Only
+parameters:
+  lambda: 1.0
+  normalize: true # key variable to test
+  int8_mask: false
+dtype: float32
+out_dtype: bfloat16
+tokenizer:
+  source: base
+# chat_template: auto
+name: 📜 Goetia-24B-v1.3
+```
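
DELLA's core idea (per the paper linked above) is magnitude-informed stochastic pruning of the delta ("task vector") parameters, governed by the `density` and `epsilon` values in the config. The sketch below is a toy illustration only, not mergekit's implementation; the helper name `della_prune` and the scalar-list representation are hypothetical:

```python
import random

def della_prune(deltas, density=0.8, epsilon=0.1, seed=420):
    """Toy sketch of DELLA-style pruning: each delta gets a keep
    probability centred on `density` and spread over +/- epsilon/2 by
    magnitude rank (larger magnitude -> more likely kept); survivors
    are rescaled by 1/p_keep so the merge is unbiased in expectation."""
    rng = random.Random(seed)
    n = len(deltas)
    # rank indices by magnitude: rank 0 = smallest |delta|
    order = sorted(range(n), key=lambda i: abs(deltas[i]))
    keep_p = [0.0] * n
    for rank, i in enumerate(order):
        frac = rank / (n - 1) if n > 1 else 0.5
        keep_p[i] = density - epsilon / 2 + epsilon * frac
    # drop stochastically, rescale the kept deltas
    return [d / keep_p[i] if rng.random() < keep_p[i] else 0.0
            for i, d in enumerate(deltas)]

deltas = [0.5, -0.1, 0.9, 0.05]   # toy task-vector entries
pruned = della_prune(deltas)
```

With `density: 0.8` and `epsilon: 0.1`, keep probabilities land in roughly the 0.75–0.85 range, so about 20% of each donor's deltas are zeroed before merging.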
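
The config comments flag `normalize: true` as the key variable to test: as I read it, mergekit then divides by the sum of the donor weights, so only the weights' ratios matter, not their absolute scale. A minimal sketch of that arithmetic, with the active (uncommented) weights transcribed from the config above:

```python
# Active donor weights from the DELLA config above, in order.
weights = [0.2, 0.05, 0.2, 0.2, 0.2, 0.2, 0.2, 0.05, 0.2,
           0.1, 0.1, 0.15, 0.02, 0.02, 0.1, 0.02, 0.1,
           0.15, 0.15, 0.15, 0.15, 0.02, 0.02, 0.2, 0.15,
           0.01, 0.01, 0.02, 0.02, 0.02, 0.1, 0.02]

total = sum(weights)  # matches the "# Total Weights: 3.3" comment
# With normalize: true, each donor's effective contribution is weight / total,
# so the effective weights sum to 1:
effective = [w / total for w in weights]
```

Under this reading, doubling every weight in the config would leave the merge unchanged; only `normalize: false` would make the absolute scale matter.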