# 🌌 Magistaroth 24B v1.1
# Merge Method
A custom merge method named `pdq` was invented for this model. Instead of using its own YAML, it acts as a post-merge processor applied directly to the merged model produced by the [original yaml](https://huggingface.co/DarkArtsForge/Magistaroth-24B-v1/raw/main/mergekit_config.yml). `pdq` aims to enhance creativity by re-scanning the original donor models, encouraging them to explore the 'dark matter' regions of the weight vectors and synergistically augment the merged base with more unique novelty. For **Magistaroth v1.1**, I tested both orderings: the v1 `Della → PDQ → MPOA` pipeline and `Della → MPOA → PDQ`.
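
The exact math behind `pdq` isn't published, so the following is purely a hypothetical illustration of what a post-merge processor in this spirit could look like: for one weight tensor, take the donor's residual relative to the merged weights, discard the part already aligned with the merge, and add back a scaled slice of the orthogonal remainder. The function name, the orthogonal-residual heuristic, and the `scale` parameter are all assumptions, not the real `pdq`:

```python
import numpy as np

def pdq_like_step(merged: np.ndarray, donor: np.ndarray, scale: float = 0.1) -> np.ndarray:
    """Hypothetical sketch of a post-merge augmentation step (NOT the real pdq).

    Adds back a scaled portion of the donor residual that is orthogonal to the
    merged weights -- a stand-in for the 'dark matter' intuition above.
    """
    delta = donor - merged                    # what the donor has that the merge dropped
    m = merged.ravel()
    d = delta.ravel()
    # Component of the residual already captured by the merged weights
    aligned = (d @ m) / (m @ m + 1e-12) * m
    orthogonal = d - aligned                  # the part the merge never explored
    return merged + scale * orthogonal.reshape(merged.shape)
```

In a real pipeline this step would run per-tensor over every donor listed in the mergekit config, after the merge itself has finished.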
It turns out that both orderings are very creative. `MPOA → PDQ` is interesting because it doesn't re-introduce any refusals; however, `PDQ → MPOA` is much smarter, and the difference in the Q0 bench reflects this (9451 vs 12648). `Scale 1.2` was the ablation threshold required to disable refusals. The result is the most creative, detailed, and uncensored variant of the configurations tested.
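
For context, refusal ablation is commonly implemented by removing each weight row's projection onto a learned "refusal direction", with a scale above 1.0 slightly over-removing it. A minimal numpy sketch, assuming that is roughly what `Scale 1.2` controls here (the direction itself and the per-layer details are illustrative):

```python
import numpy as np

def ablate_direction(W: np.ndarray, refusal_dir: np.ndarray, scale: float = 1.2) -> np.ndarray:
    """Subtract `scale` times the projection of each row of W onto a unit
    'refusal' direction. scale=1.0 removes the component exactly; 1.2 overshoots."""
    r = refusal_dir / np.linalg.norm(refusal_dir)
    return W - scale * np.outer(W @ r, r)
```

With `scale=1.0` the weights become exactly orthogonal to the direction; `1.2` flips a small negative component into them, which is one plausible reading of why that threshold was needed to fully disable refusals.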
### Bugs
There is a small risk of increased artifacts (missing spaces, misspelled or repeated words) because `pdq` pushes the limits of what's possible with transformers. These are rare and can be edited out if needed.
### Fully Uncensored
An unablated PDQ version was also tested (it has refusals), but the ablated versions seem to be more popular, so I'm only releasing this one for now.
### Settings
- Recommended `temp 1.0` and `topnsigma 1.25`
- `Mistral Tekken` chat template
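
Assuming `topnsigma` refers to the top-nσ sampler (keep only logits within n standard deviations of the maximum, then sample from the rest at the chosen temperature), the filtering step can be sketched as:

```python
import numpy as np

def top_n_sigma_filter(logits: np.ndarray, n: float = 1.25) -> np.ndarray:
    """Mask every logit further than n standard deviations below the max.
    Sampling then proceeds with softmax over the surviving logits."""
    threshold = logits.max() - n * logits.std()
    return np.where(logits >= threshold, logits, -np.inf)
```

At `temp 1.0` the surviving logits are used as-is; lowering `n` below 1.25 narrows the candidate set further.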