parameters guide
samplers guide
model generation
role play settings
quant selection
arm quants
iq quants vs q quants
optimal model setting
gibberish fixes
coherence
instruction following
quality generation
chat settings
quality settings
llamacpp server
llamacpp
lmstudio
sillytavern
koboldcpp
backyard
ollama
model generation steering
steering
model generation fixes
text generation webui
ggufs
exl2
full precision
quants
imatrix
neo imatrix
llama
llama-3
gemma
gemma2
gemma3
llama-2
llama-3.1
llama-3.2
mistral
mixture of experts
mixtral
Update README.md
README.md CHANGED
@@ -78,7 +78,7 @@ You will get higher quality operation overall - stronger prose, better answers,

---

- PARAMETERS AND SAMPLERS

---

@@ -106,9 +106,11 @@ Everything else is "zeroed" / "disabled".

These parameters/settings are considered both safe and default and in most cases available to all users in all AI/LLM apps.

---

- <

---

@@ -157,6 +159,19 @@ For CLASS3 and CLASS4 the most important setting is "SMOOTHING FACTOR" (Quadrati

https://docs.sillytavern.app/usage/common-settings/

NOTE:

It appears that Silly Tavern also supports "DRY" and "XTC" too ; but it is not yet in the documentation at the time of writing.

@@ -170,7 +185,7 @@ OTHER PROGRAMS:

Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.

- In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Sillytavern", "Olama",

You can also use llama_cpp directly too. (IE: llama-server.exe) ; see :

@@ -182,16 +197,28 @@ Special note:

It appears the "DRY" / "XTC" samplers have been added to LLAMACPP and SILLYTAVERN.

- It is available via "llama-server.exe". Likely this sampler will also become available "downstream" in applications that use LLAMACPP in due time.

[ https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md ]

---

- DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS

---

For additional details on these sampler settings (including advanced ones) you may also want to check out:

https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab

@@ -219,9 +246,11 @@ LLAMACPP-SERVER EXE:

https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md

---

-

---

@@ -243,12 +272,20 @@ For reference here are some Class 3/4 models:

[ https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF ]

[ https://huggingface.co/DavidAU/L3-DARKEST-PLANET-16.5B-GGUF ]

[ https://huggingface.co/DavidAU/MN-DARKEST-UNIVERSE-29B-GGUF ]

[ https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-23.5B-GGUF ]

Although Class 3 and Class 4 models will work when used within their specific use case(s) with the standard parameters and settings on the model card, I recognize that users want either a smoother experience
and/or want to use these models for other than their intended use case(s), and that is in part why I created this document.

@@ -264,7 +301,7 @@ There are no "Class 5" models published... yet.

---

- QUANTS

---

@@ -351,7 +388,7 @@ Suggestions for Imatrix NEO quants:

---

- Quick Reference Table

---

@@ -361,6 +398,8 @@ https://huggingface.co/EnragedAntelope

https://github.com/EnragedAntelope

Please see sections below this for advanced usage, more details, settings, notes etc etc.

<small>

@@ -383,7 +422,7 @@ Please see sections below this for advanced usage, more details, settings, notes

| **Penalty Samplers** |

- | repeat-last-n | Number of tokens to consider for penalties. Critical for preventing repetition. Default: 64 |

| repeat-penalty | Penalizes repeated token sequences. Range: 1.0-1.15. Default: 1.0 |

@@ -440,7 +479,7 @@ Please see sections below this for advanced usage, more details, settings, notes

---

- HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)

---

@@ -487,7 +526,7 @@ However sometimes parameters and/or samplers are required to better "wrangle" th

---

- Section 1a : PRIMARY PARAMETERS - ALL APPS

---

@@ -562,7 +601,7 @@ Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 gener

---

- Section 1b : PENALITY SAMPLERS - ALL APPS

---

@@ -650,13 +689,13 @@ For "text-gen-webui" and "Koboldcpp" these are directly accessible.

Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)

-

Mirostat learning rate, parameter eta (default: 0.1) " mirostat_eta "

mirostat_tau: 5-8 is a good value.

-

Mirostat target entropy, parameter tau (default: 5.0) " mirostat_tau "

@@ -676,12 +715,13 @@ CLASS 3: models it is suggested to use this to assist with generation (min setti

CLASS 4: models it is highly recommended with Mirostat 1 or 2 + mirostat_tau @ 6 to 8 and mirostat_eta at .1 to .5

-

dynamic temperature range (default: 0.0, 0.0 = disabled)

-

dynamic temperature exponent (default: 1.0)

@@ -721,13 +761,15 @@ Locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)

If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.

- <B>

xtc probability (default: 0.0, 0.0 = disabled)

Probability that the removal will actually happen. 0 disables the sampler. 1 makes it always happen.

-

xtc threshold (default: 0.1, 1.0 = disabled)

@@ -756,9 +798,9 @@ Careful testing is required, as this can have unclear side effects.

---

- SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":

- Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP"

---

@@ -776,7 +818,7 @@ Most of these are also available in KOBOLDCPP too (via settings -> samplers) aft

I am not going to touch on all of the samplers / parameters, just the main ones at the moment.

- However, you should also check / test operation of:

a] Affects per token generation:

@@ -864,7 +906,7 @@ https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_

https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/

- <B>QUADRATIC SAMPLING

This sampler alters the "score" of ALL TOKENS at the time of generation and as a result affects the entire generation of the model. See the "Additional Links" section above for more information.

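The DRY sampler referenced in the Reddit links above penalizes tokens that would extend a sequence already present in the context, with a penalty that grows exponentially with the length of the repeat. Below is a minimal sketch of the commonly described penalty formula; the parameter names `multiplier`, `base` and `allowed_length` follow the koboldcpp / text-generation-webui settings, and the exact way the penalty is combined with the logits differs per app (this is illustrative, not any app's literal implementation):

```python
def dry_penalty(match_len: int, multiplier: float = 0.8,
                base: float = 1.75, allowed_length: int = 2) -> float:
    """Penalty for a candidate token that would extend a repeated sequence
    of `match_len` tokens. Repeats shorter than `allowed_length` are free;
    beyond that the penalty grows exponentially in `base`."""
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)

# Short repeats cost nothing; long verbatim repeats get expensive fast.
for n in range(1, 7):
    print(n, round(dry_penalty(n), 3))
```

This is why DRY discourages long verbatim loops while leaving short, natural repetitions (articles, names) mostly untouched.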
@@ -78,7 +78,7 @@ You will get higher quality operation overall - stronger prose, better answers,

---

+ <h2>TESTING / Generation Example PARAMETERS AND SAMPLERS</h2>

---

@@ -106,9 +106,11 @@ Everything else is "zeroed" / "disabled".

These parameters/settings are considered both safe and default and in most cases available to all users in all AI/LLM apps.

+ Note for Class 3/Class 4 models (discussed below) "repeat-last-n" is a CRITICAL setting.

---

+ <h2>SOURCE FILES for my Models / APPS to Run LLMs / AIs:</h2>

---

@@ -157,6 +159,19 @@ For CLASS3 and CLASS4 the most important setting is "SMOOTHING FACTOR" (Quadrati

https://docs.sillytavern.app/usage/common-settings/

+ Critical Note:
+
+ Silly Tavern allows you to "connect" (via API) to different AI programs/apps like Koboldcpp, Llamacpp (server), Text Generation Webui, Lmstudio, Ollama ... etc etc.
+
+ You "load" a model in one of these, then connect Silly Tavern to the app via API. This way you can use any model, and Sillytavern becomes the interface between
+ the AI model and you directly. Sillytavern opens an interface in your browser.
+
+ In Sillytavern you can then adjust parameters, samplers and advanced samplers ; there are also PRESET parameters/samplers, and you can save your favorites too.
+
+ Currently, at the time of this writing, connecting Silly Tavern via KoboldCPP or Text Generation Webui will provide the most samplers/parameters.
+
+ However for some, connecting to Lmstudio, LlamaCPP, or Ollama may be preferred.

NOTE:

It appears that Silly Tavern also supports "DRY" and "XTC" too ; but it is not yet in the documentation at the time of writing.

@@ -170,7 +185,7 @@ OTHER PROGRAMS:

Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.

+ In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Sillytavern", "Ollama", and "LMStudio" (as well as other apps too).

You can also use llama_cpp directly too. (IE: llama-server.exe) ; see :

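When driving llama-server over its API, the samplers discussed here are plain JSON fields on the request. A minimal sketch follows; the field names are taken from the llama.cpp server README, while the prompt, the values, and the local address/port are placeholder assumptions for illustration only:

```python
import json
import urllib.request

# Sampler settings as JSON fields for a running llama-server instance.
# Values below are illustrative placeholders, not recommendations.
payload = {
    "prompt": "Continue the scene:",
    "n_predict": 256,
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "repeat_last_n": 64,   # penalty window
    "repeat_penalty": 1.1,
    "mirostat": 0,         # 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",  # llama-server's default local port
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:   # uncomment with a server running
#     print(json.loads(resp.read())["content"])
print(json.dumps(payload, indent=2))
```

The same payload shape works from any language or app that can POST JSON, which is why "API / headless / server mode" exposes the full sampler set.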
@@ -182,16 +197,28 @@ Special note:

It appears the "DRY" / "XTC" samplers have been added to LLAMACPP and SILLYTAVERN.

+ It is available (Llamacpp) via "llama-server.exe". Likely this sampler will also become available "downstream" in applications that use LLAMACPP in due time.

[ https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md ]

+ Operating Systems:
+
+ Most AI/LLM apps operate on Windows, Mac, and Linux.

---

+ <h2>DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS:</h2>

---

+ Most AI / LLM apps allow saving a "profile" of parameters and samplers - "favorite" settings.
+
+ Text Generation Web Ui, Koboldcpp, and Silly Tavern all have this feature, and also "presets" (parameters/samplers already set) too.
+
+ Other AI/LLM apps also have this feature to varying degrees.
+
+ DETAILS on PARAMETERS / SAMPLERS:

For additional details on these sampler settings (including advanced ones) you may also want to check out:

https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab

@@ -219,9 +246,11 @@ LLAMACPP-SERVER EXE:

https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md

+ I have also added notes in the sections below as well.

---

+ <h2>Class 1, 2, 3 and 4 model critical notes:</h2>

---

@@ -243,12 +272,20 @@ For reference here are some Class 3/4 models:

[ https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF ]

+ (note the Grand Horror series contains class 2, 3 and 4 models)

[ https://huggingface.co/DavidAU/L3-DARKEST-PLANET-16.5B-GGUF ]

+ (note the Dark Planet series contains Class 1, 2 and Class 3/4 models)

[ https://huggingface.co/DavidAU/MN-DARKEST-UNIVERSE-29B-GGUF ]

+ (this model has exceptional prose abilities in all areas)

[ https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-23.5B-GGUF ]

+ (note the Grand Gutenberg Madness/Darkness (12B) models are class 1, but compressed versions of the 23.5B)

Although Class 3 and Class 4 models will work when used within their specific use case(s) with the standard parameters and settings on the model card, I recognize that users want either a smoother experience
and/or want to use these models for other than their intended use case(s), and that is in part why I created this document.

@@ -264,7 +301,7 @@ There are no "Class 5" models published... yet.

---

+ <h2>QUANTS:</h2>

---

@@ -351,7 +388,7 @@ Suggestions for Imatrix NEO quants:

---

+ <h2>Quick Reference Table</h2>

---

@@ -361,6 +398,8 @@ https://huggingface.co/EnragedAntelope

https://github.com/EnragedAntelope

+ This section will get you started - especially with class 3 and 4 models - and the detail section will cover settings / control in more depth below.

Please see sections below this for advanced usage, more details, settings, notes etc etc.

<small>

@@ -383,7 +422,7 @@ Please see sections below this for advanced usage, more details, settings, notes

| **Penalty Samplers** |

+ | repeat-last-n | Number of tokens to consider for penalties. Critical for preventing repetition. Default: 64 (Class 3/4 - but see notes) |

| repeat-penalty | Penalizes repeated token sequences. Range: 1.0-1.15. Default: 1.0 |

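To make the two penalty settings in the table above concrete, here is a minimal sketch of how this style of repetition penalty is typically applied (illustrative only, not the literal llama.cpp code): only token ids seen inside the `repeat-last-n` window are touched, and the penalty always pushes a token's logit down, whether it was positive or negative.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1, repeat_last_n=64):
    """Penalize token ids that appear in the last `repeat_last_n` tokens.
    Positive logits are divided by `penalty`, negative logits multiplied,
    so a penalized token always becomes less likely."""
    window = set(recent_tokens[-repeat_last_n:])
    out = list(logits)
    for t in window:
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [2.0, -1.0, 0.5, 3.0]            # toy vocabulary of 4 tokens
print(apply_repeat_penalty(logits, recent_tokens=[0, 1]))
```

This also shows why `repeat-last-n` matters so much for Class 3/4 models: it decides how far back the "recently seen" window reaches, and therefore how many tokens are eligible for the penalty at all.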
@@ -440,7 +479,7 @@ Please see sections below this for advanced usage, more details, settings, notes

---

+ <h2>ADVANCED: HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)</h2>

---

@@ -487,7 +526,7 @@ However sometimes parameters and/or samplers are required to better "wrangle" th

---

+ <h2>Section 1a : PRIMARY PARAMETERS - ALL APPS:</h2>

---

@@ -562,7 +601,7 @@ Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 gener

---

+ <h2>Section 1b : PENALTY SAMPLERS - ALL APPS:</h2>

---

@@ -650,13 +689,13 @@ For "text-gen-webui" and "Koboldcpp" these are directly accessible.

Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)

+ mirostat-lr

Mirostat learning rate, parameter eta (default: 0.1) " mirostat_eta "

mirostat_tau: 5-8 is a good value.

+ mirostat-ent

Mirostat target entropy, parameter tau (default: 5.0) " mirostat_tau "

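As a rough picture of what the Mirostat settings above are doing, here is a sketch in the style of Mirostat 2.0 (illustrative only, not the exact llama.cpp implementation): tokens whose "surprise" exceeds a moving threshold `mu` are dropped before sampling, and after each step `mu` is nudged toward the target entropy `tau` at learning rate `eta`.

```python
import math
import random

def mirostat_v2_step(probs, mu, tau=5.0, eta=0.1, rng=random.Random(0)):
    """One decoding step of a Mirostat 2.0-style sampler (sketch).
    `probs` is the model's probability list, sorted descending."""
    # Drop tokens whose surprise (-log2 p) exceeds the current threshold mu.
    keep = [p for p in probs if -math.log2(p) <= mu] or probs[:1]
    total = sum(keep)
    r, acc = rng.random() * total, 0.0
    idx = len(keep) - 1
    for i, p in enumerate(keep):              # sample from the survivors
        acc += p
        if r <= acc:
            idx = i
            break
    surprise = -math.log2(keep[idx] / total)  # observed surprise
    mu += eta * (tau - surprise)              # feedback toward target entropy
    return idx, mu
```

A higher tau keeps more surprising tokens in play (matching the mirostat_tau 5-8 range noted above), while eta controls how quickly the threshold reacts to each step.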
@@ -676,12 +715,13 @@ CLASS 3: models it is suggested to use this to assist with generation (min setti

CLASS 4: models it is highly recommended with Mirostat 1 or 2 + mirostat_tau @ 6 to 8 and mirostat_eta at .1 to .5

+ <b>Dynamic Temperature</b>
+
+ dynatemp-range

dynamic temperature range (default: 0.0, 0.0 = disabled)

+ dynatemp-exp

dynamic temperature exponent (default: 1.0)

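The two dynatemp settings above can be pictured with a small sketch. This mirrors the entropy-based approach used by llama.cpp's dynamic temperature, but is illustrative only: the effective temperature slides between (temp - range) and (temp + range) according to how uncertain the model currently is, shaped by the exponent.

```python
import math

def dynamic_temperature(probs, base_temp=1.0, dyn_range=0.5, exponent=1.0):
    """Scale temperature by the normalized entropy of the candidate
    distribution: confident steps get a low temperature, uncertain steps
    a high one, within [base_temp - dyn_range, base_temp + dyn_range]."""
    min_t = max(0.0, base_temp - dyn_range)
    max_t = base_temp + dyn_range
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(probs))     # entropy of a uniform distribution
    norm = entropy / max_entropy if max_entropy > 0 else 0.0
    return min_t + (max_t - min_t) * norm ** exponent

print(dynamic_temperature([0.25] * 4))             # uncertain -> higher temperature
print(dynamic_temperature([1.0, 0.0, 0.0, 0.0]))   # confident -> lower temperature
```

With range 0 this collapses to plain fixed temperature, which is why 0.0 means "disabled".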
@@ -721,13 +761,15 @@ Locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)

If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.

+ <B>XTC</B>
+
+ xtc-probability

xtc probability (default: 0.0, 0.0 = disabled)

Probability that the removal will actually happen. 0 disables the sampler. 1 makes it always happen.

+ xtc-threshold

xtc threshold (default: 0.1, 1.0 = disabled)

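Putting the two XTC settings above together, here is a sketch of the sampler's described behaviour (the real implementations live in llama.cpp and text-generation-webui; this is illustrative): with chance `xtc-probability`, every candidate at or above `xtc-threshold` except the least likely of them is removed, pushing the model past its most predictable choices.

```python
import random

def xtc_filter(probs, threshold=0.1, probability=0.5, rng=random.Random(0)):
    """Sketch of XTC ("exclude top choices"). `probs` is sorted descending;
    returns the indices of candidates that survive. The shared rng default
    keeps the example deterministic."""
    if rng.random() >= probability:
        return list(range(len(probs)))        # sampler did not trigger
    above = [i for i, p in enumerate(probs) if p >= threshold]
    if len(above) < 2:
        return list(range(len(probs)))        # nothing to cut
    cut = set(above[:-1])                     # keep only the least likely "top" token
    return [i for i in range(len(probs)) if i not in cut]
```

Note how a higher threshold makes the sampler trigger less often (fewer tokens qualify as "top choices"), which is why 1.0 effectively disables it.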
@@ -756,9 +798,9 @@ Careful testing is required, as this can have unclear side effects.

---

+ <h2>SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":</h2>

+ <B>Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP".</B>

---

@@ -776,7 +818,7 @@ Most of these are also available in KOBOLDCPP too (via settings -> samplers) aft

I am not going to touch on all of the samplers / parameters, just the main ones at the moment.

+ However, you should also check / test the operation of the following (these are in Text Generation WebUI, and may be available via API / in Sillytavern when connected to Text Generation Webui):

a] Affects per token generation:

@@ -864,7 +906,7 @@ https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_

https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/

+ <B>QUADRATIC SAMPLING: AKA "Smoothing"</B>

This sampler alters the "score" of ALL TOKENS at the time of generation and as a result affects the entire generation of the model. See the "Additional Links" section above for more information.

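To illustrate what "altering the score of ALL TOKENS" means here, below is a sketch of the quadratic transform in one commonly cited formulation (check the text-generation-webui source for the exact expression; apps treat a smoothing factor of 0 as "disabled" rather than applying the transform): each logit is pulled down from the maximum by its squared distance times the smoothing factor, so the whole distribution is reshaped rather than just the tail.

```python
def quadratic_smoothing(logits, smoothing_factor=0.3):
    """Transform every logit as: max - smoothing_factor * (logit - max)**2.
    The top token keeps its score; the further a token sits below the top,
    the harder it is pushed down, so a higher smoothing factor sharpens
    (stabilizes) the distribution - useful for Class 3/4 models."""
    m = max(logits)
    return [m - smoothing_factor * (x - m) ** 2 for x in logits]

print(quadratic_smoothing([3.0, 1.0, 0.0], smoothing_factor=0.5))
```

Because every token's score passes through the same curve, the effect is felt across the entire generation, which matches the description above.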