parameters guide
samplers guide
model generation
role play settings
quant selection
arm quants
iq quants vs q quants
optimal model setting
gibberish fixes
coherence
instruction following
quality generation
chat settings
quality settings
llamacpp server
llamacpp
lmstudio
sillytavern
koboldcpp
backyard
ollama
model generation steering
steering
model generation fixes
text generation webui
ggufs
exl2
full precision
quants
imatrix
neo imatrix
llama
llama-3
gemma
gemma2
gemma3
llama-2
llama-3.1
llama-3.2
mistral
mixture of experts
mixtral
Update README.md
README.md CHANGED
@@ -78,7 +78,7 @@ You will get higher quality operation overall - stronger prose, better answers,

---

- PARAMETERS AND SAMPLERS

---

@@ -106,9 +106,11 @@ Everything else is "zeroed" / "disabled".

These parameters/settings are considered both safe and default and in most cases available to all users in all AI/LLM apps.

---

- <

---

@@ -157,6 +159,19 @@ For CLASS3 and CLASS4 the most important setting is "SMOOTHING FACTOR" (Quadrati

https://docs.sillytavern.app/usage/common-settings/

NOTE:

It appears that Silly Tavern also supports "DRY" and "XTC" too ; but it is not yet in the documentation at the time of writing.

@@ -170,7 +185,7 @@ OTHER PROGRAMS:

Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.

- In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Sillytavern", "Olama",

You can also use llama_cpp directly too. (IE: llama-server.exe) ; see :

@@ -182,16 +197,28 @@ Special note:

It appears the "DRY" / "XTC" samplers have been added to LLAMACPP and SILLYTAVERN.

- It is available via "llama-server.exe". Likely this sampler will also become available "downstream" in applications that use LLAMACPP in due time.

[ https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md ]

---

- DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS

---

For additional details on these sampler settings (including advanced ones) you may also want to check out:

https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab

@@ -219,9 +246,11 @@ LLAMACPP-SERVER EXE:

https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md

---

-

---

@@ -243,12 +272,20 @@ For reference here are some Class 3/4 models:

[ https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF ]

[ https://huggingface.co/DavidAU/L3-DARKEST-PLANET-16.5B-GGUF ]

[ https://huggingface.co/DavidAU/MN-DARKEST-UNIVERSE-29B-GGUF ]

[ https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-23.5B-GGUF ]

Although Class 3 and Class 4 models will work when used within their specific use case(s) with the standard parameters and settings on the model card, I recognize that users want either a smoother experience
and/or want to use these models for other than their intended use case(s), and that is in part why I created this document.

@@ -264,7 +301,7 @@ There are no "Class 5" models published... yet.

---

- QUANTS

---

@@ -351,7 +388,7 @@ Suggestions for Imatrix NEO quants:

---

- Quick Reference Table

---

@@ -361,6 +398,8 @@ https://huggingface.co/EnragedAntelope

https://github.com/EnragedAntelope

Please see sections below this for advanced usage, more details, settings, notes etc etc.

<small>

@@ -383,7 +422,7 @@ Please see sections below this for advanced usage, more details, settings, notes

| **Penalty Samplers** |

- | repeat-last-n | Number of tokens to consider for penalties. Critical for preventing repetition. Default: 64 |

| repeat-penalty | Penalizes repeated token sequences. Range: 1.0-1.15. Default: 1.0 |

@@ -440,7 +479,7 @@ Please see sections below this for advanced usage, more details, settings, notes

---

- HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)

---

@@ -487,7 +526,7 @@ However sometimes parameters and/or samplers are required to better "wrangle" th

---

- Section 1a : PRIMARY PARAMETERS - ALL APPS

---

@@ -562,7 +601,7 @@ Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 gener

---

- Section 1b : PENALITY SAMPLERS - ALL APPS

---

@@ -650,13 +689,13 @@ For "text-gen-webui" and "Koboldcpp" these are directly accessible.

Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)

-

Mirostat learning rate, parameter eta (default: 0.1) " mirostat_eta "

mirostat_tau: 5-8 is a good value.

-

Mirostat target entropy, parameter tau (default: 5.0) " mirostat_tau "

@@ -676,12 +715,13 @@ CLASS 3: models it is suggested to use this to assist with generation (min setti

CLASS 4: models it is highly recommended with Mirostat 1 or 2 + mirostat_tau @ 6 to 8 and mirostat_eta at .1 to .5

-

dynamic temperature range (default: 0.0, 0.0 = disabled)

-

dynamic temperature exponent (default: 1.0)

@@ -721,13 +761,15 @@ Locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)

If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.

- <B>

xtc probability (default: 0.0, 0.0 = disabled)

Probability that the removal will actually happen. 0 disables the sampler. 1 makes it always happen.

-

xtc threshold (default: 0.1, 1.0 = disabled)

@@ -756,9 +798,9 @@ Careful testing is required, as this can have unclear side effects.

---

- SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":

- Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP"

---

@@ -776,7 +818,7 @@ Most of these are also available in KOBOLDCPP too (via settings -> samplers) aft

I am not going to touch on all of the samplers / parameters, just the main ones at the moment.

- However, you should also check / test operation of:

a] Affects per token generation:

@@ -864,7 +906,7 @@ https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_

https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/

- <B>QUADRATIC SAMPLING

This sampler alters the "score" of ALL TOKENS at the time of generation and as a result affects the entire generation of the model. See the "Additional Links" section above for more information.

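The DRY sampler referenced in the Reddit links above penalizes tokens that would extend a sequence already present in the context, with a penalty that grows exponentially with the length of the repeat. Below is a minimal sketch of the commonly described penalty formula; the parameter names `multiplier`, `base` and `allowed_length` follow the koboldcpp / text-generation-webui settings, and the exact way the penalty is combined with the logits differs per app (this is illustrative, not any app's literal implementation):

```python
def dry_penalty(match_len: int, multiplier: float = 0.8,
                base: float = 1.75, allowed_length: int = 2) -> float:
    """Penalty for a candidate token that would extend a repeated sequence
    of `match_len` tokens. Repeats shorter than `allowed_length` are free;
    beyond that the penalty grows exponentially in `base`."""
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)

# Short repeats cost nothing; long verbatim repeats get expensive fast.
for n in range(1, 7):
    print(n, round(dry_penalty(n), 3))
```

This is why DRY discourages long verbatim loops while leaving short, natural repetitions (articles, names) mostly untouched.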
@@ -78,7 +78,7 @@ You will get higher quality operation overall - stronger prose, better answers,

---

+ <h2>TESTING / Generation Example PARAMETERS AND SAMPLERS</h2>

---

@@ -106,9 +106,11 @@ Everything else is "zeroed" / "disabled".

These parameters/settings are considered both safe and default and in most cases available to all users in all AI/LLM apps.

+ Note for Class 3/Class 4 models (discussed below) "repeat-last-n" is a CRITICAL setting.

---

+ <h2>SOURCE FILES for my Models / APPS to Run LLMs / AIs:</h2>

---

@@ -157,6 +159,19 @@ For CLASS3 and CLASS4 the most important setting is "SMOOTHING FACTOR" (Quadrati

https://docs.sillytavern.app/usage/common-settings/

+ Critical Note:
+
+ Silly Tavern allows you to "connect" (via API) to different AI programs/apps like Koboldcpp, Llamacpp (server), Text Generation Webui, Lmstudio, Ollama ... etc etc.
+
+ You "load" a model in one of these, then connect Silly Tavern to the app via API. This way you can use any model, and Sillytavern becomes the interface between
+ the AI model and you directly. Sillytavern opens an interface in your browser.
+
+ In Sillytavern you can then adjust parameters, samplers and advanced samplers ; there are also PRESET parameters/samplers, and you can save your favorites too.
+
+ Currently, at the time of this writing, connecting Silly Tavern via KoboldCPP or Text Generation Webui will provide the most samplers/parameters.
+
+ However for some, connecting to Lmstudio, LlamaCPP, or Ollama may be preferred.

NOTE:

It appears that Silly Tavern also supports "DRY" and "XTC" too ; but it is not yet in the documentation at the time of writing.

@@ -170,7 +185,7 @@ OTHER PROGRAMS:

Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add them to the json file(s) for a model and/or template preset.

+ In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Sillytavern", "Ollama", and "LMStudio" (as well as other apps too).

You can also use llama_cpp directly too. (IE: llama-server.exe) ; see :

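When driving llama-server over its API, the samplers discussed here are plain JSON fields on the request. A minimal sketch follows; the field names are taken from the llama.cpp server README, while the prompt, the values, and the local address/port are placeholder assumptions for illustration only:

```python
import json
import urllib.request

# Sampler settings as JSON fields for a running llama-server instance.
# Values below are illustrative placeholders, not recommendations.
payload = {
    "prompt": "Continue the scene:",
    "n_predict": 256,
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "repeat_last_n": 64,   # penalty window
    "repeat_penalty": 1.1,
    "mirostat": 0,         # 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",  # llama-server's default local port
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:   # uncomment with a server running
#     print(json.loads(resp.read())["content"])
print(json.dumps(payload, indent=2))
```

The same payload shape works from any language or app that can POST JSON, which is why "API / headless / server mode" exposes the full sampler set.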
@@ -182,16 +197,28 @@ Special note:

It appears the "DRY" / "XTC" samplers have been added to LLAMACPP and SILLYTAVERN.

+ It is available (Llamacpp) via "llama-server.exe". Likely this sampler will also become available "downstream" in applications that use LLAMACPP in due time.

[ https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md ]

+ Operating Systems:
+
+ Most AI/LLM apps operate on Windows, Mac, and Linux.

---

+ <h2>DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS:</h2>

---

+ Most AI / LLM apps allow saving a "profile" of parameters and samplers - "favorite" settings.
+
+ Text Generation Web Ui, Koboldcpp, and Silly Tavern all have this feature, and also "presets" (parameters/samplers already set) too.
+
+ Other AI/LLM apps also have this feature to varying degrees.
+
+ DETAILS on PARAMETERS / SAMPLERS:

For additional details on these sampler settings (including advanced ones) you may also want to check out:

https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab

@@ -219,9 +246,11 @@ LLAMACPP-SERVER EXE:

https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md

+ I have also added notes in the sections below as well.

---

+ <h2>Class 1, 2, 3 and 4 model critical notes:</h2>

---

@@ -243,12 +272,20 @@ For reference here are some Class 3/4 models:

[ https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF ]

+ (note the Grand Horror series contains class 2, 3 and 4 models)

[ https://huggingface.co/DavidAU/L3-DARKEST-PLANET-16.5B-GGUF ]

+ (note the Dark Planet series contains Class 1, 2 and Class 3/4 models)

[ https://huggingface.co/DavidAU/MN-DARKEST-UNIVERSE-29B-GGUF ]

+ (this model has exceptional prose abilities in all areas)

[ https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-23.5B-GGUF ]

+ (note the Grand Gutenberg Madness/Darkness (12B) models are class 1, but compressed versions of the 23.5B)

Although Class 3 and Class 4 models will work when used within their specific use case(s) with the standard parameters and settings on the model card, I recognize that users want either a smoother experience
and/or want to use these models for other than their intended use case(s), and that is in part why I created this document.

@@ -264,7 +301,7 @@ There are no "Class 5" models published... yet.

---

+ <h2>QUANTS:</h2>

---

@@ -351,7 +388,7 @@ Suggestions for Imatrix NEO quants:

---

+ <h2>Quick Reference Table</h2>

---

@@ -361,6 +398,8 @@ https://huggingface.co/EnragedAntelope

https://github.com/EnragedAntelope

+ This section will get you started - especially with class 3 and 4 models - and the detail section will cover settings / control in more depth below.

Please see sections below this for advanced usage, more details, settings, notes etc etc.

<small>

@@ -383,7 +422,7 @@ Please see sections below this for advanced usage, more details, settings, notes

| **Penalty Samplers** |

+ | repeat-last-n | Number of tokens to consider for penalties. Critical for preventing repetition. Default: 64 (Class 3/4 - but see notes) |

| repeat-penalty | Penalizes repeated token sequences. Range: 1.0-1.15. Default: 1.0 |

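To make the two penalty settings in the table above concrete, here is a minimal sketch of how this style of repetition penalty is typically applied (illustrative only, not the literal llama.cpp code): only token ids seen inside the `repeat-last-n` window are touched, and the penalty always pushes a token's logit down, whether it was positive or negative.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1, repeat_last_n=64):
    """Penalize token ids that appear in the last `repeat_last_n` tokens.
    Positive logits are divided by `penalty`, negative logits multiplied,
    so a penalized token always becomes less likely."""
    window = set(recent_tokens[-repeat_last_n:])
    out = list(logits)
    for t in window:
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [2.0, -1.0, 0.5, 3.0]            # toy vocabulary of 4 tokens
print(apply_repeat_penalty(logits, recent_tokens=[0, 1]))
```

This also shows why `repeat-last-n` matters so much for Class 3/4 models: it decides how far back the "recently seen" window reaches, and therefore how many tokens are eligible for the penalty at all.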
@@ -440,7 +479,7 @@ Please see sections below this for advanced usage, more details, settings, notes

---

+ <h2>ADVANCED: HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)</h2>

---

@@ -487,7 +526,7 @@ However sometimes parameters and/or samplers are required to better "wrangle" th

---

+ <h2>Section 1a : PRIMARY PARAMETERS - ALL APPS:</h2>

---

@@ -562,7 +601,7 @@ Then test "at temp" with your prompt(s) to see the MODELS in action. (5-10 gener

---

+ <h2>Section 1b : PENALTY SAMPLERS - ALL APPS:</h2>

---

@@ -650,13 +689,13 @@ For "text-gen-webui" and "Koboldcpp" these are directly accessible.

Use Mirostat sampling. "Top K", "Nucleus", "Tail Free" (TFS) and "Locally Typical" (TYPICAL) samplers are ignored if used. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)

+ mirostat-lr

Mirostat learning rate, parameter eta (default: 0.1) " mirostat_eta "

mirostat_tau: 5-8 is a good value.

+ mirostat-ent

Mirostat target entropy, parameter tau (default: 5.0) " mirostat_tau "

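As a rough picture of what the Mirostat settings above are doing, here is a sketch in the style of Mirostat 2.0 (illustrative only, not the exact llama.cpp implementation): tokens whose "surprise" exceeds a moving threshold `mu` are dropped before sampling, and after each step `mu` is nudged toward the target entropy `tau` at learning rate `eta`.

```python
import math
import random

def mirostat_v2_step(probs, mu, tau=5.0, eta=0.1, rng=random.Random(0)):
    """One decoding step of a Mirostat 2.0-style sampler (sketch).
    `probs` is the model's probability list, sorted descending."""
    # Drop tokens whose surprise (-log2 p) exceeds the current threshold mu.
    keep = [p for p in probs if -math.log2(p) <= mu] or probs[:1]
    total = sum(keep)
    r, acc = rng.random() * total, 0.0
    idx = len(keep) - 1
    for i, p in enumerate(keep):              # sample from the survivors
        acc += p
        if r <= acc:
            idx = i
            break
    surprise = -math.log2(keep[idx] / total)  # observed surprise
    mu += eta * (tau - surprise)              # feedback toward target entropy
    return idx, mu
```

A higher tau keeps more surprising tokens in play (matching the mirostat_tau 5-8 range noted above), while eta controls how quickly the threshold reacts to each step.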
@@ -676,12 +715,13 @@ CLASS 3: models it is suggested to use this to assist with generation (min setti

CLASS 4: models it is highly recommended with Mirostat 1 or 2 + mirostat_tau @ 6 to 8 and mirostat_eta at .1 to .5

+ <b>Dynamic Temperature</b>
+
+ dynatemp-range

dynamic temperature range (default: 0.0, 0.0 = disabled)

+ dynatemp-exp

dynamic temperature exponent (default: 1.0)

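The two dynatemp settings above can be pictured with a small sketch. This mirrors the entropy-based approach used by llama.cpp's dynamic temperature, but is illustrative only: the effective temperature slides between (temp - range) and (temp + range) according to how uncertain the model currently is, shaped by the exponent.

```python
import math

def dynamic_temperature(probs, base_temp=1.0, dyn_range=0.5, exponent=1.0):
    """Scale temperature by the normalized entropy of the candidate
    distribution: confident steps get a low temperature, uncertain steps
    a high one, within [base_temp - dyn_range, base_temp + dyn_range]."""
    min_t = max(0.0, base_temp - dyn_range)
    max_t = base_temp + dyn_range
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(probs))     # entropy of a uniform distribution
    norm = entropy / max_entropy if max_entropy > 0 else 0.0
    return min_t + (max_t - min_t) * norm ** exponent

print(dynamic_temperature([0.25] * 4))             # uncertain -> higher temperature
print(dynamic_temperature([1.0, 0.0, 0.0, 0.0]))   # confident -> lower temperature
```

With range 0 this collapses to plain fixed temperature, which is why 0.0 means "disabled".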
@@ -721,13 +761,15 @@ Locally typical sampling, parameter p (default: 1.0, 1.0 = disabled)

If not set to 1, select only tokens that are at least this much more likely to appear than random tokens, given the prior text.

+ <B>XTC</B>
+
+ xtc-probability

xtc probability (default: 0.0, 0.0 = disabled)

Probability that the removal will actually happen. 0 disables the sampler. 1 makes it always happen.

+ xtc-threshold

xtc threshold (default: 0.1, 1.0 = disabled)

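Putting the two XTC settings above together, here is a sketch of the sampler's described behaviour (the real implementations live in llama.cpp and text-generation-webui; this is illustrative): with chance `xtc-probability`, every candidate at or above `xtc-threshold` except the least likely of them is removed, pushing the model past its most predictable choices.

```python
import random

def xtc_filter(probs, threshold=0.1, probability=0.5, rng=random.Random(0)):
    """Sketch of XTC ("exclude top choices"). `probs` is sorted descending;
    returns the indices of candidates that survive. The shared rng default
    keeps the example deterministic."""
    if rng.random() >= probability:
        return list(range(len(probs)))        # sampler did not trigger
    above = [i for i, p in enumerate(probs) if p >= threshold]
    if len(above) < 2:
        return list(range(len(probs)))        # nothing to cut
    cut = set(above[:-1])                     # keep only the least likely "top" token
    return [i for i in range(len(probs)) if i not in cut]
```

Note how a higher threshold makes the sampler trigger less often (fewer tokens qualify as "top choices"), which is why 1.0 effectively disables it.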
@@ -756,9 +798,9 @@ Careful testing is required, as this can have unclear side effects.

---

+ <h2>SECTION 2: ADVANCED SAMPLERS - "text-generation-webui" / "KOBOLDCPP":</h2>

+ <B>Additional Parameters / Samplers, including "DRY", "QUADRATIC" and "ANTI-SLOP".</B>

---

@@ -776,7 +818,7 @@ Most of these are also available in KOBOLDCPP too (via settings -> samplers) aft

I am not going to touch on all of the samplers / parameters, just the main ones at the moment.

+ However, you should also check / test the operation of the following (these are in Text Generation WebUI, and may be available via API / in Sillytavern when connected to Text Generation Webui):

a] Affects per token generation:

@@ -864,7 +906,7 @@ https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_

https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/

+ <B>QUADRATIC SAMPLING: AKA "Smoothing"</B>

This sampler alters the "score" of ALL TOKENS at the time of generation and as a result affects the entire generation of the model. See the "Additional Links" section above for more information.

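To illustrate what "altering the score of ALL TOKENS" means here, below is a sketch of the quadratic transform in one commonly cited formulation (check the text-generation-webui source for the exact expression; apps treat a smoothing factor of 0 as "disabled" rather than applying the transform): each logit is pulled down from the maximum by its squared distance times the smoothing factor, so the whole distribution is reshaped rather than just the tail.

```python
def quadratic_smoothing(logits, smoothing_factor=0.3):
    """Transform every logit as: max - smoothing_factor * (logit - max)**2.
    The top token keeps its score; the further a token sits below the top,
    the harder it is pushed down, so a higher smoothing factor sharpens
    (stabilizes) the distribution - useful for Class 3/4 models."""
    m = max(logits)
    return [m - smoothing_factor * (x - m) ** 2 for x in logits]

print(quadratic_smoothing([3.0, 1.0, 0.0], smoothing_factor=0.5))
```

Because every token's score passes through the same curve, the effect is felt across the entire generation, which matches the description above.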