generation
chat settings
sampling
sampler_config
sampling-strategies
parameters guide
samplers guide
decoding
nucleus-sampling
optimized
optimization
experimentation
role play settings
generation-features
optimal model setting
coherence
steering
high_quality
top-k
top-p
temperature
repetition-penalty
sillytavern
koboldcpp
mistral
gemma
llama
gpt2
Major update to README.md
Added V2, V2-Alt and adjusted some descriptions
README.md
CHANGED
|
@@ -30,7 +30,11 @@ tags:
|
|
| 30 |
- llama
|
| 31 |
- gpt2
|
| 32 |
---
|
| 33 |
-
<img src="PERF.png" style="float:right; width:300px; height:300px; padding:10px;">
|
| 34 |
|
| 35 |
# TruePerfect Sampler Settings
|
| 36 |
|
|
@@ -46,6 +50,8 @@ Any Q5_K_M/Q6(K_M) model, lower will not work correctly and there's no way to fi
|
|
| 46 |
|
| 47 |
Smart Context should be disabled.
|
| 48 |
|
| 49 |
Flash Attention should be disabled.
|
| 50 |
|
| 51 |
# General and simplified info about specific sampler parameters:
|
|
@@ -99,7 +105,7 @@ Helps increasing the coherence and logic, but only for specific cases.
|
|
| 99 |
I personally don't recommend using it with strict settings, as it will cut off diversity and choices, which will make the text more deterministic and less "exciting".
|
| 100 |
</details>
|
| 101 |
<details><summary>Typical:</summary>
|
| 102 |
-
More "intelligent" cutoff method; might
|
| 103 |
|
| 104 |
Disable it and don't use it with strict settings, as lower values will cut off more (unlikely) specific details, which might be important to the overall choices and logic of actions.
|
| 105 |
</details>
|
|
@@ -125,7 +131,14 @@ Never managed to get stable enough results for very long period of progression,
|
|
| 125 |
<details><summary>Smoothing Curve (new):</summary>
|
| 126 |
Dynamically adjusts the penalty, temperature, probability to avoid sudden changes; 1 and lower.
|
| 127 |
|
| 128 |
-
|
| 129 |
</details>
|
| 130 |
<details><summary>DRY:</summary>
|
| 131 |
Avoid in any case.
|
|
@@ -141,7 +154,27 @@ Probability - the chance for Threshold to cutoff the desired most likely tokens.
|
|
| 141 |
|
| 142 |
# Ready-to-use AIO general settings:
|
| 143 |
<details><summary></summary>
|
| 144 |
-
<
|
| 145 |
|
| 146 |
To preserve versatility, I describe the complete and specific sampler values below under **Additional fine-tuning:**, aiming for perfection in any case.
|
| 147 |
|
|
@@ -150,7 +183,7 @@ Here I will describe additional values for special cases, as well as dependencie
|
|
| 150 |
<details><summary>Temperature+Top-K (only higher values):</summary>
|
| 151 |
Temperature is related to Top-K, and to get the best results from these two parameters, both need to be adjusted together.
|
| 152 |
|
| 153 |
-
For example Temperature 2.4 **needs** to have Top-K
|
| 154 |
|
| 155 |
**Acceptable values of Top-K for Temperature 2.4**: 134, 206, 278.
|
| 156 |
|
|
@@ -158,30 +191,30 @@ Different values will cause inconsistency, instability and other issues.
|
|
| 158 |
|
| 159 |
**Acceptable values of Temperature for Top-K**: 1.2, 2.4, 4.8.
|
| 160 |
|
| 161 |
-
Lower temperature will be
|
| 162 |
|
| 163 |
Higher Top-K will expand the attention to smaller details, preserve attention to multiple simultaneous events, and can also fix smaller text-related issues (with quotation marks, asterisks, hyphens, etc.).
|
| 164 |
|
| 165 |
-
**Temperature 2.4 with Top-K 206 or 278 and TFS 0.9551 will
|
| 166 |
|
| 167 |
-
Temperature 2.4 with Top-K 206 or 278 and TFS 0.8413 will
|
| 168 |
|
| 169 |
-
Temperature 1.2 with Top-K 206 or 278 and TFS 0.8413 will
|
| 170 |
</details>
|
| 171 |
<details><summary>Repetition penalty:</summary>
|
| 172 |
-
Base value: 1.12082, which will
|
| 173 |
|
| 174 |
1.121: an alternative variant of 1.12082, which might fix issues with overly descriptive outputs, but 1.12082 is preferred and context and/or initial input (first) should be adjusted instead.
|
| 175 |
|
| 176 |
-
1.105 (untuned): specific value I found out during experimentation. Will
|
| 177 |
|
| 178 |
-
1.05 (untuned, not recommended): base value, which is widely used in various LLMs. Will
|
| 179 |
|
| 180 |
-
1.02612: very specific one; will
|
| 181 |
|
| 182 |
-
**Other values (might
|
| 183 |
|
| 184 |
-
Feel free to experiment with and
|
| 185 |
|
| 186 |
1.02665: similar to 1.02695, but with slightly more altered creativity, interesting development of events, and more realistic responses from other characters; no issues so far; closest to 1.02612; hasn't been tested thoroughly.
|
| 187 |
|
|
@@ -192,25 +225,25 @@ Feel free to experiment with and give any results.
|
|
| 192 |
1.0283 (decent): will be more direct and provide more realistic, violent scenes (if necessary), especially in uncensored models; good creativity, pays good attention to basic details, good progression of events, realistic responses from characters based on events, but noticeably shorter output, and might be very "chatty".
|
| 193 |
|
| 194 |
1.0285/1.0286: will provide very interesting and creative responses, but will mess up some character details (mostly the ones that are already in the LLM's database).
|
| 195 |
-
Some of them will
|
| 196 |
</details>
|
| 197 |
<details><summary>Top-P:</summary>
|
| 198 |
-
Base value: 0.915, which will
|
| 199 |
|
| 200 |
0.905: pays more attention to specific details, slightly less emotions, and very close to being repetitive.
|
| 201 |
|
| 202 |
0.95/0.97: very creative and unpredictable; might be used for better models, but generally less attentive.
|
| 203 |
</details>
|
| 204 |
<details><summary>Top-A:</summary>
|
| 205 |
-
Base value: 0.07, which will
|
| 206 |
</details>
|
| 207 |
<details><summary>Other values (experimental):</summary>
|
| 208 |
-
0.043725: more attention to anatomy, but unstable and tends to be unpredictable.
|
| 209 |
|
| 210 |
-
0.2025: more creative, descriptive, "exciting" and emotional, but tends to skip some details and be slightly more stable compared to 0.045.
|
| 211 |
</details>
|
| 212 |
<details><summary>Repetition Range:</summary>
|
| 213 |
-
Base value: 64, which is the least one that will
|
| 214 |
|
| 215 |
128: will output more descriptive results, but tends to be repetitive. Not recommended.
|
| 216 |
</details>
|
|
@@ -232,14 +265,14 @@ Feel free to use these fixed values:
|
|
| 232 |
132458
|
| 233 |
</details>
|
| 234 |
<details><summary>Repetition Slope:</summary>
|
| 235 |
-
Base value: 1.12, which will help to max out most things like consistency, level of detail, etc. Recommended for all cases. Higher values won't
|
| 236 |
|
| 237 |
All values below 1 are unstable and will cause issues.
|
| 238 |
</details>
|
| 239 |
<details><summary>TFS:</summary>
|
| 240 |
-
Base value: 0.8413, which will
|
| 241 |
|
| 242 |
-
0.9551: will
|
| 243 |
</details>
|
| 244 |
<details><summary>(Additional) recommendations for assistant mode+other tips:</summary>
|
| 245 |
Disable "Inject Chatnames" to get best results for a personal assistant.
|
|
|
|
| 30 |
- llama
|
| 31 |
- gpt2
|
| 32 |
---
|
| 33 |
+
<img src="https://gitlab.com/Azuro721/trueperfect-ai/-/raw/main/PERF.png" style="float:right; width:300px; height:300px; padding:10px;">
|
| 34 |
+
|
| 35 |
+
<audio controls src="https://gitlab.com/Azuro721/trueperfect-ai/-/raw/main/k_etones.opus"></audio>
|
| 36 |
+
|
| 37 |
+
https://gitlab.com/Azuro721/trueperfect-ai (Link to Main Repo)
|
| 38 |
|
| 39 |
# TruePerfect Sampler Settings
|
| 40 |
|
|
|
|
| 50 |
|
| 51 |
Smart Context should be disabled.
|
| 52 |
|
| 53 |
+
ContextShift should be disabled.
|
| 54 |
+
|
| 55 |
Flash Attention should be disabled.
|
| 56 |
|
| 57 |
# General and simplified info about specific sampler parameters:
|
|
|
|
| 105 |
I personally don't recommend using it with strict settings, as it will cut off diversity and choices, which will make the text more deterministic and less "exciting".
|
| 106 |
</details>
|
| 107 |
<details><summary>Typical:</summary>
|
| 108 |
+
More "intelligent" cutoff method; might produce more "surprising" and varied outputs and "tails".
|
| 109 |
|
| 110 |
Disable it and don't use it with strict settings, as lower values will cut off more (unlikely) specific details, which might be important to the overall choices and logic of actions.
|
| 111 |
</details>
|
|
|
|
| 131 |
<details><summary>Smoothing Curve (new):</summary>
|
| 132 |
Dynamically adjusts the penalty, temperature, probability to avoid sudden changes; 1 and lower.
|
| 133 |
|
| 134 |
+
Has stronger effect with higher Repetition Penalty.
|
| 135 |
+
|
| 136 |
+
Avoid with strict settings.
|
| 137 |
+
|
| 138 |
+
Values below 0.96 are not recommended.
|
| 139 |
+
</details>
|
| 140 |
+
<details><summary>Adaptive-P (new):</summary>
|
| 141 |
+
Does not work with high Temperature; avoid in any case.
|
| 142 |
</details>
|
| 143 |
<details><summary>DRY:</summary>
|
| 144 |
Avoid in any case.
|
|
|
|
| 154 |
|
| 155 |
# Ready-to-use AIO general settings:
|
| 156 |
<details><summary></summary>
|
| 157 |
+
<details><summary>V1 **-CREATIVE-BALANCED-**</summary>
|
| 158 |
+
<img src="https://gitlab.com/Azuro721/trueperfect-ai/-/raw/main/PERF1.png" style="float:right; width:200px; height:300px; padding:10px;">
|
| 159 |
+
|
| 160 |
+
Very fine with very good creativity, level of details, emotional connections.
|
| 161 |
+
|
| 162 |
+
Might confuse things like character names, species, etc. (due to high Temperature and not low enough Top-A). **In such cases, lower Repetition Penalty to 1.02612.**
|
| 163 |
+
</details>
|
| 164 |
+
<details><summary>V2 **-INSANE-DETAIL/ATTENTION-**</summary>
|
| 165 |
+
<img src="https://gitlab.com/Azuro721/trueperfect-ai/-/raw/main/PERF2.png" style="float:right; width:200px; height:300px; padding:10px;">
|
| 166 |
+
|
| 167 |
+
OVERKILL.
|
| 168 |
+
|
| 169 |
+
Preserves an insane amount of detail, attention, accuracy, and length.
|
| 170 |
+
|
| 171 |
+
Lower Repetition Penalty to get even more insane attention and accuracy, at the cost of a bit of creativity.
|
| 172 |
+
</details>
|
| 173 |
+
<details><summary>V2-Alt **-INSANE-DETAIL/ATTENTION-ASSISTANT-**</summary>
|
| 174 |
+
<img src="https://gitlab.com/Azuro721/trueperfect-ai/-/raw/main/PERF3.png" style="float:right; width:200px; height:300px; padding:10px;">
|
| 175 |
+
|
| 176 |
+
Insane, for maximum accuracy in ASSISTANT tasks; can also be used with overly complex RP scenarios.
|
| 177 |
+
</details>
|
| 178 |
|
| 179 |
To preserve versatility, I describe the complete and specific sampler values below under **Additional fine-tuning:**, aiming for perfection in any case.
|
| 180 |
|
|
|
|
| 183 |
<details><summary>Temperature+Top-K (only higher values):</summary>
|
| 184 |
Temperature is related to Top-K, and to get the best results from these two parameters, both need to be adjusted together.
|
| 185 |
|
| 186 |
+
For example, Temperature 2.4 **needs** Top-K 134, or that value increased in steps of 72.
|
| 187 |
|
| 188 |
**Acceptable values of Top-K for Temperature 2.4**: 134, 206, 278.
|
| 189 |
|
|
|
|
| 191 |
|
| 192 |
**Acceptable values of Temperature for Top-K**: 1.2, 2.4, 4.8.
|
| 193 |
|
| 194 |
+
Lower temperature will output less "exciting" and creative results (fewer emotions, less variety and unpredictability within a single output), and might trigger repetitions, which can mostly be fixed by raising Top-K.
|
| 195 |
|
| 196 |
Higher Top-K will expand the attention to smaller details, preserve attention to multiple simultaneous events, and can also fix smaller text-related issues (with quotation marks, asterisks, hyphens, etc.).
|
| 197 |
|
| 198 |
+
**Temperature 2.4 with Top-K 206 or 278 and TFS 0.9551 will output the best possible results as an assistant.**
|
| 199 |
|
| 200 |
+
Temperature 2.4 with Top-K 206 or 278 and TFS 0.8413 will output more attentive results, with less variety, emotions and "surprising" moments.
|
| 201 |
|
| 202 |
+
Temperature 1.2 with Top-K 206 or 278 and TFS 0.8413 will output slightly more attentive results, with even less variety, emotions and "surprising" moments.
|
| 203 |
</details>
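
The Temperature/Top-K pairing above can be sketched as simple arithmetic. This is only an illustration: the function names are hypothetical, the step of 72 and the listed values come from the text, and treating the Temperature list (1.2, 2.4, 4.8) as a doubling series is my own assumption based on those three values.

```python
# Illustrative sketch only; function names are hypothetical, not part of any frontend.

def acceptable_top_k(base: int = 134, step: int = 72, count: int = 3) -> list[int]:
    """Top-K values starting at `base` and increased by `step` each time."""
    return [base + step * i for i in range(count)]


def acceptable_temperatures(base: float = 1.2, count: int = 3) -> list[float]:
    """Temperature values, ASSUMING each listed value doubles the previous one."""
    return [base * 2 ** i for i in range(count)]


print(acceptable_top_k())         # Top-K candidates for Temperature 2.4
print(acceptable_temperatures())  # Temperature candidates
```

With the defaults, the first call reproduces the Top-K list 134, 206, 278 and the second the Temperature list 1.2, 2.4, 4.8.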
|
| 204 |
<details><summary>Repetition penalty:</summary>
|
| 205 |
+
Base value: 1.12082, which will output more creative, emotional, varied, smart and "exciting" results, but tends to have issues with asterisks and quotation marks. Good as an **assistant**. Similar to 1.02612, but with more creativity, fewer descriptions, faster pace, and more interesting responses across multiple characters simultaneously.
|
| 206 |
|
| 207 |
1.121: an alternative variant of 1.12082, which might fix issues with overly descriptive outputs, but 1.12082 is preferred and context and/or initial input (first) should be adjusted instead.
|
| 208 |
|
| 209 |
+
1.105 (untuned): a specific value I found during experimentation. Will output less "exciting" results, but somewhat better than 1.05.
|
| 210 |
|
| 211 |
+
1.05 (untuned, not recommended): base value, which is widely used in various LLMs. Will output more casual results, but also avoids the issues with asterisks and quotation marks.
|
| 212 |
|
| 213 |
+
1.02612: a very specific one; will output very descriptive, attentive and expanded results. Will try to pay attention to noticeably more things compared to other variants. Will preserve character details and many more things as events go by. Great as an **assistant**. Great for very complex instructions, very complex character cards and complex scenes. Great attention to multiple characters.
|
| 214 |
|
| 215 |
+
**Other values (might output unstable results):**
|
| 216 |
|
| 217 |
+
Feel free to experiment with them and report any results.
|
| 218 |
|
| 219 |
1.02665: similar to 1.02695, but with slightly more altered creativity, interesting development of events, and more realistic responses from other characters; no issues so far; closest to 1.02612; hasn't been tested thoroughly.
|
| 220 |
|
|
|
|
| 225 |
1.0283 (decent): will be more direct and provide more realistic, violent scenes (if necessary), especially in uncensored models; good creativity, pays good attention to basic details, good progression of events, realistic responses from characters based on events, but noticeably shorter output, and might be very "chatty".
|
| 226 |
|
| 227 |
1.0285/1.0286: will provide very interesting and creative responses, but will mess up some character details (mostly the ones that are already in the LLM's database).
|
| 228 |
+
Some of them will output inconsistencies in specific character parts, like body type, skin type, etc., and may also miss certain details and skip some important parts; be sure to account for that and select the best one.
|
| 229 |
</details>
|
| 230 |
<details><summary>Top-P:</summary>
|
| 231 |
+
Base value: 0.915, which will output very attentive, consistent and stable results. Recommended for all cases.
|
| 232 |
|
| 233 |
0.905: pays more attention to specific details, slightly less emotions, and very close to being repetitive.
|
| 234 |
|
| 235 |
0.95/0.97: very creative and unpredictable; might be used for better models, but generally less attentive.
|
| 236 |
</details>
|
| 237 |
<details><summary>Top-A:</summary>
|
| 238 |
+
Base value: 0.07, which will output extremely consistent, stable results, with smooth transitions and very good attention to most things. Recommended for all cases.
|
| 239 |
</details>
|
| 240 |
<details><summary>Other values (experimental):</summary>
|
| 241 |
+
0.043725: more attention to anatomy, but unstable and tends to be unpredictable. Works well with Repetition Penalty 1.02612.
|
| 242 |
|
| 243 |
+
0.2025: more creative, descriptive, "exciting" and emotional, but tends to skip some details and be slightly more stable compared to 0.045. Less accurate and prone to have more issues with such high Temperature.
|
| 244 |
</details>
|
| 245 |
<details><summary>Repetition Range:</summary>
|
| 246 |
+
Base value: 64, which is the lowest value that will output overall better, consistent and descriptive results. Recommended for all cases.
|
| 247 |
|
| 248 |
128: will output more descriptive results, but tends to be repetitive. Not recommended.
|
| 249 |
</details>
|
|
|
|
| 265 |
132458
|
| 266 |
</details>
|
| 267 |
<details><summary>Repetition Slope:</summary>
|
| 268 |
+
Base value: 1.12, which will help to max out most things like consistency, level of detail, etc. Recommended for all cases. Higher values won't yield any gain.
|
| 269 |
|
| 270 |
All values below 1 are unstable and will cause issues.
|
| 271 |
</details>
|
| 272 |
<details><summary>TFS:</summary>
|
| 273 |
+
Base value: 0.8413, which will output very smart, attentive and smooth outputs. Recommended for all cases.
|
| 274 |
|
| 275 |
+
0.9551: will output extremely descriptive and attentive results; the best one as an **assistant**; might cause issues in some LLMs, specifically overly descriptive and attentive results.
|
| 276 |
</details>
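
To make the base values above easier to reuse, here is a minimal sketch that collects them in one place and derives the assistant variant described earlier (Temperature 2.4, Top-K 206 or 278, TFS 0.9551). The dictionary keys and the helper name are hypothetical and are not the field names of KoboldCpp, SillyTavern, or any other frontend; map them to your own tool's settings.

```python
# Illustrative sketch only; key names are hypothetical and must be mapped
# to whatever your frontend actually calls these sampler settings.

BASE_SETTINGS = {
    "repetition_penalty": 1.12082,  # base value from the Repetition penalty section
    "top_p": 0.915,                 # base value from the Top-P section
    "top_a": 0.07,                  # base value from the Top-A section
    "repetition_range": 64,         # base value from the Repetition Range section
    "repetition_slope": 1.12,       # base value from the Repetition Slope section
    "tfs": 0.8413,                  # base value from the TFS section
}


def assistant_settings(top_k: int = 206) -> dict:
    """Return a copy of the base settings adjusted for assistant use."""
    if top_k not in (206, 278):
        raise ValueError("the text pairs Temperature 2.4 with Top-K 206 or 278")
    settings = dict(BASE_SETTINGS)  # copy so the base values stay untouched
    settings.update({"temperature": 2.4, "top_k": top_k, "tfs": 0.9551})
    return settings


print(assistant_settings())
```

Copying the base dictionary before updating it keeps the general values intact, so several variants can be derived side by side.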
|
| 277 |
<details><summary>(Additional) recommendations for assistant mode+other tips:</summary>
|
| 278 |
Disable "Inject Chatnames" to get best results for a personal assistant.
|