VizorZ0042 committed on
Commit
0456320
·
verified ·
1 Parent(s): f18b617

Major updates for README.md

  1. README.md +76 -7
README.md CHANGED
@@ -4,15 +4,21 @@ tags:
4
  - generation
5
  - chat settings
6
  - sampling
 
7
  - sampling-strategies
 
 
8
  - decoding
9
  - nucleus-sampling
 
10
  - optimization
11
  - experimentation
12
  - role play settings
13
  - generation-features
14
  - optimal model setting
15
  - coherence
 
 
16
  - top-k
17
  - top-p
18
  - temperature
@@ -36,7 +42,11 @@ This is my work of 6+ months; with trials and errors, I managed to achieve perfe
36
 
37
  Any Q5_K_M/Q6(K_M) model; lower quantizations will not work correctly, and there's no way to fix these issues, only to work around them temporarily by adjusting specific parameters.
38
 
39
- (Optional) Kobold CPP.
 
 
 
 
40
 
41
  # General and simplified info about specific sampler parameters:
42
  <details><summary></summary>
@@ -45,11 +55,14 @@ Helps increasing creativity and usage of "smarter" words; has a strong dependenc
45
  </details>
46
  <details><summary>Top-K:</summary>
47
  Helps expand choices, diversity (variety), level of detail, progression (per turn/output) and the number of simultaneous actions per turn; has a strong dependence on Temperature.
 
48
  Will produce very "rapidly-switching", incoherent and nonsensical output with the wrong Temperature.
49
  </details>
50
  <details><summary>Repetition Penalty:</summary>
51
  Helps with repetitions on "lower-end" models and controls creativity (less with lower value); very strict and needs extremely precise adjusting to achieve stable results.
 
52
  Lower values (~1.07 and less) will produce more and more detailed output, with attention to things (smaller details), simultaneous events and correct anatomical details (like paws=paw-pads, claws, etc.; not hands); tends to choose other details correctly, like species, body type, etc.
 
53
  Higher values (~1.105 and more) will produce shorter outputs with a more creative way of phrasing and more "surprising" events, choices and shifts; combine with higher Top-K to achieve maximum level of detail, length of output (per turn), attention, and slow down the overall progression.
54
  </details>
55
  <details><summary>Top-P:</summary>
@@ -57,11 +70,14 @@ Helps to achieve very coherent and stable outputs; controls the way of phrasing,
57
  </details>
58
  <details><summary>Top-A:</summary>
59
  Helps to correct deviations related to Top-K.
 
60
  The right value will produce very stable, coherent and "rich" outputs, with correct choices, consistency, actions and overall sense of logic; needs very precise adjustment.
 
61
  The wrong value will produce random, "rapidly-switching" and often nonsensical outputs, with a complete mix-up of details and sense of logic.
62
  </details>
63
  <details><summary>TFS:</summary>
64
  Helps to achieve perfect sense of logic with correct details and overall structures of response; will affect the way of phrasing, logic, consistency, and overall sense.
 
65
  Needs extremely precise adjustment to achieve perfect results.
66
  </details>
67
  <details><summary>Repetition Range:</summary>
@@ -72,75 +88,117 @@ Helps to achieve much better consistency, with more attention, subtle and more e
72
  </details>
73
  <details><summary>Repetition Slope:</summary>
74
  A "softer" way of Repetition Penalty; very unstable and inconsistent.
 
75
  Values less than 1.0 only make things incoherent and inconsistent.
 
76
  Values higher than ~1.11 won't make any changes with strict settings, only slow down the generation.
77
  </details>
78
  <details><summary>Min-P</summary>
79
  Helps increase coherence and logic, but only in specific cases.
 
80
  I personally don't recommend using it with strict settings, as it will cut off diversity and choices, which will make text more deterministic and less "exciting".
81
  </details>
82
  <details><summary>Typical:</summary>
83
  More "intelligent" cutoff method; might give more "surprising" and varied outputs and "tails".
 
84
  Disable and don't use with strict settings, as it will cut off more (unlikely) specific details with lower values, which might be important to overall choices and logic of actions.
85
  </details>
86
  <details><summary>Presence Penalty:</summary>
87
  Avoid in any case.
 
88
  Affects the choices of context and instructions in a negative way; whether used with strict settings or not, it will cause issues even with a very low value (0.01 or lower).
89
  </details>
90
  <details><summary>Smoothing Factor:</summary>
91
  Avoid in any case.
 
92
  Increases stability of outputs with very randomized settings, but will cause incorrect choices later, and will cause issues with strict settings, like nonsensical choices, actions, events, etc. (even with low values like 0.002).
93
  </details>
94
  <details><summary>Mirostat:</summary>
95
  Helps to steady the temperature (might replace) and dynamically adjusts the effective temperature based on more "surprising" tokens.
 
96
  Tau helps to adjust the diversity; higher - more diverse and creative; lower - more deterministic.
 
97
  Eta helps to increase the frequency of temperature updates; smaller - stable and slow; higher - faster but less coherent.
 
98
  I never managed to get stable enough results over a very long period of progression, so I personally avoid it and use strict settings instead.
99
  </details>
100
  <details><summary>Smoothing Curve (new):</summary>
101
  Dynamically adjusts the penalty, temperature, probability to avoid sudden changes; 1 and lower.
 
102
  Avoid with strict settings, as it will cause instability.
103
  </details>
104
  <details><summary>DRY:</summary>
105
  Avoid in any case.
 
106
  Helps to steady any repetitions, but I never managed to get stable enough outputs.
107
  </details>
108
  <details><summary>XTC:</summary>
109
  Threshold - helps to cut off the high-probability tokens (most likely), which mostly helps with lower Temperature.
 
110
  Probability - the chance for Threshold to cut off the desired most likely tokens.
111
  </details>
112
  </details>
113
 
114
  # Ready-to-use AIO general settings:
115
  <details><summary></summary>
116
- Main Repo:
117
-
118
- [ https://gitlab.com/Azuro721/trueperfect-ai ]
119
-
120
  <img src="PERF1.png" style="float:right; width:200px; height:300px; padding:10px;">
121
 
122
  To preserve versatility, I describe complete and specific sampler values below under **Additional fine-tuning:**, to aim for perfection in any case.
 
123
  Here I will describe additional values for special cases, as well as dependencies across different sampler parameters (Like Temperature+Top-K)
124
  <details><summary>Additional fine-tuning:</summary>
125
  <details><summary>Temperature+Top-K (only higher values):</summary>
126
  Temperature is related to Top-K, and to get the best results from these specific parameters, both of them need to be adjusted.
 
127
  For example, Temperature 2.4 **needs** Top-K 62, increased in steps of 72.
 
128
  **Acceptable values of Top-K for Temperature 2.4**: 134, 206, 278.
 
129
  Different values will cause inconsistency, instability and other issues.
 
130
  **Acceptable values of Temperature for Top-K**: 1.2, 2.4, 4.8.
 
131
  Lower temperature will give less "exciting" and creative results (less emotion, variety and unpredictability per output), and might trigger repetitions, which can be mostly fixed by raising Top-K.
 
132
  Higher Top-K will expand the attention to smaller details, preserve attention to multiple simultaneous events, and can also fix smaller text-related issues (like with quotation marks, asterisks, hyphens, etc.).
 
133
  **Temperature 2.4 with Top-K 206 or 278 and TFS 0.9551 will give the best possible results as an assistant.**
 
 
 
 
134
  </details>
135
  <details><summary>Repetition penalty:</summary>
136
- Base value: 1.12082, which will give more creative, emotional, varied, smart and "exciting" results, but tends to have issues with asterisks and quotation marks. Good as an **assistant**.
 
 
 
137
  1.105 (untuned): a specific value I found during experimentation. Will give less "exciting" results, but somewhat better than 1.05.
 
138
  1.05 (untuned, not recommended): base value, which is widely used in various LLMs. Will give more casual results, but also avoids issues with asterisks and quotation marks.
139
- 1.02612: very specific one; will give very descriptive, attentive and expanded results. Great as an **assistant.**
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
140
  </details>
141
  <details><summary>Top-P:</summary>
142
  Base value: 0.915, which will give very attentive, consistent and stable results. Recommended for all cases.
 
143
  0.905: pays more attention to specific details, with slightly less emotion, and is very close to being repetitive.
 
144
  0.95/0.97: very creative and unpredictable; might be used for better models, but generally less attentive.
145
  </details>
146
  <details><summary>Top-A:</summary>
@@ -148,33 +206,44 @@ Base value: 0.07, which will give extremely consistent, stable results, with smo
148
  </details>
149
  <details><summary>Other values (experimental):</summary>
150
  0.043725: more attention to anatomy, but unstable and tends to be unpredictable.
 
151
  0.2025: more creative, descriptive, "exciting" and emotional, but tends to skip some details and be slightly more stable compared to 0.045.
152
  </details>
153
  <details><summary>Repetition Range:</summary>
154
  Base value: 64, which is the lowest one that will give overall better, consistent and descriptive results. Recommended for all cases.
 
155
  128: will output more descriptive results, but tends to be repetitive. Not recommended.
156
  </details>
157
  <details><summary>Seed:</summary>
158
  Use a fixed seed to improve consistency a lot.
159
  Feel free to use these fixed values:
160
  253991 **main one**
 
161
  372205 - second one
 
162
  309090 - third one
 
163
  680079
 
164
  637001
 
165
  608575
 
166
  132458
167
  </details>
168
  <details><summary>Repetition Slope:</summary>
169
  Base value: 1.12, which will help to max out most things like consistency, level of detail, etc. Recommended for all cases. Higher values won't give any gain.
 
170
  All values below 1 are unstable and will cause issues.
171
  </details>
172
  <details><summary>TFS:</summary>
173
  Base value: 0.8413, which will give very smart, attentive and smooth outputs. Recommended for all cases.
 
174
  0.9551: will give extremely descriptive and attentive results; the best one as an **assistant**; might cause issues in some LLMs, specifically overly descriptive and attentive results.
175
  </details>
176
  <details><summary>(Additional) recommendations for assistant mode+other tips:</summary>
177
  Disable "Inject Chatnames" to get the best results for a personal assistant.
 
178
  Using "Separate End Tags" might help with repetitions and make responses smarter and overall better for some models.
179
  </details>
180
  </details>
 
4
  - generation
5
  - chat settings
6
  - sampling
7
+ - sampler_config
8
  - sampling-strategies
9
+ - parameters guide
10
+ - samplers guide
11
  - decoding
12
  - nucleus-sampling
13
+ - optimized
14
  - optimization
15
  - experimentation
16
  - role play settings
17
  - generation-features
18
  - optimal model setting
19
  - coherence
20
+ - steering
21
+ - high_quality
22
  - top-k
23
  - top-p
24
  - temperature
 
42
 
43
  Any Q5_K_M/Q6(K_M) model; lower quantizations will not work correctly, and there's no way to fix these issues, only to work around them temporarily by adjusting specific parameters.
44
 
45
+ **CPU backend is strongly recommended for absolute results!**
46
+
47
+ Smart Context should be disabled.
48
+
49
+ Flash Attention should be disabled.
50
 
51
  # General and simplified info about specific sampler parameters:
52
  <details><summary></summary>
 
55
  </details>
56
  <details><summary>Top-K:</summary>
57
  Helps expand choices, diversity (variety), level of detail, progression (per turn/output) and the number of simultaneous actions per turn; has a strong dependence on Temperature.
58
+
59
  Will produce very "rapidly-switching", incoherent and nonsensical output with the wrong Temperature.
60
  </details>
61
  <details><summary>Repetition Penalty:</summary>
62
  Helps with repetitions on "lower-end" models and controls creativity (less with lower value); very strict and needs extremely precise adjusting to achieve stable results.
63
+
64
  Lower values (~1.07 and less) will produce more and more detailed output, with attention to things (smaller details), simultaneous events and correct anatomical details (like paws=paw-pads, claws, etc.; not hands); tends to choose other details correctly, like species, body type, etc.
65
+
66
  Higher values (~1.105 and more) will produce shorter outputs with a more creative way of phrasing and more "surprising" events, choices and shifts; combine with higher Top-K to achieve maximum level of detail, length of output (per turn), attention, and slow down the overall progression.
67
  </details>
68
  <details><summary>Top-P:</summary>
 
70
  </details>
71
  <details><summary>Top-A:</summary>
72
  Helps to correct deviations related to Top-K.
73
+
74
  The right value will produce very stable, coherent and "rich" outputs, with correct choices, consistency, actions and overall sense of logic; needs very precise adjustment.
75
+
76
  The wrong value will produce random, "rapidly-switching" and often nonsensical outputs, with a complete mix-up of details and sense of logic.
77
  </details>
78
  <details><summary>TFS:</summary>
79
  Helps to achieve perfect sense of logic with correct details and overall structures of response; will affect the way of phrasing, logic, consistency, and overall sense.
80
+
81
  Needs extremely precise adjustment to achieve perfect results.
82
  </details>
83
  <details><summary>Repetition Range:</summary>
 
88
  </details>
89
  <details><summary>Repetition Slope:</summary>
90
  A "softer" way of Repetition Penalty; very unstable and inconsistent.
91
+
92
  Values less than 1.0 only make things incoherent and inconsistent.
93
+
94
  Values higher than ~1.11 won't make any changes with strict settings, only slow down the generation.
95
  </details>
96
  <details><summary>Min-P</summary>
97
  Helps increase coherence and logic, but only in specific cases.
98
+
99
  I personally don't recommend using it with strict settings, as it will cut off diversity and choices, which will make text more deterministic and less "exciting".
100
  </details>
101
  <details><summary>Typical:</summary>
102
  More "intelligent" cutoff method; might give more "surprising" and varied outputs and "tails".
103
+
104
  Disable and don't use with strict settings, as it will cut off more (unlikely) specific details with lower values, which might be important to overall choices and logic of actions.
105
  </details>
106
  <details><summary>Presence Penalty:</summary>
107
  Avoid in any case.
108
+
109
  Affects the choices of context and instructions in a negative way; whether used with strict settings or not, it will cause issues even with a very low value (0.01 or lower).
110
  </details>
111
  <details><summary>Smoothing Factor:</summary>
112
  Avoid in any case.
113
+
114
  Increases stability of outputs with very randomized settings, but will cause incorrect choices later, and will cause issues with strict settings, like nonsensical choices, actions, events, etc. (even with low values like 0.002).
115
  </details>
116
  <details><summary>Mirostat:</summary>
117
  Helps to steady the temperature (might replace) and dynamically adjusts the effective temperature based on more "surprising" tokens.
118
+
119
  Tau helps to adjust the diversity; higher - more diverse and creative; lower - more deterministic.
120
+
121
  Eta helps to increase the frequency of temperature updates; smaller - stable and slow; higher - faster but less coherent.
122
+
123
  I never managed to get stable enough results over a very long period of progression, so I personally avoid it and use strict settings instead.
124
  </details>
125
  <details><summary>Smoothing Curve (new):</summary>
126
  Dynamically adjusts the penalty, temperature, probability to avoid sudden changes; 1 and lower.
127
+
128
  Avoid with strict settings, as it will cause instability.
129
  </details>
130
  <details><summary>DRY:</summary>
131
  Avoid in any case.
132
+
133
  Helps to steady any repetitions, but I never managed to get stable enough outputs.
134
  </details>
135
  <details><summary>XTC:</summary>
136
  Threshold - helps to cut off the high-probability tokens (most likely), which mostly helps with lower Temperature.
137
+
138
  Probability - the chance for Threshold to cut off the desired most likely tokens.
139
  </details>
140
  </details>
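
To make the interplay between Temperature, Top-K and Top-P described above a bit more concrete, here is a minimal sketch in plain Python. It is not the code of any particular backend: the function name `filter_distribution`, the toy logits and the order in which the filters are applied are all assumptions for illustration, and real backends may apply samplers in a different order.

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, Top-K and Top-P.

    Hypothetical helper for illustration only; real backends differ in
    sampler order and in details.
    """
    # Temperature scaling: higher values flatten the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-K: keep only the K most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalise the surviving tokens so they sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up values, not from a real model
low = filter_distribution(toy_logits, temperature=0.5, top_p=0.9)
high = filter_distribution(toy_logits, temperature=2.4, top_p=0.9)
print(len(low), len(high))
```

With the same Top-P, the higher Temperature leaves more candidate tokens in play, which is one way to read why Top-K and Temperature have to be tuned together.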
141
 
142
  # Ready-to-use AIO general settings:
143
  <details><summary></summary>
 
 
 
 
144
  <img src="PERF1.png" style="float:right; width:200px; height:300px; padding:10px;">
145
 
146
  To preserve versatility, I describe complete and specific sampler values below under **Additional fine-tuning:**, to aim for perfection in any case.
147
+
148
  Here I will describe additional values for special cases, as well as dependencies across different sampler parameters (Like Temperature+Top-K)
149
  <details><summary>Additional fine-tuning:</summary>
150
  <details><summary>Temperature+Top-K (only higher values):</summary>
151
  Temperature is related to Top-K, and to get the best results from these specific parameters, both of them need to be adjusted.
152
+
153
  For example, Temperature 2.4 **needs** Top-K 62, increased in steps of 72.
154
+
155
  **Acceptable values of Top-K for Temperature 2.4**: 134, 206, 278.
156
+
157
  Different values will cause inconsistency, instability and other issues.
158
+
159
  **Acceptable values of Temperature for Top-K**: 1.2, 2.4, 4.8.
160
+
161
  Lower temperature will give less "exciting" and creative results (less emotion, variety and unpredictability per output), and might trigger repetitions, which can be mostly fixed by raising Top-K.
162
+
163
  Higher Top-K will expand the attention to smaller details, preserve attention to multiple simultaneous events, and can also fix smaller text-related issues (like with quotation marks, asterisks, hyphens, etc.).
164
+
165
  **Temperature 2.4 with Top-K 206 or 278 and TFS 0.9551 will give the best possible results as an assistant.**
166
+
167
+ Temperature 2.4 with Top-K 206 or 278 and TFS 0.8413 will give more attentive results, with less variety, emotions and "surprising" moments.
168
+
169
+ Temperature 1.2 with Top-K 206 or 278 and TFS 0.8413 will give slightly more attentive results, with even less variety, emotions and "surprising" moments.
170
  </details>
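
The Top-K values listed above for Temperature 2.4 fit a simple pattern: the base of 62 plus a multiple of 72. A small sketch of that arithmetic follows; the helper name `top_k_candidates` is made up for illustration, and the pattern itself is only inferred from the three listed values.

```python
# Assumption: accepted Top-K values for Temperature 2.4 follow 62 + 72 * n,
# inferred from the three values listed in this section.
BASE_TOP_K = 62
STEP = 72

def top_k_candidates(count, base=BASE_TOP_K, step=STEP):
    """Return the first `count` Top-K values above the base: base + step * n."""
    return [base + step * n for n in range(1, count + 1)]

print(top_k_candidates(3))  # [134, 206, 278]
```

Values beyond 278 produced by the same formula are untested here and should be treated as experiments, not recommendations.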
171
  <details><summary>Repetition penalty:</summary>
172
+ Base value: 1.12082, which will give more creative, emotional, varied, smart and "exciting" results, but tends to have issues with asterisks and quotation marks. Good as an **assistant**. Similar to 1.02612, but with more creativity, less description, a faster pace and more interesting responses across multiple characters simultaneously.
173
+
174
+ 1.121: an alternative variant of 1.12082, which might fix issues with overly descriptive outputs, but 1.12082 is preferred; adjust the context and/or the initial (first) input instead.
175
+
176
  1.105 (untuned): a specific value I found during experimentation. Will give less "exciting" results, but somewhat better than 1.05.
177
+
178
  1.05 (untuned, not recommended): base value, which is widely used in various LLMs. Will give more casual results, but also avoids issues with asterisks and quotation marks.
179
+
180
+ 1.02612: very specific one; will give very descriptive, attentive and expanded results. Will try to pay attention to noticeably more things compared to other variants. Will preserve character details and many more things as events go by. Great as an **assistant**. Great for very complex instructions, very complex character cards and complex scenes. Great attention to multiple characters.
181
+
182
+ **Other values (might give unstable results):**
183
+
184
+ Feel free to experiment with these and share any results.
185
+
186
+ 1.02665: similar to 1.02695, but with slightly more altered creativity, interesting development of events and more realistic responses from other characters; no issues so far; closest to the 1.02612 one; hasn't been tested thoroughly.
187
+
188
+ 1.02695: creative and more stable, with interesting development of events, good "surprising" moments and realistic responses from other characters; no issues in initial tests; hasn't been tested thoroughly.
189
+
190
+ 1.0276/1.0277/1.0278: 1.0276 might provide repetitive results; 1.0277 is similar to 1.0285/1.0286, but with better stability and steady creativity; 1.0278 is more descriptive, but might be unstable with character details and fixed character type (like Pokemon).
191
+
192
+ 1.0283: will provide more realistic and violent scenes, especially in uncensored models; might mess up some other things, skip certain parts and quicken the pace of events.
193
+
194
+ 1.0285/1.0286: will provide very interesting and creative responses, but will mess up some character details (mostly the ones that are already in the LLM's database).
195
+ Some of them will give inconsistency with specific character parts, like body type, skin type, etc., and will miss certain details or skip some important parts; be sure to include those details and select the best value.
196
  </details>
197
  <details><summary>Top-P:</summary>
198
  Base value: 0.915, which will give very attentive, consistent and stable results. Recommended for all cases.
199
+
200
  0.905: pays more attention to specific details, with slightly less emotion, and is very close to being repetitive.
201
+
202
  0.95/0.97: very creative and unpredictable; might be used for better models, but generally less attentive.
203
  </details>
204
  <details><summary>Top-A:</summary>
 
206
  </details>
207
  <details><summary>Other values (experimental):</summary>
208
  0.043725: more attention to anatomy, but unstable and tends to be unpredictable.
209
+
210
  0.2025: more creative, descriptive, "exciting" and emotional, but tends to skip some details and be slightly more stable compared to 0.045.
211
  </details>
212
  <details><summary>Repetition Range:</summary>
213
  Base value: 64, which is the lowest one that will give overall better, consistent and descriptive results. Recommended for all cases.
214
+
215
  128: will output more descriptive results, but tends to be repetitive. Not recommended.
216
  </details>
217
  <details><summary>Seed:</summary>
218
  Use a fixed seed to improve consistency a lot.
219
  Feel free to use these fixed values:
220
  253991 **main one**
221
+
222
  372205 - second one
223
+
224
  309090 - third one
225
+
226
  680079
227
+
228
  637001
229
+
230
  608575
231
+
232
  132458
233
  </details>
234
  <details><summary>Repetition Slope:</summary>
235
  Base value: 1.12, which will help to max out most things like consistency, level of detail, etc. Recommended for all cases. Higher values won't give any gain.
236
+
237
  All values below 1 are unstable and will cause issues.
238
  </details>
239
  <details><summary>TFS:</summary>
240
  Base value: 0.8413, which will give very smart, attentive and smooth outputs. Recommended for all cases.
241
+
242
  0.9551: will give extremely descriptive and attentive results; the best one as an **assistant**; might cause issues in some LLMs, specifically overly descriptive and attentive results.
243
  </details>
244
  <details><summary>(Additional) recommendations for assistant mode+other tips:</summary>
245
  Disable "Inject Chatnames" to get the best results for a personal assistant.
246
+
247
  Using "Separate End Tags" might help with repetitions and make responses smarter and overall better for some models.
248
  </details>
249
  </details>
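
For convenience, the base values from the fine-tuning notes above can be collected in one place. The sketch below only builds and prints a settings dictionary; it does not talk to any backend. The key names are an assumption modelled on KoboldCpp-style generate payloads and should be checked against the backend you actually use, and `with_overrides` is a hypothetical helper.

```python
import json

# Key names are an assumption modelled on KoboldCpp-style generate payloads;
# verify them against the API of the backend you actually use.
base_settings = {
    "temperature": 2.4,
    "top_k": 206,            # 278 is the other value listed for Temperature 2.4
    "top_p": 0.915,
    "top_a": 0.07,
    "tfs": 0.8413,           # 0.9551 is the assistant-oriented alternative
    "rep_pen": 1.12082,
    "rep_pen_range": 64,
    "rep_pen_slope": 1.12,
    "sampler_seed": 253991,  # the "main one" from the Seed list
}

def with_overrides(settings, **overrides):
    """Return a copy of the settings with selected values replaced."""
    merged = dict(settings)
    merged.update(overrides)
    return merged

# Assistant-style variant: same base, but with the higher TFS value.
assistant_settings = with_overrides(base_settings, tfs=0.9551)
print(json.dumps(assistant_settings, indent=2))
```

Keeping the base untouched and deriving variants from it makes it easier to compare one changed value at a time, which matters here because the values are described as very sensitive to small adjustments.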