kalle07 committed · Commit da8a392 · verified · 1 Parent(s): 57fe948

Update README.md

README.md CHANGED
 
<li>Ger-RAG-BGE-M3 (german)</li>
<li>german-roberta</li>
</ul>
These work well; all others are up to you! (jina- and qwen-based embedders are not yet supported)
<br>

<b>Short hints for use (example for a large context with many expected hits):</b><br>
Set your main model's context length (Max Tokens) to 16000 tokens, set your embedder model's Max Embedding Chunk Length to 1024 tokens, and set Max Context Snippets to 14;
but ALLM cuts everything into 1024-character parts, so count on roughly two times as many snippets or a bit more, ~20!
<br>
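As a rough back-of-the-envelope check of that budget (a sketch only: the ~4 characters per token ratio is a common heuristic for English text, not an ALLM constant):

```python
# Rough budget check for the settings above.
# Assumption: ~4 characters per token (common English heuristic).
CHARS_PER_TOKEN = 4

def max_snippets(context_tokens: int, chunk_chars: int, reserve_tokens: int = 2000) -> int:
    """How many chunks fit into the context window, leaving room for the answer."""
    chunk_tokens = chunk_chars // CHARS_PER_TOKEN  # 1024 chars -> ~256 tokens
    return (context_tokens - reserve_tokens) // chunk_tokens

# A 16000-token context with 1024-character chunks leaves ample headroom
# for the ~20 snippets suggested above.
print(max_snippets(16000, 1024))  # -> 54
```

So even ~20 snippets of 1024 characters consume well under half of a 16000-token window; the rest stays available for the question and the answer.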

-> OK, what does that mean?<br>
...
<br>
<b>How embedding and search works:</b><br>
You have a txt/pdf file, maybe a book of 90000 words (~300 pages). You ask the model, say, "What is described in the chapter called XYZ in relation to person ZYX?"
It now searches the document for keywords or semantically similar terms. If it finds them, say the words and meaning around "XYZ and ZYX",
then a piece of text of 1024 tokens around this occurrence of "XYZ/ZYX" is cut out at that point.
This text snippet is then used for your answer. If, for example, the word "XYZ" occurs 100 times in one file, not all 100 are found.
If only one snippet matches your question, all the other snippets can negatively influence your answer because they do not fit the topic (usually 4 to 32 snippets are fine).
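That search-and-cut loop can be sketched in a few lines (a toy illustration only: a real embedder such as BGE-M3 produces dense semantic vectors, whereas this sketch fakes similarity with plain word counts):

```python
# Toy sketch of chunk -> embed -> top-k retrieval.
# NOTE: embed() is a fake bag-of-words stand-in for a real embedding model.
from collections import Counter
from math import sqrt

def chunk(text: str, size: int = 1024) -> list[str]:
    """Cut the document into fixed-size character pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_snippets(doc: str, question: str, k: int = 4) -> list[str]:
    """Return the k chunks most similar to the question (k = Max Context Snippets)."""
    q = embed(question)
    return sorted(chunk(doc), key=lambda p: cosine(embed(p), q), reverse=True)[:k]
```

Only the top-k pieces end up in the prompt, which is why a question whose terms appear in many places may need a larger snippet count.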
65
  <br>
66
+ if you expect multible search results in your docs try 16-snippets or more, if you expect only 2 than dont use more!
67
  <br>
If you use a snippet size of ~1024 tokens you get more content; with ~256 tokens you get more facts.
<br>
A question asking for a "summary of the document" is usually not useful; if the document has an introduction or summaries, the search will land there if you are lucky.
<br>
If a book has a table of contents or a bibliography, I would delete those pages, as they often contain relevant search terms but do not help answer the user's question.
<br>
If the document is small, like 10-20 pages, it is better to copy the whole text into the prompt.
<br>
...
<br>
Nevertheless, the main model is also <b>important</b>! Especially in how it handles the context length, and I don't mean just the theoretical number you can set.
Some models can handle 128k or even 1M tokens, yet even with 16k of input their response to the same snippets is worse than that of other well-developed models.<br>
<br>
<b>Important -> The system prompt (an example):</b><br>
You are a helpful assistant who provides an overview of ... under the aspects of ... .
Weight each individual excerpt in order, with the most important excerpts at the ...
The context of the entire article should not be given too much weight.
Answer the user's question!
After your answer, briefly explain why you included excerpts (1 to X) in your response, and briefly justify if you considered some of them unimportant!<br>
<i>(Change it for your needs; this example works well when I consult a book about a person and a term related to them. The explanation step was just a test for myself.)</i><br>
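For intuition, here is a made-up sketch of how the numbered excerpts and the system prompt could be combined into one request (ALLM builds this internally; the function and variable names below are hypothetical):

```python
# Hypothetical prompt assembly: system prompt first, then numbered excerpts,
# then the user's question. Names are illustrative, not ALLM's real API.
SYSTEM_PROMPT = (
    "You are a helpful assistant who provides an overview of the topic. "
    "Weight each individual excerpt in order, most important first. "
    "Answer the user's question!"
)

def build_prompt(question: str, snippets: list[str]) -> str:
    excerpts = "\n".join(f"Excerpt {i + 1}: {s}" for i, s in enumerate(snippets))
    return f"{SYSTEM_PROMPT}\n\n{excerpts}\n\nQuestion: {question}"

print(build_prompt("What links ZYX to XYZ?",
                   ["ZYX wrote the chapter XYZ.", "XYZ appeared in 1901."]))
```

Because the excerpts are numbered, the model can refer back to them, which is what the "explain why you included excerpts (1 to X)" instruction relies on.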
or:<br>
You are an imaginative storyteller who crafts compelling narratives with depth, creativity, and coherence.
Your goal is to develop rich, engaging stories that captivate readers, staying true to the themes, tone, and style appropriate for the given prompt.
When generating stories, ensure the coherence in characters, setting, and plot ...
or:<br>
You are a warm and engaging companion who loves to talk about cooking, recipes, and the joy of food.
Your aim is to share delicious recipes, cooking tips, and the stories behind different cultures in a personal, welcoming and knowledgeable way.<br>
<br>
The system prompt is weighted with a certain amount of influence around your question; you can easily test this by running once without a system prompt or with a nonsensical one.
<br><br>
Usual models that work:<br>
llama3.1, llama3.2, qwen2.5, deepseek-r1-distill, SauerkrautLM-Nemo (german) ... <br>
(llama3 or phi3.5 do not work well) <br>

btw. <b>Jinja</b> templates are very new ... the usual templates with the usual models are fine, but merged models have a lot of optimization potential (but don't ask me, I am not a coder)<br>
 
...
<br>
<br>
...
on Discord (sevenof9)
...
<br>