kalle07 committed on
Commit
2cdef30
·
verified ·
1 Parent(s): 034d3ff

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -100,13 +100,15 @@ Now it searches for keywords or similar semantic terms in the document. if it ha
103
- <li>If, for example, the word “XYZ” occurs 50 times in one file, not all 50 are used for answer, only the number of snippets with a fast ranking are used</li>
100
  now a piece of text of about 1024 tokens around the word “XYZ/ZYX” is cut out at this point. (In reality, everything is encoded as numbers per chunk, which is why you cannot search for single numbers or words, but that does not matter here; the principle is the same.)<br>
101
  This text snippet is then used for your answer. <br>
102
  <ul style="line-height: 1.05;">
103
+ <li>If, for example, the word or meaning “XYZ” occurs 50 times in one TXT file, not all 50 occurrences are used for the answer; only the configured number of best-ranked snippets is used.</li>
104
  <li>If only one snippet corresponds to your question, all the other snippets can negatively influence your answer because they do not fit the topic (usually 4 to 32 snippets are fine).</li>
105
  <li>If you expect multiple search results in your docs, try 16 snippets or more; if you expect only 2, then do not use more!</li>
106
  <li>If you use a chunk length of ~2048 characters you receive more content; if you use ~512 characters you receive more facts, BUT a shorter chunk length means more chunks and a much longer processing time.</li>
107
  <li>Asking for a "summary of the document" is usually not useful; if the document has an introduction or summaries, the search lands there if you are lucky.</li>
108
  <li>If a book has a table of contents or a bibliography, I would delete these pages as they often contain relevant search terms but do not help answer your question.</li>
109
  <li>If the document is small, around 10-20 pages, it is better to copy the whole text into the chat; some apps offer an option called "pin" for this.</li>
110
+ <li>If a TXT file is embedded, you cannot create a summary of it! Only the snippets that were found are used for the answer.</li>
111
+ <li>The same applies to word search or page search: in most cases it does not work, because this is not a literal word search but a search for similar expressions.</li>
112
  </ul>
113
  <br>
114
  ...
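
The snippet selection described in the list above can be sketched roughly as follows. This is only a minimal illustration, not how any particular embedding tool is implemented: the word-count `embed` function is a toy stand-in for a real embedding model, and the names `chunk_text`, `top_k_snippets`, `chunk_len` and `k` are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_len: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_len characters."""
    return [text[i:i + chunk_len] for i in range(0, len(text), chunk_len)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_snippets(question: str, chunks: list[str], k: int) -> list[str]:
    """Rank all chunks against the question and keep only the k best."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# "XYZ" occurs 50 times, but only k snippets are passed on to the answer.
document = "XYZ is explained here. " * 50 + "Unrelated text about cooking. " * 50
chunks = chunk_text(document, chunk_len=512)
snippets = top_k_snippets("What is XYZ?", chunks, k=4)
print(len(chunks), len(snippets))
```

With `k=4`, only the four best-ranked of the six chunks reach the answer, even though “XYZ” occurs 50 times in the text. A smaller `chunk_len` produces more chunks to rank, which is why shorter chunks take longer.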