Update README.md
README.md
CHANGED
|
@@ -100,13 +100,15 @@ Now it searches for keywords or similar semantic terms in the document. if it ha
|
|
| 100 |
Now a piece of text of about 1024 tokens around the word “XYZ/ZYX” is cut out at this point. (In reality, everything is done with coded numbers per chunk, which is why you cannot search for single numbers or words, but the principle is the same.)<br>
|
| 101 |
This text snippet is then used for your answer. <br>
|
| 102 |
<ul style="line-height: 1.05;">
|
| 103 |
-
<li>If, for example, the word “XYZ” occurs 50 times in one
|
| 104 |
<li>If only one snippet matches your question, all the other snippets can negatively influence your answer because they do not fit the topic (usually 4 to 32 snippets are fine).</li>
|
| 105 |
<li>If you expect multiple search results in your docs, try 16 snippets or more; if you expect only 2, then do not use more!</li>
|
| 106 |
<li>If you use a chunk length of ~2048 chars, you receive more content per snippet; with ~512 chars you receive more individual facts, BUT a shorter chunk length means more chunks and takes much longer.</li>
|
| 107 |
<li>Asking for a "summary of the document" is usually not useful; if the document has an introduction or summaries, the search lands there if you are lucky.</li>
|
| 108 |
<li>If a book has a table of contents or a bibliography, I would delete these pages as they often contain relevant search terms but do not help answer your question.</li>
|
| 109 |
<li>If the document is small, around 10-20 pages, it is better to copy the whole text into the chat; some apps call this option "pin".</li>
|
|
|
|
|
|
|
| 110 |
</ul>
|
| 111 |
<br>
|
| 112 |
...
|
|
|
|
| 100 |
Now a piece of text of about 1024 tokens around the word “XYZ/ZYX” is cut out at this point. (In reality, everything is done with coded numbers per chunk, which is why you cannot search for single numbers or words, but the principle is the same.)<br>
|
| 101 |
This text snippet is then used for your answer. <br>
|
| 102 |
<ul style="line-height: 1.05;">
|
| 103 |
+
<li>If, for example, the word/meaning “XYZ” occurs 50 times in one TXT file, not all 50 occurrences are used for the answer; only the chosen number of snippets with the best ranking are used.</li>
|
| 104 |
<li>If only one snippet matches your question, all the other snippets can negatively influence your answer because they do not fit the topic (usually 4 to 32 snippets are fine).</li>
|
| 105 |
<li>If you expect multiple search results in your docs, try 16 snippets or more; if you expect only 2, then do not use more!</li>
|
| 106 |
<li>If you use a chunk length of ~2048 chars, you receive more content per snippet; with ~512 chars you receive more individual facts, BUT a shorter chunk length means more chunks and takes much longer.</li>
|
| 107 |
<li>Asking for a "summary of the document" is usually not useful; if the document has an introduction or summaries, the search lands there if you are lucky.</li>
|
| 108 |
<li>If a book has a table of contents or a bibliography, I would delete these pages as they often contain relevant search terms but do not help answer your question.</li>
|
| 109 |
<li>If the document is small, around 10-20 pages, it is better to copy the whole text into the chat; some apps call this option "pin".</li>
|
| 110 |
+
<li>If a TXT file is embedded, you cannot create a summary! Only the snippets found are used for this purpose.</li>
|
| 111 |
+
<li>The same applies to word search or page search—in most cases, it does not work because it is not a word search but a search for similar expressions.</li>
|
| 112 |
</ul>
|
| 113 |
<br>
|
| 114 |
...
|
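
The tips in the list above (only the best-ranked snippets are used, and shorter chunks mean more chunks) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `split_into_chunks`, `toy_vector`, `cosine`, and `top_snippets` are hypothetical, and a plain bag-of-words count stands in for the real embedding model, which compares meaning rather than exact words.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_chars: int) -> list[str]:
    """Cut the text into consecutive pieces of at most chunk_chars characters."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]


def toy_vector(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_snippets(question: str, chunks: list[str], n_snippets: int) -> list[str]:
    """Rank all chunks against the question and keep only the n_snippets best."""
    q = toy_vector(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_vector(c)), reverse=True)
    return ranked[:n_snippets]


if __name__ == "__main__":
    document = "XYZ is mentioned here. " * 50  # the word occurs 50 times
    small = split_into_chunks(document, 512)
    large = split_into_chunks(document, 2048)
    print(len(small), "chunks at ~512 chars,", len(large), "chunk at ~2048 chars")
    # Only the requested number of snippets reaches the answer, not all matches.
    print(len(top_snippets("Where is XYZ?", small, 2)), "snippets used")
```

Even in this toy version, the answer only ever sees `n_snippets` pieces of the document, which is why a summary or an exact word search over the whole file does not work.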