Commit 03d90b7 (verified) · committed by kalle07 · 1 Parent(s): 50050f0

Update README.md

Files changed (1)
  1. README.md +8 -2
README.md CHANGED
@@ -50,7 +50,12 @@ BTW embedder is only a part of a good RAG<br>
  <li>bge-m3 (up to 8192t context length)</li>
  </ul>
  Working well, all other its up to you! Some models are very similar! (jina and qwen based not yet supported by LM-Studio)<br>
- With the same setting, these embedders found same 6-7 snippets out of 10 from a book. This means that only 3-4 snippets were different, but I didn't test it extensively.
+ With the same setting, these embedders found same 6-7 snippets out of 10 from a book. This means that only 3-4 snippets were different, but I didn't test it extensively.<br>
+ Further tests have shown that the following models are suitable for complex tasks (German-text, but should be similar in English). Jina-DE, nomic was not that good.
+ <ul style="line-height: 1.05;">
+ <li>GTE large</li>
+ <li>cross-en-de-es-roberta</li>
+
  <br>
  <br>
  ...
@@ -212,11 +217,12 @@ docfetcher - https://docfetcher.sourceforge.io/en/index.html (yes old but very u
  <li>Snowflake/snowflake-arctic-embed-l-v2.0 (English, multi)</li>
  <li>intfloat/multilingual-e5-large-instruct (100 languages)</li>
  <li>T-Systems-onsite/german-roberta-sentence-transformer-v2</li>
+ <li>T-Systems-onsite/cross-en-de-es-roberta-sentence-transformer (English, German, Spanish)</li>
  <li>mixedbread-ai/mxbai-embed-2d-large-v1</li>
  <li>jinaai/jina-embeddings-v2-base-en</li>
  <li>Qwen/Qwen3-Embedding-0.6B</li>
  <li>HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5</li>
-
+ <li>thenlper/gte-large</li>
  </ul>
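
The README text in this change says that, with the same settings, different embedders returned the same 6-7 snippets out of 10. The commit does not show how that was counted. One way it could be measured is sketched below, assuming each embedder's top-10 results are available as a list of snippet IDs. The function name `topk_overlap` and the example IDs are hypothetical and not part of the repository.

```python
def topk_overlap(results_a, results_b, k=10):
    """Return (shared_count, shared_fraction) for the top-k results of two embedders.

    results_a / results_b are ranked lists of snippet IDs (best match first).
    Ranking order inside the top-k is ignored; only membership counts.
    """
    top_a = set(results_a[:k])
    top_b = set(results_b[:k])
    shared = top_a & top_b
    return len(shared), len(shared) / k


# Hypothetical snippet IDs returned by two embedders for the same query.
embedder_a = [3, 17, 21, 4, 8, 30, 12, 9, 41, 5]
embedder_b = [3, 21, 17, 8, 4, 12, 55, 60, 9, 72]

count, fraction = topk_overlap(embedder_a, embedder_b)
print(f"{count} of 10 snippets shared ({fraction:.0%})")
```

Averaging this count over several queries would give a less anecdotal figure than a single comparison. Note that the fraction is computed against `k`, so it understates the overlap if either list holds fewer than `k` results.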