add known issues
README.md CHANGED

@@ -93,6 +93,18 @@ print(f"Similarity: {similarity.item():.4f}")
 - Cross-lingual retrieval
 - Text classification with embeddings
 
+## Known Issues
+
+When encoding documents without any instruction prefix, you may encounter unexpected behavior due to an [upstream issue in Qwen3-Embedding](https://huggingface.co/Qwen/Qwen3-Embedding-8B/discussions/21). To avoid this issue, we recommend adding `"- "` (a dash followed by a space) at the beginning of your text when encoding documents:
+
+```python
+# Recommended: add a "- " prefix for document encoding
+documents = ["- " + doc for doc in documents]
+embeddings = model.encode(documents)
+```
+
+This workaround ensures consistent and expected embedding behavior.
+
 ## Limitations
 
 - Performance may vary across different domains and languages
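The prefixing step added in this diff can be wrapped in a small helper so the workaround is applied exactly once per document. This is a minimal sketch of that idea; the `add_doc_prefix` name is ours, not part of the README, and it only prepares the strings for `model.encode`:

```python
def add_doc_prefix(docs):
    """Prepend the '- ' workaround prefix, skipping docs that already carry it.

    Idempotent: calling it twice does not stack prefixes, so it is safe to
    apply defensively before every encode() call.
    """
    return [doc if doc.startswith("- ") else "- " + doc for doc in docs]


documents = ["first document", "- already prefixed document"]
prepared = add_doc_prefix(documents)
print(prepared)  # ['- first document', '- already prefixed document']
```

The prepared list would then be passed to `model.encode(prepared)` as in the README snippet above.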