---
language:
- en
- zh
- multilingual
license: apache-2.0
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- embedding
- text-embedding
- retrieval
- onnx
pipeline_tag: sentence-similarity
base_model: Qwen/Qwen3-Embedding-0.6B
---
# Octen-Embedding-0.6B
Octen-Embedding-0.6B is a text embedding model developed by [Octen](https://octen.ai/) for semantic search and retrieval tasks. This model is fine-tuned from [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) and supports multiple languages, providing high-quality embeddings for various applications.
## Key Highlights
### 🥇 RTEB Leaderboard Champion (as of January 12, 2026)
- **Octen-Embedding-8B ranks #1 on the [RTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)** with Mean (Task) score of **0.8045**
- Excellent performance on both Public (0.7953) and Private (0.8157) datasets
- Demonstrates true generalization capability without overfitting to public benchmarks
### Industry-Oriented Vertical Domain Expertise
- **Legal**: Legal document retrieval
- **Finance**: Financial reports, Q&A, and personal finance content
- **Healthcare**: Medical Q&A, clinical dialogues, and health consultations
- **Code**: Programming problems, code search, and SQL queries
### Ultra-Long Context Support
- Supports up to **32,768 tokens** context length
- Suitable for processing long documents in legal, healthcare, and other domains
- High-dimensional embedding space for rich semantic representation
### Multilingual Capability
- Supports **100+ languages**
- Includes various programming languages
- Strong multilingual, cross-lingual, and code retrieval capabilities
---
## Open Source Model List
| Model Type | Model | Size | Max Tokens | Embedding Dimensions | Status |
|------------|-------|------|------------|----------------------|--------|
| Text Embedding | [Octen-Embedding-0.6B](https://huggingface.co/Octen/Octen-Embedding-0.6B) | 0.6B | 32,768 | 1024 | ✅ Available |
| Text Embedding | [Octen-Embedding-4B](https://huggingface.co/Octen/Octen-Embedding-4B) | 4.0B | 32,768 | 2560 | ✅ Available |
| Text Embedding | [Octen-Embedding-8B](https://huggingface.co/Octen/Octen-Embedding-8B) | 7.6B | 32,768 | 4096 | ✅ Available |
**Model Family Design**:
- **Octen-Embedding-8B**: Best performance, RTEB #1, for high-precision retrieval
- **Octen-Embedding-4B**: Best in 4B category, balanced performance and efficiency
- **Octen-Embedding-0.6B**: Lightweight deployment, suitable for edge devices and resource-constrained environments
For API access, deployment solutions, and technical documentation, visit [octen.ai](https://octen.ai/).
---
## Experimental Results
### RTEB Leaderboard (Overall Performance)
| Model | Embedding Dim | Max Tokens | Mean (Public) | Mean (Private) | Mean (Task) |
|-------|---------------|------------|---------------|----------------|-------------|
| **Octen-Embedding-8B** | **4096** | **32768** | **0.7953** | **0.8157** | **0.8045** |
| voyage-3-large | 1024 | 32000 | 0.7434 | 0.8277 | 0.7812 |
| gemini-embedding-001 | 3072 | 2048 | 0.7218 | 0.8075 | 0.7602 |
| **Octen-Embedding-4B** | **2560** | **32768** | **0.7747** | **0.7942** | **0.7834** |
| MoD-Embedding | 2560 | 32768 | 0.7642 | 0.7900 | 0.7758 |
| Qwen3-Embedding-8B | 4096 | 32768 | 0.7310 | 0.7838 | 0.7547 |
| **Octen-Embedding-0.6B** | **1024** | **32768** | **0.7241** | **-** | **-** |
| voyage-3.5 | 1024 | 32000 | 0.7139 | 0.8102 | 0.7571 |
| Cohere-embed-v4.0 | 1536 | 128000 | 0.6534 | 0.7943 | 0.7166 |
| jina-embeddings-v4 | 2048 | 32768 | 0.6652 | 0.7664 | 0.7105 |
| GritLM-7B | 4096 | 32768 | 0.6187 | 0.7385 | 0.6724 |
| text-embedding-3-large | 3072 | 8191 | 0.6110 | 0.7130 | 0.6567 |
| e5-mistral-7b-instruct | 4096 | 32768 | 0.5090 | 0.7091 | 0.5987 |
| NV-Embed-v2 | 4096 | 32768 | 0.5805 | 0.6691 | 0.6203 |
| snowflake-arctic-embed-l-v2.0 | 1024 | 8192 | 0.5395 | 0.7079 | 0.6150 |
| multilingual-e5-large-instruct | 1024 | 514 | 0.5478 | 0.6859 | 0.6097 |
| gte-multilingual-base | 768 | 8192 | 0.5291 | 0.6697 | 0.5921 |
| text-embedding-3-small | 1536 | 8191 | 0.5260 | 0.6630 | 0.5874 |
| bge-m3 | 1024 | 8194 | 0.5216 | 0.6726 | 0.5893 |
| Qwen3-Embedding-4B | 2560 | 32768 | - | 0.7711 | - |
| Qwen3-Embedding-0.6B | 1024 | 32768 | - | 0.7117 | - |
---
## Model Details
- **Base Model**: [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)
- **Model Size**: 0.6B parameters
- **Max Sequence Length**: 32,768 tokens
- **Embedding Dimension**: 1024
- **Languages**: English, Chinese, and multilingual support
- **Training Method**: LoRA fine-tuning
## Usage
### Using Sentence Transformers
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("Octen/Octen-Embedding-0.6B")

# Encode sentences
sentences = [
    "This is an example sentence",
    "Each sentence is converted to a vector",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# Output: (2, 1024)

# Compute similarity
similarity = cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")
```
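For retrieval, queries and documents are usually encoded differently. The sketch below assumes the model inherits the `query` prompt from its Qwen3-Embedding base (check the repository's `config_sentence_transformers.json` to confirm); the example texts are illustrative only.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Octen/Octen-Embedding-0.6B")

queries = ["What is the capital of China?"]
documents = [
    "The capital of China is Beijing.",
    "Gravity causes objects to fall toward the Earth.",
]

# prompt_name="query" is an assumption carried over from Qwen3-Embedding;
# documents are encoded without a prompt.
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

# model.similarity() computes cosine similarity by default
scores = model.similarity(query_embeddings, document_embeddings)
print(scores)  # the first document should score higher
```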
### Using Transformers
```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Left padding keeps the final position aligned with the last real token,
# which the last-token pooling below relies on.
tokenizer = AutoTokenizer.from_pretrained("Octen/Octen-Embedding-0.6B", padding_side="left")
model = AutoModel.from_pretrained("Octen/Octen-Embedding-0.6B")
model.eval()

def encode(texts):
    inputs = tokenizer(
        texts, padding=True, truncation=True,
        max_length=8192, return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # Use the last-token embedding (valid because of left padding)
    embeddings = outputs.last_hidden_state[:, -1, :]
    # L2-normalize so dot products equal cosine similarities
    embeddings = F.normalize(embeddings, p=2, dim=1)
    return embeddings

# Example usage
texts = ["Hello world", "你好世界"]
embeddings = encode(texts)
similarity = torch.matmul(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.4f}")
```
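The snippet above truncates at 8,192 tokens to keep memory modest; the model itself accepts up to 32,768. A minimal long-document sketch reusing the `tokenizer` and `model` from the snippet above (the file path is hypothetical):

```python
# Raise max_length toward the 32,768-token window for long documents.
# Memory grows with sequence length, so reduce batch size accordingly.
with open("contract.txt") as f:  # hypothetical long legal document
    long_text = f.read()

long_inputs = tokenizer(
    [long_text], padding=True, truncation=True,
    max_length=32768, return_tensors="pt",
)
with torch.no_grad():
    long_outputs = model(**long_inputs)
long_embedding = F.normalize(long_outputs.last_hidden_state[:, -1, :], p=2, dim=1)
print(long_embedding.shape)  # (1, 1024)
```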
## Recommended Use Cases
- Semantic search and information retrieval
- Document similarity and clustering (see the sketch after this list)
- Question answering
- Cross-lingual retrieval
- Text classification with embeddings
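As an illustration of the clustering use case, here is a minimal sketch; the documents and the cluster count are arbitrary, and scikit-learn is not a dependency of this model, just a convenient choice here.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("Octen/Octen-Embedding-0.6B")
docs = [
    "Stocks rallied after the quarterly earnings report.",
    "The central bank raised interest rates by 25 basis points.",
    "The patient presented with a persistent dry cough.",
    "Antibiotics are ineffective against viral infections.",
]
embeddings = model.encode(docs, normalize_embeddings=True)

# Two clusters: finance vs. healthcare (an arbitrary choice for this demo)
labels = KMeans(n_clusters=2, n_init="auto", random_state=0).fit_predict(embeddings)
for label, doc in sorted(zip(labels, docs)):
    print(label, doc)
```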
## Limitations
- Performance may vary across different domains and languages
- Very long documents (>32K tokens) require truncation; a chunking workaround is sketched after this list
- Optimized for retrieval tasks, not for text generation
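One generic workaround for the >32K limitation (not part of this model's own pipeline) is to split the document into chunks, embed each, and mean-pool; a rough sketch, with character-based splitting standing in for proper token-based chunking:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Octen/Octen-Embedding-0.6B")

def embed_long(text: str, chunk_chars: int = 20_000) -> np.ndarray:
    """Chunk an over-length document, embed chunks, mean-pool, renormalize."""
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    vecs = model.encode(chunks, normalize_embeddings=True)
    pooled = vecs.mean(axis=0)
    return pooled / np.linalg.norm(pooled)

vec = embed_long("some very long document " * 5000)
print(vec.shape)  # (1024,)
```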
## License
This model is licensed under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
This model is derived from [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B), which is also licensed under Apache License 2.0.
## Blog
For more details, please refer to our blog post: [Octen Series: Optimizing Embedding Models to #1 on RTEB Leaderboard](https://octen-team.github.io/octen_blog/posts/octen-rteb-first-place/)
## Citation
If you find our work helpful, please consider citing:
```bibtex
@misc{octen2025rteb,
title={Octen Series: Optimizing Embedding Models to #1 on RTEB Leaderboard},
author={Octen Team},
year={2025},
url={https://octen-team.github.io/octen_blog/posts/octen-rteb-first-place/}
}
```