Commit ed8f0d4 by bflhc (parent: 536ca3b)

Add octen.ai links to Octen-Embedding model README

Files changed (1): README.md (+3 -1)
README.md CHANGED

@@ -20,7 +20,7 @@ base_model: Qwen/Qwen3-Embedding-8B
 
 # Octen-Embedding-8B-INT8
 
-Octen-Embedding-8B-INT8 is a text embedding model designed for semantic search and retrieval tasks. This model is fine-tuned from [Qwen/Qwen3-Embedding-8B](https://huggingface.co/Qwen/Qwen3-Embedding-8B) and supports multiple languages, providing high-quality embeddings for various applications.
+Octen-Embedding-8B-INT8 is a text embedding model developed by [Octen](https://octen.ai/) for semantic search and retrieval tasks. This model is fine-tuned from [Qwen/Qwen3-Embedding-8B](https://huggingface.co/Qwen/Qwen3-Embedding-8B) and supports multiple languages, providing high-quality embeddings for various applications.
 
 **Quantization**: This is an INT8 quantized version using bitsandbytes. INT8 quantization significantly reduces memory footprint (~50% smaller), making it suitable for deployment on resource-constrained environments. Note that while memory usage is reduced, inference speed may not necessarily improve and could be slightly slower than the BF16 version on some hardware.
 
@@ -62,6 +62,8 @@ Octen-Embedding-8B-INT8 is a text embedding model designed for semantic search a
 - **Octen-Embedding-4B**: Best in 4B category, balanced performance and efficiency
 - **Octen-Embedding-0.6B**: Lightweight deployment, suitable for edge devices and resource-constrained environments
 
+For API access, deployment solutions, and technical documentation, visit [octen.ai](https://octen.ai/).
+
 ---
 
 ## Experimental Results
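The ~50% memory figure in the quantization paragraph follows from storing each weight in 8 bits instead of 16. As a minimal NumPy sketch of the general idea (symmetric per-tensor INT8 quantization; an illustration only, not the bitsandbytes implementation, and the weight tensor here is a hypothetical stand-in for a model layer):

```python
import numpy as np

# Hypothetical FP16 weight tensor standing in for one model layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float16)

# Symmetric per-tensor quantization: one scale maps max |w| to 127.
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize to approximate the original weights.
w_deq = w_int8.astype(np.float32) * np.float32(scale)

print(w_int8.nbytes / w.nbytes)  # 0.5 -> INT8 storage is half of FP16
print(float(np.abs(w.astype(np.float32) - w_deq).max()))  # small rounding error
```

The halved storage is exact; the price is a bounded rounding error per weight (at most about half the scale), which is why quality is close to, but not identical to, the BF16 version.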