chulcher committed 50f8389 · verified · parent: 8937cc3

Upload README.md with huggingface_hub

Files changed (1): README.md (+56, −0)
---
tags:
- mlx
- embeddings
- apple-silicon
- sentence-transformers
license: apache-2.0
base_model: Octen/Octen-Embedding-8B
library_name: mlx
---

# Octen-Embedding-8B-mlx

Pre-converted [MLX](https://github.com/ml-explore/mlx) weights for [Octen-Embedding-8B](https://huggingface.co/Octen/Octen-Embedding-8B), ready to run on Apple Silicon.

## Why this exists

Converting the original model to MLX takes roughly 30 minutes and about 32 GB of temporary disk space. This repo provides the already-converted MLX weights so you can start embedding immediately.

## Usage

With [octen-embeddings-server](https://github.com/c-h-/octen-embeddings-server):

```bash
# Clone the server
git clone https://github.com/c-h-/octen-embeddings-server.git
cd octen-embeddings-server
pip install -r requirements.txt

# Download the pre-converted weights (instead of running convert_model.py)
huggingface-cli download chulcher/Octen-Embedding-8B-mlx --local-dir models/Octen-Embedding-8B-mlx

# Start the server
python3 server.py
```

The server exposes an OpenAI-compatible `/v1/embeddings` endpoint at `http://localhost:8100`.
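
As a sketch of what a client call might look like, assuming the server follows the standard OpenAI embeddings request shape (the `model` value below is a guess — check the server's configuration for the name it actually expects):

```python
import json
import urllib.request

# OpenAI-style embeddings request for the local server.
# The "model" value is an assumption -- check server.py for the
# name the server actually registers.
payload = {
    "model": "Octen-Embedding-8B-mlx",
    "input": ["MLX runs natively on Apple Silicon."],
}

req = urllib.request.Request(
    "http://localhost:8100/v1/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# With the server running, uncomment to fetch the embedding:
# with urllib.request.urlopen(req) as resp:
#     embedding = json.loads(resp.read())["data"][0]["embedding"]
```

Any OpenAI-compatible client (e.g. the `openai` Python package pointed at `http://localhost:8100/v1`) should also work.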

## Hardware Requirements

| Component | Requirement |
|-----------|-------------|
| CPU | Apple Silicon (M1/M2/M3/M4) |
| RAM | 20 GB+ |
| Disk | ~16 GB for weights |
| OS | macOS 13+ |

## Performance

Octen-Embedding-8B ranks #1 on MTEB/RTEB with a score of 0.8045, outperforming commercial embedding APIs.

Typical latency on Apple Silicon: ~50–200 ms per text, depending on input length.

## License

Apache 2.0 (same as the base model).