---
title: Embedding
emoji: 🐠
colorFrom: purple
colorTo: gray
sdk: docker
pinned: false
short_description: Simple API run sentence-transformers/all-MiniLM-L6-v2
---

# Embedder Service (HuggingFace Space)

A lightweight microservice exposing sentence-transformers embeddings over HTTP.

- Model: `sentence-transformers/all-MiniLM-L6-v2`
- Sequential queueing: handles one request at a time to avoid resource spikes.

## Endpoints

- `GET /health` → `{ ok: true, model: string, loaded: boolean }`
- `POST /embed`
  - Request:
    ```
    { "texts": ["hello world", "another document"] }
    ```
  - Response:
    ```
    {
      "vectors": [[0.01, -0.02, ...], [0.03, -0.01, ...]],
      "model": "sentence-transformers/all-MiniLM-L6-v2"
    }
    ```

## Deploy on HF Spaces

1. Create a new Space (Docker type)
2. Upload `app.py`, `Dockerfile`, `requirements.txt`
3. Set Space hardware to CPU (Small is fine)
4. Space will run on port 7860 by default

## Example cURL

```
curl -s -X POST https://binkhoale1812-embedding.hf.space/embed \
  -H 'Content-Type: application/json' \
  -d '{"texts": ["An embedding request", "Second input"]}' | jq .
```

## Notes

- The service lazily loads the model on first request.
- If concurrent clients hit it, requests are serialized by a semaphore to reduce memory and CPU spikes.
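
The lazy loading and semaphore behaviour described in the notes can be illustrated with a small sketch. `app.py` is not reproduced in this README, so the code below is an assumption about how such a service might be structured, not the actual implementation. `SerializedEmbedder` and `load_model` are hypothetical names; in a real deployment `load_model` would presumably be something like `lambda: SentenceTransformer(MODEL_NAME)`.

```python
import asyncio

MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"


class SerializedEmbedder:
    """Hypothetical wrapper: lazy model load plus one-at-a-time encoding."""

    def __init__(self, load_model):
        # load_model is an assumed zero-argument factory that returns an
        # object with an encode(list_of_texts) method.
        self._load_model = load_model
        self._model = None
        self._gate = None  # created on first use, inside the running loop

    @property
    def loaded(self):
        # Mirrors the `loaded` flag reported by GET /health.
        return self._model is not None

    async def embed(self, texts):
        if self._gate is None:
            # A semaphore with a single permit serializes all requests.
            self._gate = asyncio.Semaphore(1)
        async with self._gate:
            if self._model is None:
                # First request pays the model-loading cost.
                self._model = await asyncio.to_thread(self._load_model)
            # Run the blocking encode call off the event loop thread.
            vectors = await asyncio.to_thread(self._model.encode, texts)
        return {
            "vectors": [list(map(float, v)) for v in vectors],
            "model": MODEL_NAME,
        }
```

Because the load happens inside the semaphore, concurrent first requests trigger only one model load, and at most one `encode` call runs at any time. This requires Python 3.9 or newer for `asyncio.to_thread`.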
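
For callers that prefer Python over cURL, a minimal client for `POST /embed` might look like the following. It uses only the standard library. The helper names (`build_embed_request`, `parse_embed_response`, `embed`) are hypothetical, and the check that the service returns exactly one vector per input text is an assumption based on the example response above, not a documented guarantee.

```python
import json
import urllib.request


def build_embed_request(base_url, texts):
    """Build the POST /embed request without sending it."""
    payload = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(body, expected_count):
    """Extract the vectors from a JSON response body (str or bytes)."""
    data = json.loads(body)
    vectors = data["vectors"]
    # Assumption: one vector per input text, as in the example response.
    if len(vectors) != expected_count:
        raise ValueError(
            f"expected {expected_count} vectors, got {len(vectors)}"
        )
    return vectors


def embed(base_url, texts, timeout=60.0):
    """Send texts to the service and return their embedding vectors."""
    texts = list(texts)
    request = build_embed_request(base_url, texts)
    # Generous default timeout: the first request also loads the model.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embed_response(response.read(), len(texts))
```

Calling `embed("https://binkhoale1812-embedding.hf.space", ["An embedding request", "Second input"])` would send the same payload as the cURL example. Since requests are handled one at a time, a client issuing many calls may see latency grow with queue depth, so batching several texts into one request is likely the better choice.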