---
title: Text Embedding Model
emoji: 🏢
colorFrom: pink
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: 'This is the Embedding model for the demo application'
---

# eduai-embedder (text-embding-model Space)

Tiny FastAPI service that wraps `sentence-transformers/all-MiniLM-L6-v2` (384-dim, free, CPU) behind four HTTP endpoints. Deployed on this HuggingFace Docker Space so the [eduai_platform](https://github.com/) team doesn't have to install `torch` locally.

## Why this exists

Installing `torch` + `sentence-transformers` reliably on Windows + Conda is a daily blocker. By moving embeddings into a single shared service:

- New contributors clone the platform repo with **no ML deps**.
- The model is loaded **once**, in one place, by one container.
- We can swap to a stronger model (or hosted provider) without touching any client code.

## API

| Method | Path | Auth | Body | Response |
|---|---|---|---|---|
| `GET` | `/` | open | none | `{status, model, dim}` |
| `GET` | `/health` | open | none | `{status, model, dim}` |
| `POST` | `/embed` | `X-API-Key` | `{texts: [str]}` | `{embeddings: [[float]], model, dim}` |
| `POST` | `/embed_one` | `X-API-Key` | `{text: str}` | `{embedding: [float], model, dim}` |

Vectors are L2-normalized, so cosine similarity is just a dot product.

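To make the normalization point concrete, here is a minimal sketch in plain Python. The 3-dim vectors are made up and stand in for the 384-dim embeddings the service returns, and the helper names (`l2_normalize`, `dot`, `cosine`) are illustrative rather than part of this service. It checks that, once both vectors have unit length, the dot product and the full cosine formula agree.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (the service is described as doing this already)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Full cosine similarity, for comparison with the dot-product shortcut."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Made-up vectors standing in for two embeddings from /embed.
query = l2_normalize([1.0, 2.0, 2.0])
doc = l2_normalize([2.0, 1.0, 2.0])

score = dot(query, doc)
assert math.isclose(score, cosine(query, doc))
print(round(score, 4))  # 0.8889
```

In client code this means ranking candidates needs only one dot product per pair, with no extra norm computation.
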
### Example

Once the Space is live at `https://ibrahimdaud-text-embding-model.hf.space`:

```bash
curl https://ibrahimdaud-text-embding-model.hf.space/health
# {"status":"ok","model":"all-MiniLM-L6-v2","dim":384}

curl -X POST https://ibrahimdaud-text-embding-model.hf.space/embed \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EMBEDDER_API_KEY" \
  -d '{"texts": ["What is a quadratic?", "Define discriminant."]}' | jq .model
# "all-MiniLM-L6-v2"
```

## Local development

```bash
python -m venv .venv
source .venv/bin/activate        # Linux / macOS
# .venv\Scripts\activate         # Windows
pip install -r requirements.txt
cp .env.example .env             # then set EMBEDDER_API_KEY
uvicorn app:app --reload --port 7860
# http://127.0.0.1:7860/health
# http://127.0.0.1:7860/docs  (Swagger UI)
```

## Docker (mirrors what HF Spaces does)

```bash
docker build -t eduai-embedder .
docker run --rm -p 7860:7860 \
  -e EMBEDDER_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')" \
  eduai-embedder
```

## Configuring the Space

1. **Add the secret.** Space → Settings → Variables and secrets → *New secret* → name `EMBEDDER_API_KEY`, value = a URL-safe token built from 32 random bytes (about 43 characters):

   ```bash
   python -c "import secrets; print(secrets.token_urlsafe(32))"
   ```

   Save the same value into every team member's `eduai_platform/.env` as `EMBEDDING_API_KEY`. Note that the variable name differs: `EMBEDDER_API_KEY` on the Space, `EMBEDDING_API_KEY` in the platform.

2. **Push from this folder:**

   ```bash
   git add .
   git commit -m "deploy embedding service"
   git push origin main
   ```

   First push: ~5 min (Docker build + model download). Subsequent pushes only rebuild if `requirements.txt` or `Dockerfile` change.

3. **Watch the build.** Space dashboard → Logs tab. You should see:

   ```
   eduai-embedder INFO Loading sentence-transformers model: all-MiniLM-L6-v2 ...
   eduai-embedder INFO Model loaded (dim=384, ...)
   INFO Application startup complete.
   ```

4. **Wire it into eduai_platform.** Add to `eduai_platform/.env`:

   ```
   EMBEDDING_PROVIDER=remote
   EMBEDDING_API_URL=https://ibrahimdaud-text-embding-model.hf.space
   EMBEDDING_API_KEY=
   ```

## Operations

- **Cold starts.** HuggingFace Spaces puts free CPU instances to sleep after inactivity. The first request after sleep takes ~30 s. The chat UI's loading indicator covers this; we may add a weekly GitHub Actions cron pinging `/health` to keep it warm.
- **Rotating the API key.** Bump the secret in Space settings, then update every team `.env`. No code change. The old key is invalidated immediately.
- **Switching the model.** Set `EMBEDDER_MODEL_NAME` (Space secret or Dockerfile `ARG MODEL`) and redeploy. **Important:** if `dim` changes (e.g. switching to a 768-dim model), every existing embedding in the vector store must be regenerated.

## Limits

The service rejects:

- batches with more than `EMBEDDER_MAX_BATCH` (default 128) texts → 400
- any text longer than `EMBEDDER_MAX_TEXT_LEN` (default 8000) chars → 400
- requests without a valid `X-API-Key` when one is configured → 401

## License

Apache 2.0 (matches the Space metadata above and the model license).