| title: Text Embedding Model | |
| emoji: π’ | |
| colorFrom: pink | |
| colorTo: red | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: 'This is the embedding model for the demo application' | |
| # eduai-embedder (text-embding-model Space) | |
| Tiny FastAPI service that wraps `sentence-transformers/all-MiniLM-L6-v2` | |
| (384-dim, free, CPU) behind three HTTP endpoints. Deployed on this | |
| HuggingFace Docker Space so the [eduai_platform](https://github.com/) | |
| team doesn't have to install `torch` locally. | |
| ## Why this exists | |
| Installing `torch` + `sentence-transformers` reliably on Windows + Conda | |
| is a daily blocker. Moving embeddings into a single shared service means: | |
| - New contributors clone the platform repo with **no ML deps**. | |
| - The model is loaded **once**, in one place, by one container. | |
| - We can swap to a stronger model (or hosted provider) without touching | |
| any client code. | |
| ## API | |
| | Method | Path | Auth | Body | Response | | |
| |---|---|---|---|---| | |
| `GET` | `/` | open | none | `{status, model, dim}` | |
| `GET` | `/health` | open | none | `{status, model, dim}` | |
| | `POST` | `/embed` | `X-API-Key` | `{texts: [str]}` | `{embeddings: [[float]], model, dim}` | | |
| | `POST` | `/embed_one` | `X-API-Key` | `{text: str}` | `{embedding: [float], model, dim}` | | |
| Vectors are L2-normalized so cosine similarity is just a dot product. | |
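The claim above can be checked with a few lines of plain Python. This is a minimal sketch using toy 3-dim vectors in place of the service's 384-dim embeddings; the helper names (`l2_normalize`, `dot`, `cosine`) are illustrative and not part of the service.

```python
import math


def l2_normalize(vec):
    """Scale vec so that its L2 norm is 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed the long way, with explicit norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for raw (un-normalized) embeddings.
raw_a = [1.0, 2.0, 2.0]
raw_b = [2.0, 0.0, 1.0]

unit_a = l2_normalize(raw_a)
unit_b = l2_normalize(raw_b)

# For unit vectors both norms are 1, so the denominator in cosine() drops out.
similarity = dot(unit_a, unit_b)
```

Because the service already returns unit vectors, a client only needs `dot` to rank results.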
| ### Example | |
| Once the Space is live at `https://ibrahimdaud-text-embding-model.hf.space`: | |
| ```bash | |
| curl https://ibrahimdaud-text-embding-model.hf.space/health | |
| # {"status":"ok","model":"all-MiniLM-L6-v2","dim":384} | |
| curl -X POST https://ibrahimdaud-text-embding-model.hf.space/embed \ | |
| -H "Content-Type: application/json" \ | |
| -H "X-API-Key: $EMBEDDER_API_KEY" \ | |
| -d '{"texts": ["What is a quadratic?", "Define discriminant."]}' | jq .model | |
| # "all-MiniLM-L6-v2" | |
| ``` | |
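The same `/embed` call can be made from Python with only the standard library. The sketch below is one possible client, not the platform's actual code: the function names `build_embed_request` and `embed` and the 60-second timeout are assumptions, while the URL, header, and JSON shapes come from the API table above.

```python
import json
import urllib.request

BASE_URL = "https://ibrahimdaud-text-embding-model.hf.space"


def build_embed_request(texts, api_key, base_url=BASE_URL):
    """Build (but do not send) a POST /embed request."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/embed",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def embed(texts, api_key, base_url=BASE_URL, timeout=60):
    """Send the request and return the list of embedding vectors."""
    req = build_embed_request(texts, api_key, base_url)
    # The timeout is an assumed value, chosen to outlast a cold start.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["embeddings"]


# Usage (requires network access and a valid key):
#   import os
#   vectors = embed(["What is a quadratic?"], os.environ["EMBEDDER_API_KEY"])
```

Splitting request construction from sending keeps the first half testable without network access.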
| ## Local development | |
| ```bash | |
| python -m venv .venv | |
| source .venv/bin/activate # Linux / macOS | |
| # .venv\Scripts\activate # Windows | |
| pip install -r requirements.txt | |
| cp .env.example .env # then set EMBEDDER_API_KEY | |
| uvicorn app:app --reload --port 7860 | |
| # http://127.0.0.1:7860/health | |
| # http://127.0.0.1:7860/docs (Swagger UI) | |
| ``` | |
| ## Docker (mirrors what HF Spaces does) | |
| ```bash | |
| docker build -t eduai-embedder . | |
| docker run --rm -p 7860:7860 \ | |
| -e EMBEDDER_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')" \ | |
| eduai-embedder | |
| ``` | |
| ## Configuring the Space | |
| 1. **Add the secret.** Space → Settings → Variables and secrets → | |
| *New secret* → name `EMBEDDER_API_KEY`, value = a URL-safe token built from 32 random bytes (about 43 characters): | |
| ```bash | |
| python -c "import secrets; print(secrets.token_urlsafe(32))" | |
| ``` | |
| Save the same value into every team member's `eduai_platform/.env` as | |
| `EMBEDDING_API_KEY`. | |
| 2. **Push from this folder:** | |
| ```bash | |
| git add . | |
| git commit -m "deploy embedding service" | |
| git push origin main | |
| ``` | |
| First push: ~5 min (Docker build + model download). Subsequent pushes | |
| only rebuild if `requirements.txt` or `Dockerfile` change. | |
| 3. **Watch the build.** Space dashboard → Logs tab. You should see: | |
| ``` | |
| eduai-embedder INFO Loading sentence-transformers model: all-MiniLM-L6-v2 ... | |
| eduai-embedder INFO Model loaded (dim=384, ...) | |
| INFO Application startup complete. | |
| ``` | |
| 4. **Wire it into eduai_platform.** Add to `eduai_platform/.env`: | |
| ``` | |
| EMBEDDING_PROVIDER=remote | |
| EMBEDDING_API_URL=https://ibrahimdaud-text-embding-model.hf.space | |
| EMBEDDING_API_KEY=<same value as Space secret> | |
| ``` | |
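On the client side, the three settings from step 4 could be read and validated at startup. This is a hypothetical sketch: `load_embedding_settings` and the returned dictionary keys are invented for illustration, and how eduai_platform actually consumes these variables is not shown in this README.

```python
import os

# Variable names taken from step 4 above.
REQUIRED = ("EMBEDDING_PROVIDER", "EMBEDDING_API_URL", "EMBEDDING_API_KEY")


def load_embedding_settings(env=None):
    """Return the embedding settings, failing fast if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return {
        "provider": env["EMBEDDING_PROVIDER"],
        # Strip a trailing slash so paths like "/embed" can be appended safely.
        "api_url": env["EMBEDDING_API_URL"].rstrip("/"),
        "api_key": env["EMBEDDING_API_KEY"],
    }
```

Passing a plain dictionary as `env` makes the function easy to exercise without touching the real environment.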
| ## Operations | |
| - **Cold starts.** HuggingFace Spaces puts free CPU instances to sleep | |
| after inactivity. First request after sleep takes ~30 s. The chat UI's | |
| loading indicator covers this; we may add a weekly GitHub Actions | |
| cron pinging `/health` to keep it warm. | |
| - **Rotating the API key.** Bump the secret in Space settings, then update | |
| every team `.env`. No code change. Old key is invalidated immediately. | |
| - **Switching the model.** Set `EMBEDDER_MODEL_NAME` (Space secret or | |
| Dockerfile `ARG MODEL`) and redeploy. **Important:** if `dim` changes | |
| (e.g. switching to a 768-dim model), every existing embedding in the | |
| vector store must be regenerated. | |
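One way to catch a dimension change early is to compare the `dim` field from `/health` with the dimension the vector store was built with. This is a sketch under assumptions: `check_dim` is a hypothetical helper, and how the stored dimension is obtained depends on the vector store in use.

```python
def check_dim(health, stored_dim):
    """Raise if the service's embedding dimension differs from the store's.

    `health` is the parsed JSON body of GET /health, e.g.
    {"status": "ok", "model": "all-MiniLM-L6-v2", "dim": 384}.
    """
    service_dim = health["dim"]
    if service_dim != stored_dim:
        raise ValueError(
            f"model {health.get('model')!r} returns {service_dim}-dim vectors, "
            f"but the vector store holds {stored_dim}-dim vectors; "
            "regenerate the stored embeddings before querying"
        )
    return service_dim
```

Running this check at client startup turns a silent mismatch into an immediate, readable error.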
| ## Limits | |
| The service rejects: | |
| - batches with more than `EMBEDDER_MAX_BATCH` (default 128) texts → 400 | |
| - any text longer than `EMBEDDER_MAX_TEXT_LEN` (default 8000) chars → 400 | |
| - requests without a valid `X-API-Key` when one is configured → 401 | |
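A client can stay within these limits by validating text length and splitting large inputs before calling `/embed`. The sketch below assumes the default limits listed above; `make_batches` is a hypothetical helper, and a deployment that overrides the defaults would need matching values.

```python
MAX_BATCH = 128       # mirrors the EMBEDDER_MAX_BATCH default
MAX_TEXT_LEN = 8000   # mirrors the EMBEDDER_MAX_TEXT_LEN default


def make_batches(texts, max_batch=MAX_BATCH, max_text_len=MAX_TEXT_LEN):
    """Split texts into lists of at most max_batch items.

    Raises ValueError up front if any text is longer than max_text_len
    characters, since the service would reject that request with a 400.
    """
    too_long = [i for i, text in enumerate(texts) if len(text) > max_text_len]
    if too_long:
        raise ValueError(
            f"texts at indexes {too_long} exceed {max_text_len} characters"
        )
    return [texts[i:i + max_batch] for i in range(0, len(texts), max_batch)]
```

Each returned batch can then be sent as one `{texts: [...]}` body, and the resulting embeddings concatenated in order.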
| ## License | |
| Apache 2.0 (matches the Space metadata above and the model license). | |