---
title: Text Embedding Model
emoji: π’
colorFrom: pink
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: 'This is the embedding model for the demo application'
---
# eduai-embedder (text-embding-model Space)
Tiny FastAPI service that wraps `sentence-transformers/all-MiniLM-L6-v2`
(384-dim, free, CPU) behind four HTTP endpoints. Deployed on this
HuggingFace Docker Space so the [eduai_platform](https://github.com/)
team doesn't have to install `torch` locally.
## Why this exists
Installing `torch` + `sentence-transformers` reliably on Windows + Conda
is a daily blocker. Moving embeddings into a single shared service means:
- New contributors clone the platform repo with **no ML deps**.
- The model is loaded **once**, in one place, by one container.
- We can swap to a stronger model (or hosted provider) without touching
any client code.
## API
| Method | Path | Auth | Body | Response |
|---|---|---|---|---|
| `GET` | `/` | open | none | `{status, model, dim}` |
| `GET` | `/health` | open | none | `{status, model, dim}` |
| `POST` | `/embed` | `X-API-Key` | `{texts: [str]}` | `{embeddings: [[float]], model, dim}` |
| `POST` | `/embed_one` | `X-API-Key` | `{text: str}` | `{embedding: [float], model, dim}` |
Vectors are L2-normalized so cosine similarity is just a dot product.
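The normalization claim can be checked with a short, standard-library-only sketch. The toy 3-dim vectors and the helper names (`l2_normalize`, `dot`, `cosine_similarity`) are illustrative assumptions, not part of the service code; real responses carry 384-dim vectors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Full cosine formula, including the division by both norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for embeddings returned by the service.
a = l2_normalize([1.0, 2.0, 2.0])
b = l2_normalize([2.0, 0.0, 1.0])

# For unit-length vectors both norms are 1, so the division is a no-op.
assert math.isclose(dot(a, b), cosine_similarity(a, b))
```

In practice this means a client can rank results with a single dot product (or one matrix multiply) and skip the per-vector norm computation.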
### Example
Once the Space is live at `https://ibrahimdaud-text-embding-model.hf.space`:
```bash
curl https://ibrahimdaud-text-embding-model.hf.space/health
# {"status":"ok","model":"all-MiniLM-L6-v2","dim":384}
curl -X POST https://ibrahimdaud-text-embding-model.hf.space/embed \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EMBEDDER_API_KEY" \
  -d '{"texts": ["What is a quadratic?", "Define discriminant."]}' | jq .model
# "all-MiniLM-L6-v2"
```
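The same `/embed` call can be made from Python without any ML dependencies. This is a minimal sketch using only the standard library; the function names `build_embed_request` and `embed` and the 60-second timeout are assumptions for illustration, not the platform's actual client.

```python
import json
import urllib.request


def build_embed_request(base_url, api_key, texts):
    """Build (but do not send) a POST /embed request for a list of texts."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embed",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def embed(base_url, api_key, texts, timeout=60.0):
    """Send the request and return the list of embedding vectors."""
    request = build_embed_request(base_url, api_key, texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["embeddings"]
```

A call such as `embed("https://ibrahimdaud-text-embding-model.hf.space", os.environ["EMBEDDER_API_KEY"], ["What is a quadratic?"])` should return one vector per input text, assuming the Space is awake and the key is valid.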
## Local development
```bash
python -m venv .venv
source .venv/bin/activate # Linux / macOS
# .venv\Scripts\activate # Windows
pip install -r requirements.txt
cp .env.example .env # then set EMBEDDER_API_KEY
uvicorn app:app --reload --port 7860
# http://127.0.0.1:7860/health
# http://127.0.0.1:7860/docs (Swagger UI)
```
## Docker (mirrors what HF Spaces does)
```bash
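docker build -t eduai-embedder .
# Generate the key once and keep it in the shell so you can send it as X-API-Key.
export EMBEDDER_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
docker run --rm -p 7860:7860 \
  -e EMBEDDER_API_KEY \
  eduai-embedder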
```
## Configuring the Space
1. **Add the secret.** Space → Settings → Variables and secrets →
*New secret* → name `EMBEDDER_API_KEY`, value = a URL-safe token generated from 32 random bytes (about 43 characters):
```bash
python -c "import secrets; print(secrets.token_urlsafe(32))"
```
Save the same value in every team member's `eduai_platform/.env` as
`EMBEDDING_API_KEY` (note that the client-side variable name differs from the Space secret's name).
2. **Push from this folder:**
```bash
git add .
git commit -m "deploy embedding service"
git push origin main
```
First push: ~5 min (Docker build + model download). Subsequent pushes
only rebuild if `requirements.txt` or `Dockerfile` change.
3. **Watch the build.** Space dashboard → Logs tab. You should see:
```
eduai-embedder INFO Loading sentence-transformers model: all-MiniLM-L6-v2 ...
eduai-embedder INFO Model loaded (dim=384, ...)
INFO Application startup complete.
```
4. **Wire it into eduai_platform.** Add to `eduai_platform/.env`:
```
EMBEDDING_PROVIDER=remote
EMBEDDING_API_URL=https://ibrahimdaud-text-embding-model.hf.space
EMBEDDING_API_KEY=<same value as Space secret>
```
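On the client side, the three variables above could be read and validated at startup along these lines. This is a sketch only: `EmbeddingSettings` and `load_embedding_settings` are hypothetical names, and how eduai_platform actually loads its `.env` is not shown here. The sketch handles only the `remote` provider.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSettings:
    provider: str
    api_url: str
    api_key: str


def load_embedding_settings(env=None):
    """Read the EMBEDDING_* variables and fail early if any are missing."""
    env = os.environ if env is None else env
    provider = env.get("EMBEDDING_PROVIDER", "")
    if provider != "remote":
        # Assumption: other providers would be handled elsewhere.
        raise ValueError(f"unsupported EMBEDDING_PROVIDER: {provider!r}")
    api_url = env.get("EMBEDDING_API_URL", "").rstrip("/")
    api_key = env.get("EMBEDDING_API_KEY", "")
    missing = [
        name
        for name, value in (
            ("EMBEDDING_API_URL", api_url),
            ("EMBEDDING_API_KEY", api_key),
        )
        if not value
    ]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return EmbeddingSettings(provider=provider, api_url=api_url, api_key=api_key)
```

Failing at startup rather than on the first embedding request makes a missing or misnamed key (for example `EMBEDDER_API_KEY` instead of `EMBEDDING_API_KEY`) easier to spot.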
## Operations
- **Cold starts.** HuggingFace Spaces puts free CPU instances to sleep
after inactivity. The first request after sleep takes ~30 s. The chat UI's
loading indicator covers this; we may add a scheduled GitHub Actions
cron that pings `/health` often enough to keep it warm.
- **Rotating the API key.** Replace the secret in the Space settings, then update
every team member's `.env`. No code change is needed. The old key is invalidated immediately.
- **Switching the model.** Set `EMBEDDER_MODEL_NAME` (Space secret or
Dockerfile `ARG MODEL`) and redeploy. **Important:** if `dim` changes
(e.g. switching to a 768-dim model), every existing embedding in the
vector store must be regenerated.
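A client can absorb the cold-start delay by polling `/health` before sending real work. The following is a sketch under stated assumptions: the helper names, the six attempts and the 10-second delay are illustrative choices, not values taken from the platform.

```python
import json
import time
import urllib.request


def health_check(base_url, timeout=10.0):
    """Return a zero-argument function that is True when /health reports ok."""
    def check():
        url = base_url.rstrip("/") + "/health"
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response).get("status") == "ok"
    return check


def wait_until_healthy(check, attempts=6, delay_s=10.0, sleep=time.sleep):
    """Call check() until it returns True; return the number of attempts used.

    Raises TimeoutError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt
        except OSError:
            # Covers connection errors and timeouts (URLError is an OSError).
            pass
        if attempt < attempts:
            sleep(delay_s)
    raise TimeoutError(f"service not healthy after {attempts} attempts")
```

With the defaults, `wait_until_healthy(health_check(base_url))` waits up to about a minute, which should be enough to cover the ~30 s wake-up mentioned above.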
## Limits
The service rejects:
- batches with more than `EMBEDDER_MAX_BATCH` (default 128) texts → 400
- any text longer than `EMBEDDER_MAX_TEXT_LEN` (default 8000) chars → 400
- requests without a valid `X-API-Key` when one is configured → 401
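Callers with large inputs can stay under these limits by validating and batching on the client side. The sketch below hard-codes the documented defaults; `chunk_texts` is a hypothetical helper, and it assumes the server counts length the same way Python's `len()` does on a string.

```python
MAX_BATCH = 128      # mirrors the documented EMBEDDER_MAX_BATCH default
MAX_TEXT_LEN = 8000  # mirrors the documented EMBEDDER_MAX_TEXT_LEN default


def chunk_texts(texts, max_batch=MAX_BATCH, max_text_len=MAX_TEXT_LEN):
    """Check every text's length, then split into batches of at most max_batch."""
    texts = list(texts)
    for index, text in enumerate(texts):
        if len(text) > max_text_len:
            raise ValueError(
                f"text {index} has {len(text)} chars, limit is {max_text_len}"
            )
    return [texts[i:i + max_batch] for i in range(0, len(texts), max_batch)]


# 300 short texts become three requests: 128 + 128 + 44.
batches = chunk_texts([f"question {n}" for n in range(300)])
assert [len(batch) for batch in batches] == [128, 128, 44]
```

Each batch can then be sent as one `POST /embed` request. If the Space overrides the defaults, pass the deployed values as `max_batch` and `max_text_len`.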
## License
Apache 2.0 (matches the Space metadata above and the model license).