title: Text Embedding Model
emoji: π’
colorFrom: pink
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: 'This is the embedding model for the demo application'
eduai-embedder (text-embding-model Space)
Tiny FastAPI service that wraps sentence-transformers/all-MiniLM-L6-v2
(384-dim, free, CPU) behind four HTTP routes. Deployed on this
HuggingFace Docker Space so the eduai_platform
team doesn't have to install torch locally.
Why this exists
Installing torch + sentence-transformers reliably on Windows + Conda
is a daily blocker. Moving embeddings into a single shared service means:
- New contributors clone the platform repo with no ML deps.
- The model is loaded once, in one place, by one container.
- We can swap to a stronger model (or hosted provider) without touching any client code.
API
| Method | Path | Auth | Body | Response |
|---|---|---|---|---|
| GET | / | open | - | {status, model, dim} |
| GET | /health | open | - | {status, model, dim} |
| POST | /embed | X-API-Key | {texts: [str]} | {embeddings: [[float]], model, dim} |
| POST | /embed_one | X-API-Key | {text: str} | {embedding: [float], model, dim} |
Vectors are L2-normalized so cosine similarity is just a dot product.
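Because the vectors have unit length, the denominator of the cosine formula is 1 and only the dot product remains. A minimal sketch with made-up 3-dimensional vectors (real responses carry 384 floats); the helper names are illustrative and not part of the service:

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length, as the service is described as doing."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Full cosine similarity, valid for vectors of any length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy vectors standing in for two embeddings (hypothetical values).
u = l2_normalize([1.0, 2.0, 2.0])
v = l2_normalize([2.0, 1.0, 2.0])

# For unit vectors both computations agree up to floating-point error.
assert math.isclose(dot(u, v), cosine(u, v))
```

On the client side this means ranking candidates needs only one dot product per pair, with no extra normalization step.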
Example
Once the Space is live at https://ibrahimdaud-text-embding-model.hf.space:
curl https://ibrahimdaud-text-embding-model.hf.space/health
# {"status":"ok","model":"all-MiniLM-L6-v2","dim":384}
curl -X POST https://ibrahimdaud-text-embding-model.hf.space/embed \
-H "Content-Type: application/json" \
-H "X-API-Key: $EMBEDDER_API_KEY" \
-d '{"texts": ["What is a quadratic?", "Define discriminant."]}' | jq .model
# "all-MiniLM-L6-v2"
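The same call can be made from Python using only the standard library. This is a sketch under the assumptions of the table above (POST /embed, an X-API-Key header, a JSON body with texts, a response with embeddings); the function names are hypothetical, and the 60-second timeout is an arbitrary choice meant to leave room for a sleeping Space to wake up:

```python
import json
import urllib.request

# Base URL taken from the curl examples.
BASE_URL = "https://ibrahimdaud-text-embding-model.hf.space"

def build_embed_request(texts, api_key, base_url=BASE_URL):
    """Build the POST /embed request without sending it."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )

def embed(texts, api_key, timeout=60):
    """Send the request and return the list of embedding vectors."""
    req = build_embed_request(texts, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["embeddings"]

# Usage (needs network access and a valid key):
#   import os
#   vectors = embed(["What is a quadratic?"], os.environ["EMBEDDER_API_KEY"])
#   print(len(vectors[0]))
```

Splitting request construction from sending keeps the first half testable without touching the network.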
Local development
python -m venv .venv
source .venv/bin/activate # Linux / macOS
# .venv\Scripts\activate # Windows
pip install -r requirements.txt
cp .env.example .env # then set EMBEDDER_API_KEY
uvicorn app:app --reload --port 7860
# http://127.0.0.1:7860/health
# http://127.0.0.1:7860/docs (Swagger UI)
Docker (mirrors what HF Spaces does)
docker build -t eduai-embedder .
docker run --rm -p 7860:7860 \
-e EMBEDDER_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')" \
eduai-embedder
Configuring the Space
1. Add the secret. Space → Settings → Variables and secrets → New secret → name
   EMBEDDER_API_KEY, value = a URL-safe token generated from 32 random bytes (about 43 characters):

       python -c "import secrets; print(secrets.token_urlsafe(32))"

   Save the same value into every team member's eduai_platform/.env as EMBEDDING_API_KEY.

2. Push from this folder:

       git add .
       git commit -m "deploy embedding service"
       git push origin main

   First push: ~5 min (Docker build + model download). Subsequent pushes only rebuild if
   requirements.txt or Dockerfile change.

3. Watch the build. Space dashboard → Logs tab. You should see:

       eduai-embedder INFO Loading sentence-transformers model: all-MiniLM-L6-v2 ...
       eduai-embedder INFO Model loaded (dim=384, ...)
       INFO Application startup complete.

4. Wire it into eduai_platform. Add to eduai_platform/.env:

       EMBEDDING_PROVIDER=remote
       EMBEDDING_API_URL=https://ibrahimdaud-text-embding-model.hf.space
       EMBEDDING_API_KEY=<same value as Space secret>
Operations
- Cold starts. HuggingFace Spaces puts free CPU instances to sleep
  after inactivity. First request after sleep takes ~30 s. The chat UI's
  loading indicator covers this; we may add a weekly GitHub Actions
  cron pinging /health to keep it warm.
- Rotating the API key. Bump the secret in Space settings, then update
  every team .env. No code change. Old key is invalidated immediately.
- Switching the model. Set EMBEDDER_MODEL_NAME (Space secret or Dockerfile ARG MODEL)
  and redeploy. Important: if dim changes (e.g. switching to a 768-dim model),
  every existing embedding in the vector store must be regenerated.
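The dimension rule for model switches can be checked mechanically before any new vectors are written. A minimal sketch, assuming the client keeps track of the dimension its vector store was built with and has already parsed the JSON returned by GET /health; needs_reindex is a hypothetical helper, not part of the service:

```python
def needs_reindex(stored_dim, health_payload):
    """Return True when the service's dim differs from the vector store's.

    health_payload is the parsed JSON from GET /health, for example
    {"status": "ok", "model": "all-MiniLM-L6-v2", "dim": 384}.
    """
    return int(health_payload["dim"]) != int(stored_dim)

current = {"status": "ok", "model": "all-MiniLM-L6-v2", "dim": 384}
# Hypothetical payload after switching to some 768-dim model.
swapped = {"status": "ok", "model": "some-768-dim-model", "dim": 768}

print(needs_reindex(384, current))  # False
print(needs_reindex(384, swapped))  # True
```

A matching dimension does not by itself make old vectors comparable with a different model's output, so comparing the model field as well is a cautious extra check.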
Limits
The service rejects:
- batches with more than EMBEDDER_MAX_BATCH (default 128) texts → 400
- any text longer than EMBEDDER_MAX_TEXT_LEN (default 8000) chars → 400
- requests without a valid X-API-Key when one is configured → 401
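A client can stay inside these limits by chunking its input before calling /embed. The sketch below assumes the documented defaults (128 texts, 8000 characters) are in effect on the deployed Space; make_batches is an illustrative helper, and raising on over-long text rather than truncating is a design choice of this example, not behavior of the service:

```python
MAX_BATCH = 128      # documented default of EMBEDDER_MAX_BATCH
MAX_TEXT_LEN = 8000  # documented default of EMBEDDER_MAX_TEXT_LEN

def make_batches(texts, max_batch=MAX_BATCH, max_text_len=MAX_TEXT_LEN):
    """Split texts into lists of at most max_batch items.

    Raises ValueError for any text over max_text_len characters instead of
    silently truncating it.
    """
    texts = list(texts)
    for i, text in enumerate(texts):
        if len(text) > max_text_len:
            raise ValueError(
                f"text {i} has {len(text)} chars (limit {max_text_len})"
            )
    return [texts[i:i + max_batch] for i in range(0, len(texts), max_batch)]

batches = make_batches([f"question {n}" for n in range(300)])
print([len(b) for b in batches])  # [128, 128, 44]
```

Each resulting batch can then be sent as one request body, so a 400 from the service points to a changed server-side limit rather than a client bug.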
License
Apache 2.0 (matches the Space metadata above and the model license).