ibrahimdaud

feat: FastAPI embedding service for eduai_platform

fbbd988 20 days ago

	---
	title: Text Embedding Model
	emoji: 🏢
	colorFrom: pink
	colorTo: red
	sdk: docker
	app_port: 7860
	pinned: false
	license: apache-2.0
	short_description: 'This is the Embedding model for the demo application'
	---

	# eduai-embedder (text-embding-model Space)

	Tiny FastAPI service that wraps `sentence-transformers/all-MiniLM-L6-v2`
	(384-dim, free, CPU) behind four HTTP routes (two status checks and two
	embedding endpoints). Deployed on this
	HuggingFace Docker Space so the [eduai_platform](https://github.com/)
	team doesn't have to install `torch` locally.

	## Why this exists

	Installing `torch` + `sentence-transformers` reliably on Windows + Conda
	is a daily blocker. Moving embeddings into a single shared service means:

	- New contributors clone the platform repo with no ML deps.
	- The model is loaded once, in one place, by one container.
	- We can swap to a stronger model (or a hosted provider) without touching
	  any client code.

	## API

	| Method | Path | Auth | Body | Response |
	|---|---|---|---|---|
	| `GET` | `/` | open | none | `{status, model, dim}` |
	| `GET` | `/health` | open | none | `{status, model, dim}` |
	| `POST` | `/embed` | `X-API-Key` | `{texts: [str]}` | `{embeddings: [[float]], model, dim}` |
	| `POST` | `/embed_one` | `X-API-Key` | `{text: str}` | `{embedding: [float], model, dim}` |

	Vectors are L2-normalized so cosine similarity is just a dot product.
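
To make the normalization claim concrete, here is a minimal pure-Python sketch. The helper names and the toy 3-dim vectors are illustrative assumptions, not part of the service; real responses carry 384-dim vectors that are already normalized.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed the long way, for comparison."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for raw (unnormalized) embeddings.
a = [3.0, 4.0, 0.0]
b = [4.0, 3.0, 0.0]
ua, ub = l2_normalize(a), l2_normalize(b)

# After normalization, the dot product alone equals the cosine similarity.
assert math.isclose(dot(ua, ub), cosine(a, b))
```

Because the service returns unit-length vectors, a client can skip the two square roots and the division and rank results by `dot` directly.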

	### Example

	Once the Space is live at `https://ibrahimdaud-text-embding-model.hf.space`:

	```bash
	curl https://ibrahimdaud-text-embding-model.hf.space/health
	# {"status":"ok","model":"all-MiniLM-L6-v2","dim":384}

	curl -X POST https://ibrahimdaud-text-embding-model.hf.space/embed \
	  -H "Content-Type: application/json" \
	  -H "X-API-Key: $EMBEDDER_API_KEY" \
	  -d '{"texts": ["What is a quadratic?", "Define discriminant."]}' | jq .model
	# "all-MiniLM-L6-v2"
	```
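
The same `/embed` call can be made from Python with only the standard library. This is a sketch, not the platform's actual client: the function names `build_embed_request` and `embed` and the 60-second timeout are assumptions chosen for illustration, while the path, header, and body shape come from the API table above.

```python
import json
import urllib.request


def build_embed_request(base_url, api_key, texts):
    """Build (but do not send) a POST /embed request."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embed",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def embed(base_url, api_key, texts, timeout=60.0):
    """Send the request and return the list of embedding vectors."""
    req = build_embed_request(base_url, api_key, texts)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["embeddings"]
```

Splitting request construction from sending keeps the payload logic testable without network access.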

	## Local development

	```bash
	python -m venv .venv
	source .venv/bin/activate # Linux / macOS
	# .venv\Scripts\activate # Windows
	pip install -r requirements.txt
	cp .env.example .env # then set EMBEDDER_API_KEY

	uvicorn app:app --reload --port 7860
	# http://127.0.0.1:7860/health
	# http://127.0.0.1:7860/docs (Swagger UI)
	```

	## Docker (mirrors what HF Spaces does)

	```bash
	docker build -t eduai-embedder .
	# Generate the key first so you can reuse it in client requests.
	export EMBEDDER_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
	docker run --rm -p 7860:7860 \
	  -e EMBEDDER_API_KEY \
	  eduai-embedder
	```

	## Configuring the Space

	1. Add the secret. Space → Settings → Variables and secrets →
	   New secret → name `EMBEDDER_API_KEY`, value = a URL-safe token built
	   from 32 random bytes (about 43 characters):
	```bash
	python -c "import secrets; print(secrets.token_urlsafe(32))"
	```
	   Save the same value into every team member's `eduai_platform/.env` as
	   `EMBEDDING_API_KEY` (the client-side name differs from the Space
	   secret `EMBEDDER_API_KEY`).

	2. Push from this folder:
	```bash
	git add .
	git commit -m "deploy embedding service"
	git push origin main
	```
	First push: ~5 min (Docker build + model download). Subsequent pushes
	only rebuild if `requirements.txt` or `Dockerfile` change.

	3. Watch the build. Space dashboard → Logs tab. You should see:
	```
	eduai-embedder INFO Loading sentence-transformers model: all-MiniLM-L6-v2 ...
	eduai-embedder INFO Model loaded (dim=384, ...)
	INFO Application startup complete.
	```

	4. Wire it into eduai_platform. Add to `eduai_platform/.env`:
	```
	EMBEDDING_PROVIDER=remote
	EMBEDDING_API_URL=https://ibrahimdaud-text-embding-model.hf.space
	EMBEDDING_API_KEY=<same value as Space secret>
	```
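
On the client side, those three variables could be read into one settings object. The sketch below is hypothetical: `EmbeddingSettings` and `load_embedding_settings` are illustrative names, and it assumes the `.env` values have already been exported into the process environment (for example by a dotenv loader), which this README does not specify.

```python
import os
from dataclasses import dataclass

# Variable names taken from the .env snippet above.
REQUIRED = ("EMBEDDING_PROVIDER", "EMBEDDING_API_URL", "EMBEDDING_API_KEY")


@dataclass(frozen=True)
class EmbeddingSettings:
    provider: str
    api_url: str
    api_key: str


def load_embedding_settings(env=None):
    """Read the embedding settings, failing loudly if any value is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing embedding settings: " + ", ".join(missing))
    return EmbeddingSettings(
        provider=env["EMBEDDING_PROVIDER"],
        # Drop a trailing slash so paths like "/embed" can be appended safely.
        api_url=env["EMBEDDING_API_URL"].rstrip("/"),
        api_key=env["EMBEDDING_API_KEY"],
    )
```

Failing at startup on a missing key tends to be easier to debug than a 401 on the first embedding request.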

	## Operations

	- Cold starts. HuggingFace Spaces puts free CPU instances to sleep
	after inactivity. First request after sleep takes ~30 s. The chat UI's
	loading indicator covers this; we may add a weekly GitHub Actions
	cron pinging `/health` to keep it warm.
	- Rotating the API key. Replace the secret in Space settings, then update
	  every team member's `.env`. No code change. Old key is invalidated immediately.
	- Switching the model. Set `EMBEDDER_MODEL_NAME` (Space secret or
	Dockerfile `ARG MODEL`) and redeploy. Important: if `dim` changes
	(e.g. switching to a 768-dim model), every existing embedding in the
	vector store must be regenerated.
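
For the cold-start case, a client can also retry instead of failing on the first slow response. The helper below is a sketch under assumptions: `call_with_retry`, the attempt count, and the delays are illustrative, and catching every `Exception` is deliberately broad (a real client would likely narrow it to timeouts and connection errors).

```python
import time


def call_with_retry(fn, attempts=4, first_delay=2.0, sleep=time.sleep):
    """Call fn(); on an exception, wait and retry with doubling delays."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            sleep(delay)
            delay *= 2


# Demo with a fake call that fails twice (as a waking Space might) and then succeeds.
calls = {"n": 0}
waits = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("space still waking up")
    return "ok"


result = call_with_retry(flaky, sleep=waits.append)
```

Passing `sleep` as a parameter lets the demo record the waits instead of actually pausing; in production the default `time.sleep` applies.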

	## Limits

	The service rejects:
	- batches with more than `EMBEDDER_MAX_BATCH` (default 128) texts → 400
	- any text longer than `EMBEDDER_MAX_TEXT_LEN` (default 8000) chars → 400
	- requests without a valid `X-API-Key` when one is configured → 401
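
Clients can stay under these limits by checking text length and splitting large inputs before calling `/embed`. The function below is a hypothetical sketch: `chunk_texts` is not part of the service, it uses the documented defaults, and it assumes the server counts characters the same way Python's `len` does.

```python
def chunk_texts(texts, max_batch=128, max_text_len=8000):
    """Split texts into batches no larger than max_batch.

    Raises ValueError up front if any single text exceeds max_text_len,
    since the service would reject that request with a 400 anyway.
    """
    for i, text in enumerate(texts):
        if len(text) > max_text_len:
            raise ValueError(
                f"text {i} has {len(text)} chars, limit is {max_text_len}"
            )
    return [texts[i:i + max_batch] for i in range(0, len(texts), max_batch)]


# 300 short texts are split into batches of 128, 128, and 44.
batches = chunk_texts(["x"] * 300)
sizes = [len(batch) for batch in batches]
```

If the Space overrides `EMBEDDER_MAX_BATCH` or `EMBEDDER_MAX_TEXT_LEN`, pass matching values so the client and server agree.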

	## License

	Apache 2.0 (matches the Space metadata above and the model license).