---
title: Text Embedding Model
emoji: 🏒
colorFrom: pink
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: 'Embedding model for the demo application'
---

# eduai-embedder (text-embding-model Space)

Tiny FastAPI service that wraps `sentence-transformers/all-MiniLM-L6-v2`
(384-dim, free, CPU) behind four HTTP routes. Deployed on this
HuggingFace Docker Space so the [eduai_platform](https://github.com/)
team doesn't have to install `torch` locally.

## Why this exists

Installing `torch` + `sentence-transformers` reliably on Windows + Conda
is a daily blocker. Moving embeddings into a single shared service means:

- New contributors clone the platform repo with **no ML deps**.
- The model is loaded **once**, in one place, by one container.
- We can swap to a stronger model (or hosted provider) without touching
  any client code.

## API

| Method | Path | Auth | Body | Response |
|---|---|---|---|---|
| `GET` | `/` | open | none | `{status, model, dim}` |
| `GET` | `/health` | open | none | `{status, model, dim}` |
| `POST` | `/embed` | `X-API-Key` | `{texts: [str]}` | `{embeddings: [[float]], model, dim}` |
| `POST` | `/embed_one` | `X-API-Key` | `{text: str}` | `{embedding: [float], model, dim}` |

Vectors are L2-normalized so cosine similarity is just a dot product.
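
To make the normalization claim concrete, here is a small sketch in plain Python. The toy 3-dim vectors and the helper names (`l2_normalize`, `dot`, `cosine`) are illustrative only and are not part of the service; real responses carry 384-dim vectors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, as the service is described as doing."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Textbook cosine similarity, kept here only for comparison."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for two embeddings returned by the service.
u = l2_normalize([1.0, 2.0, 2.0])
v = l2_normalize([2.0, 0.0, 1.0])

# For unit vectors both norms in the denominator are 1,
# so the dot product and the cosine similarity agree.
assert math.isclose(dot(u, v), cosine(u, v))
```

In practice this means a client can rank results with a dot product alone and skip the per-query norm computation.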

### Example

Once the Space is live at `https://ibrahimdaud-text-embding-model.hf.space`:

```bash
curl https://ibrahimdaud-text-embding-model.hf.space/health
# {"status":"ok","model":"all-MiniLM-L6-v2","dim":384}

curl -X POST https://ibrahimdaud-text-embding-model.hf.space/embed \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EMBEDDER_API_KEY" \
  -d '{"texts": ["What is a quadratic?", "Define discriminant."]}' | jq .model
# "all-MiniLM-L6-v2"
```
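
The same call can be made from Python without any ML dependencies. The sketch below uses only the standard library; the function names `build_embed_request` and `embed` are hypothetical helpers, not part of the service or the platform, and the response shape is taken from the API table above.

```python
import json
import urllib.request

BASE_URL = "https://ibrahimdaud-text-embding-model.hf.space"


def build_embed_request(texts, api_key, base_url=BASE_URL):
    """Build (but do not send) a POST /embed request."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/embed",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


def embed(texts, api_key, base_url=BASE_URL, timeout=60.0):
    """Send the request and return the list of embedding vectors."""
    req = build_embed_request(texts, api_key, base_url)
    # The generous default timeout leaves room for a cold start.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["embeddings"]
```

Calling `embed(["What is a quadratic?"], os.environ["EMBEDDER_API_KEY"])` should return one vector per input text, assuming the Space is awake and the key matches the Space secret.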

## Local development

```bash
python -m venv .venv
source .venv/bin/activate         # Linux / macOS
# .venv\Scripts\activate          # Windows
pip install -r requirements.txt
cp .env.example .env              # then set EMBEDDER_API_KEY

uvicorn app:app --reload --port 7860
# http://127.0.0.1:7860/health
# http://127.0.0.1:7860/docs       (Swagger UI)
```

## Docker (mirrors what HF Spaces does)

```bash
docker build -t eduai-embedder .
docker run --rm -p 7860:7860 \
  -e EMBEDDER_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')" \
  eduai-embedder
```

## Configuring the Space

1. **Add the secret.** Space → Settings → Variables and secrets →
   *New secret* → name `EMBEDDER_API_KEY`, value = a URL-safe token built from 32 random bytes (about 43 characters):
   ```bash
   python -c "import secrets; print(secrets.token_urlsafe(32))"
   ```
   Save the same value into every team member's `eduai_platform/.env` as
   `EMBEDDING_API_KEY`.

2. **Push from this folder:**
   ```bash
   git add .
   git commit -m "deploy embedding service"
   git push origin main
   ```
   First push: ~5 min (Docker build + model download). Subsequent pushes
   only rebuild if `requirements.txt` or `Dockerfile` change.

3. **Watch the build.** Space dashboard → Logs tab. You should see:
   ```
   eduai-embedder INFO Loading sentence-transformers model: all-MiniLM-L6-v2 ...
   eduai-embedder INFO Model loaded (dim=384, ...)
   INFO     Application startup complete.
   ```

4. **Wire it into eduai_platform.** Add to `eduai_platform/.env`:
   ```
   EMBEDDING_PROVIDER=remote
   EMBEDDING_API_URL=https://ibrahimdaud-text-embding-model.hf.space
   EMBEDDING_API_KEY=<same value as Space secret>
   ```
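
How `eduai_platform` actually reads these variables is not shown here. As a rough sketch, a loader might validate them once at startup so a missing key fails fast instead of surfacing as a 401 later. The `EmbeddingConfig` class and `load_embedding_config` function are hypothetical, and the sketch assumes the `.env` values have already been loaded into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical container for the three settings listed above."""
    provider: str
    api_url: str
    api_key: str


def load_embedding_config(env: Mapping[str, str] = os.environ) -> EmbeddingConfig:
    """Read the EMBEDDING_* settings and fail fast if remote mode is incomplete."""
    provider = env.get("EMBEDDING_PROVIDER", "").strip()
    api_url = env.get("EMBEDDING_API_URL", "").strip().rstrip("/")
    api_key = env.get("EMBEDDING_API_KEY", "").strip()

    if provider == "remote":
        required = (("EMBEDDING_API_URL", api_url), ("EMBEDDING_API_KEY", api_key))
        missing = [name for name, value in required if not value]
        if missing:
            raise ValueError(f"remote embeddings need: {', '.join(missing)}")

    return EmbeddingConfig(provider=provider, api_url=api_url, api_key=api_key)
```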

## Operations

- **Cold starts.** HuggingFace Spaces puts free CPU instances to sleep
  after inactivity. First request after sleep takes ~30 s. The chat UI's
  loading indicator covers this; we may add a scheduled GitHub Actions
  cron pinging `/health` to keep it warm. The schedule has to be shorter
  than the Space's sleep timeout, so a weekly ping is likely too sparse.
- **Rotating the API key.** Bump the secret in Space settings, then update
  every team `.env`. No code change. Old key is invalidated immediately.
- **Switching the model.** Set `EMBEDDER_MODEL_NAME` (Space secret or
  Dockerfile `ARG MODEL`) and redeploy. **Important:** if `dim` changes
  (e.g. switching to a 768-dim model), every existing embedding in the
  vector store must be regenerated.
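
Two of the points above (cold starts and `dim` changes) can be guarded on the client side. The sketch below polls `/health` until the Space answers, then checks the reported `dim` against what the vector store expects. The names `fetch_health` and `wait_until_ready`, the retry count, and the delay are assumptions chosen for illustration, not settings from the service.

```python
import json
import time
import urllib.request


def fetch_health(base_url, timeout=10.0):
    """GET /health and return the decoded JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def wait_until_ready(probe, expected_dim, attempts=6, delay=10.0, sleep=time.sleep):
    """Call probe() until it succeeds, then verify the embedding dimension.

    probe is any zero-argument callable returning the /health payload,
    e.g. lambda: fetch_health(base_url).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            health = probe()
        except OSError as exc:  # URLError and socket timeouts subclass OSError
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay)
            continue
        if health.get("dim") != expected_dim:
            # A different dim means stored vectors are no longer comparable.
            raise RuntimeError(
                f"service reports dim={health.get('dim')}, expected {expected_dim}"
            )
        return health
    raise TimeoutError(f"service not ready after {attempts} attempts") from last_error
```

Passing the probe in as a callable keeps the retry logic testable without network access.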

## Limits

The service rejects:
- batches with more than `EMBEDDER_MAX_BATCH` (default 128) texts → 400
- any text longer than `EMBEDDER_MAX_TEXT_LEN` (default 8000) chars → 400
- requests without a valid `X-API-Key` when one is configured → 401
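
A client can avoid the two 400 cases by checking text lengths and splitting large inputs before sending. This is a minimal sketch that assumes the server is running with the default limits and counts characters the way Python's `len()` does; `iter_batches` is a hypothetical helper.

```python
from typing import Iterator, List, Sequence

MAX_BATCH = 128       # mirrors the EMBEDDER_MAX_BATCH default
MAX_TEXT_LEN = 8000   # mirrors the EMBEDDER_MAX_TEXT_LEN default


def iter_batches(
    texts: Sequence[str],
    max_batch: int = MAX_BATCH,
    max_text_len: int = MAX_TEXT_LEN,
) -> Iterator[List[str]]:
    """Yield lists of at most max_batch texts, rejecting over-long texts.

    This is a generator, so the length check runs when iteration starts.
    """
    for index, text in enumerate(texts):
        if len(text) > max_text_len:
            raise ValueError(
                f"text {index} has {len(text)} chars, limit is {max_text_len}"
            )
    for start in range(0, len(texts), max_batch):
        yield list(texts[start:start + max_batch])
```

Each yielded batch can then be sent as one `POST /embed` body, and the resulting vectors concatenated in order.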

## License

Apache 2.0 (matches the Space metadata above and the model license).