---
title: Embedding
emoji: 🐠
colorFrom: purple
colorTo: gray
sdk: docker
pinned: false
short_description: Simple API running sentence-transformers/all-MiniLM-L6-v2
---

# Embedder Service (HuggingFace Space)

A lightweight microservice exposing sentence-transformers embeddings over HTTP.

- Model: `sentence-transformers/all-MiniLM-L6-v2`
- Sequential queueing: handles one request at a time to avoid resource spikes.

## Endpoints

- `GET /health` returns `{ ok: true, model: string, loaded: boolean }`
- `POST /embed`
  - Request:

```
{
  "texts": ["hello world", "another document"]
}
```

  - Response:

```
{
  "vectors": [[0.01, -0.02, ...], [0.03, -0.01, ...]],
  "model": "sentence-transformers/all-MiniLM-L6-v2"
}
```

## Deploy on HF Spaces

1. Create a new Space (Docker type)
2. Upload `app.py`, `Dockerfile`, `requirements.txt`
3. Set Space hardware to CPU (Small is fine)
4. The Space listens on port 7860 by default, so the app should bind to that port

## Example cURL

```
curl -s -X POST https://binkhoale1812-embedding.hf.space/embed \
  -H 'Content-Type: application/json' \
  -d '{"texts": ["An embedding request", "Second input"]}' | jq .
```
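
The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names (`build_embed_request`, `parse_embed_response`, `embed`) are illustrative and not part of the service, and the response shape is assumed to match the `/embed` example above.

```python
import json
import urllib.request

# Base URL taken from the cURL example; replace it with your own Space URL.
BASE_URL = "https://binkhoale1812-embedding.hf.space"


def build_embed_request(texts, base_url=BASE_URL):
    """Build a POST /embed request carrying {"texts": [...]} as JSON."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(raw, expected_count):
    """Decode the response body and check that one vector came back per text."""
    payload = json.loads(raw)
    vectors = payload["vectors"]
    if len(vectors) != expected_count:
        raise ValueError(f"expected {expected_count} vectors, got {len(vectors)}")
    return vectors, payload["model"]


def embed(texts, base_url=BASE_URL, timeout=60):
    """Send texts to the service and return (vectors, model_name)."""
    texts = list(texts)
    request = build_embed_request(texts, base_url)
    # A generous timeout helps on the first call, when the model is still loading.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embed_response(response.read(), len(texts))
```

Calling `embed(["An embedding request", "Second input"])` should return two vectors along with the model name. Splitting request building and response parsing into separate functions keeps both testable without network access.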

## Notes

- The service lazily loads the model on first request.
- When several clients call the service at once, a semaphore serializes their requests to limit memory and CPU spikes.
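
The two notes above can be illustrated with a short sketch. This is not the Space's actual `app.py`; it is one possible way to combine lazy loading with a single-permit semaphore, assuming Python 3.9+ and a hypothetical `loader` callable that returns an object with an `encode(texts)` method.

```python
import asyncio


class LazyEmbedder:
    """Loads the model on first use and serves one request at a time."""

    def __init__(self, loader):
        # Hypothetical loader, e.g.
        # lambda: SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
        self._loader = loader
        self._model = None
        # A single permit means requests are processed sequentially.
        self._gate = asyncio.Semaphore(1)

    @property
    def loaded(self):
        """True once the model has been loaded (what /health could report)."""
        return self._model is not None

    async def embed(self, texts):
        async with self._gate:
            if self._model is None:
                # The first request pays the load cost; later ones reuse the model.
                self._model = await asyncio.to_thread(self._loader)
            # Run the blocking encode call in a worker thread so the event
            # loop can keep accepting (and queueing) other requests.
            return await asyncio.to_thread(self._model.encode, texts)
```

Because the load check happens while the semaphore is held, concurrent first requests cannot trigger two loads. Create the `LazyEmbedder` from code running inside the event loop (for example, in an application startup hook) so the semaphore is bound to the right loop on older Python versions.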