Embedding / README.md

LiamKhoaLe

Init commit

ddb9445 3 months ago

metadata

title: Embedding
emoji: 🐠
colorFrom: purple
colorTo: gray
sdk: docker
pinned: false
short_description: Simple API run sentence-transformers/all-MiniLM-L6-v2

Embedder Service (HuggingFace Space)

A lightweight microservice exposing sentence-transformers embeddings over HTTP.

Model: sentence-transformers/all-MiniLM-L6-v2
Sequential queueing: handles one request at a time to avoid resource spikes.

Endpoints

GET /health → { ok: true, model: string, loaded: boolean }
POST /embed
Request:

{
  "texts": ["hello world", "another document"]
}

Response:

{
  "vectors": [[0.01, -0.02, ...], [0.03, -0.01, ...]],
  "model": "sentence-transformers/all-MiniLM-L6-v2"
}
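A client may want to check the response shape before using the vectors. The following is a minimal sketch, not part of the service: `parse_embed_response` is a hypothetical helper name, and it assumes the response carries the `vectors` field shown above with one vector per input text.

```python
import json


def parse_embed_response(body: str, expected_count: int) -> list[list[float]]:
    """Parse a POST /embed response body and sanity-check its shape.

    Hypothetical helper: assumes one vector per input text.
    """
    payload = json.loads(body)
    vectors = payload["vectors"]
    if len(vectors) != expected_count:
        raise ValueError(f"expected {expected_count} vectors, got {len(vectors)}")
    # All embeddings from one model should share the same dimensionality.
    dims = {len(v) for v in vectors}
    if len(dims) > 1:
        raise ValueError(f"inconsistent vector dimensions: {sorted(dims)}")
    return vectors
```

For all-MiniLM-L6-v2, each vector should have 384 dimensions.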

Deploy on HF Spaces

Create a new Space (Docker type)
Upload app.py, Dockerfile, requirements.txt
Set Space hardware to CPU (Small is fine)
The Space expects the app to listen on port 7860 by default

Example cURL

curl -s -X POST https://binkhoale1812-embedding.hf.space/embed \
  -H 'Content-Type: application/json' \
  -d '{"texts": ["An embedding request", "Second input"]}' | jq .
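The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the helper names `build_embed_request` and `embed` are hypothetical, and the 60-second timeout is an arbitrary choice meant to leave room for a slow first request.

```python
import json
import urllib.request

# URL taken from the cURL example above.
EMBED_URL = "https://binkhoale1812-embedding.hf.space/embed"


def build_embed_request(texts: list[str], url: str = EMBED_URL) -> urllib.request.Request:
    """Build the same POST request that the cURL example sends."""
    data = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts: list[str], url: str = EMBED_URL, timeout: float = 60.0) -> list[list[float]]:
    """Send the request and return the "vectors" field of the JSON response."""
    # The timeout value is an assumption, not something the service documents.
    with urllib.request.urlopen(build_embed_request(texts, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["vectors"]
```

Calling `embed(["An embedding request", "Second input"])` should return two vectors, one per input text, provided the Space is awake and reachable.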

Notes

The service loads the model lazily on the first request, so that request takes longer than later ones.
When several clients call the service at once, a semaphore serializes the requests to limit memory and CPU spikes.
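The two notes above can be illustrated with a small asyncio sketch. It is not the code in app.py: `LazySerializedEmbedder` and `load_model` are hypothetical names, and the loader is assumed to return a blocking callable that maps a list of texts to a list of vectors (with sentence-transformers, that could be the `encode` method of a `SentenceTransformer` instance).

```python
import asyncio


class LazySerializedEmbedder:
    """Illustrative stand-in: loads a model on first use and lets only one
    embedding call run at a time."""

    def __init__(self, load_model):
        self._load_model = load_model            # hypothetical zero-argument loader
        self._encode = None                      # stays None until the first request
        self._semaphore = asyncio.Semaphore(1)   # one request at a time

    @property
    def loaded(self) -> bool:
        return self._encode is not None

    async def embed(self, texts):
        async with self._semaphore:
            if self._encode is None:
                # Lazy load: runs once, inside the semaphore, so concurrent
                # first requests cannot trigger two loads.
                self._encode = await asyncio.to_thread(self._load_model)
            # Run the blocking encode call off the event loop.
            return await asyncio.to_thread(self._encode, texts)
```

Loading inside the semaphore keeps the design simple: the first caller pays the load cost, and every later caller waits its turn and reuses the already loaded model. Creating the instance inside a running event loop avoids loop-binding issues on older Python versions; `asyncio.to_thread` requires Python 3.9 or later.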