title: Embedding
emoji: 🐠
colorFrom: purple
colorTo: gray
sdk: docker
pinned: false
short_description: Simple API serving sentence-transformers/all-MiniLM-L6-v2

Embedder Service (HuggingFace Space)

A lightweight microservice exposing sentence-transformers embeddings over HTTP.

  • Model: sentence-transformers/all-MiniLM-L6-v2
  • Sequential queueing: handles one request at a time to avoid resource spikes.

Endpoints

  • GET /health returns { ok: true, model: string, loaded: boolean }
  • POST /embed
    • Request:
{
  "texts": ["hello world", "another document"]
}
    • Response:
{
  "vectors": [[0.01, -0.02, ...], [0.03, -0.01, ...]],
  "model": "sentence-transformers/all-MiniLM-L6-v2"
}
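
The request and response shapes above can be exercised from Python with only the standard library. This is a minimal client sketch, not part of the service itself: the function names (`build_embed_request`, `parse_embed_response`, `embed`) and the 60-second timeout are illustrative assumptions, and the base URL is the one used in the cURL example in this README.

```python
import json
import urllib.request

# Base URL taken from the cURL example in this README; change it for your own Space.
BASE_URL = "https://binkhoale1812-embedding.hf.space"


def build_embed_request(texts, base_url=BASE_URL):
    """Build a POST /embed request whose JSON body is {"texts": [...]}."""
    payload = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(body):
    """Return (vectors, model) from a JSON response body (bytes or str)."""
    data = json.loads(body)
    return data["vectors"], data["model"]


def embed(texts, base_url=BASE_URL, timeout=60):
    """Send the texts to the service and return (vectors, model).

    The timeout is an assumed value; the first call may be slow because
    the service loads the model lazily.
    """
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embed_response(response.read())
```

Calling `embed(["hello world", "another document"])` should return one vector per input text together with the model name, matching the response shape shown above.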

Deploy on HF Spaces

  1. Create a new Space (Docker type)
  2. Upload app.py, Dockerfile, requirements.txt
  3. Set Space hardware to CPU (Small is fine)
  4. The Space serves traffic on port 7860 by default, so the app should listen on that port

Example cURL

curl -s -X POST https://binkhoale1812-embedding.hf.space/embed \
  -H 'Content-Type: application/json' \
  -d '{"texts": ["An embedding request", "Second input"]}' | jq .

Notes

  • The service lazily loads the model on first request.
  • When several clients call the service concurrently, a semaphore serializes their requests to limit memory and CPU spikes.
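
The two notes above, lazy loading and semaphore-based serialization, can be sketched as follows. This is an illustration under stated assumptions and not the actual contents of app.py: the class name `SequentialEmbedder` and the injected `load_model` callable are hypothetical, and the use of `asyncio` with worker threads is one possible design among several.

```python
import asyncio


class SequentialEmbedder:
    """Loads a model on first use and runs one embedding job at a time."""

    def __init__(self, load_model):
        # Hypothetical zero-argument loader. In a real service it might be
        # lambda: SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
        # (an assumption about app.py, not confirmed by this README).
        self._load_model = load_model
        self._model = None      # stays None until the first request arrives
        self._semaphore = None  # created inside the running event loop

    @property
    def loaded(self):
        """Mirrors the `loaded` flag reported by GET /health."""
        return self._model is not None

    async def embed(self, texts):
        if self._semaphore is None:
            # A semaphore with a single permit admits one request at a time.
            self._semaphore = asyncio.Semaphore(1)
        async with self._semaphore:
            if self._model is None:
                # Lazy load: only the first request pays the loading cost.
                self._model = await asyncio.to_thread(self._load_model)
            # Run the blocking encode call in a worker thread so the event
            # loop can still accept (and queue) other requests.
            return await asyncio.to_thread(self._model.encode, texts)
```

Requests that arrive while one is in progress wait on the semaphore, so at most one `encode` call runs at any moment. Passing the loader in as a parameter keeps the sketch testable with a stand-in model.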