Embedding Inference API
A FastAPI-based inference service for generating embeddings using JobBERT v2/v3, Jina AI, and Voyage AI.
Features
- Multiple Models: JobBERT v2/v3 (job-specific), Jina AI v3 (general-purpose), Voyage AI (API-based, requires an API key)
- RESTful API: Easy-to-use HTTP endpoints
- Batch Processing: Process multiple texts in a single request
- Task-Specific Embeddings: Support for different embedding tasks (retrieval, classification, etc.)
- Docker Ready: Easy deployment to Hugging Face Spaces or any Docker environment
Supported Models
| Model | Dimension | Max Tokens | Best For |
|---|---|---|---|
| JobBERT v2 | 768 | 512 | Job titles and descriptions |
| JobBERT v3 | 768 | 512 | Job titles (improved performance) |
| Jina AI v3 | 1024 | 8,192 | General text, long documents |
| Voyage AI | 1024 | 32,000 | High-quality embeddings (requires API key) |
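The token limits in the table can guide model choice for a given input length. The sketch below is only an illustration: `MODEL_SPECS` and `models_that_fit` are hypothetical names, the keys are the model names used in the request examples of this README, and the token count is assumed to come from the caller, since each model tokenizes text differently.

```python
# Limits copied from the Supported Models table; keys are the request model names.
MODEL_SPECS = {
    "jobbertv2": {"dimension": 768, "max_tokens": 512},
    "jobbertv3": {"dimension": 768, "max_tokens": 512},
    "jina": {"dimension": 1024, "max_tokens": 8192},
    "voyage": {"dimension": 1024, "max_tokens": 32000},
}


def models_that_fit(token_count, have_voyage_key=False):
    """Return the model names whose token limit covers token_count."""
    fits = []
    for name, spec in MODEL_SPECS.items():
        if name == "voyage" and not have_voyage_key:
            continue  # Voyage AI is only usable with an API key
        if token_count <= spec["max_tokens"]:
            fits.append(name)
    return fits


print(models_that_fit(300))
print(models_that_fit(5000))
```

A short job title fits every local model, while a document of several thousand tokens leaves only Jina AI, or Voyage AI when a key is configured.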
Quick Start
Local Development
Install dependencies:
cd embedding
pip install -r requirements.txt
Run the API:
python api.py
Access the API:
- API: http://localhost:7860
- Docs: http://localhost:7860/docs
Docker Deployment
Build the image:
docker build -t embedding-api .
Run the container:
docker run -p 7860:7860 embedding-api
With Voyage AI (optional):
docker run -p 7860:7860 -e VOYAGE_API_KEY=your_key_here embedding-api
Hugging Face Spaces Deployment
Option 1: Using Hugging Face CLI
Install Hugging Face CLI:
pip install huggingface_hub
huggingface-cli login
Create a new Space:
- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Choose "Docker" as the Space SDK
- Name your Space (e.g., your-username/embedding-api)
Clone and push:
git clone https://huggingface.co/spaces/your-username/embedding-api
cd embedding-api
# Copy files from embedding folder
cp /path/to/embedding/Dockerfile .
cp /path/to/embedding/api.py .
cp /path/to/embedding/requirements.txt .
cp /path/to/embedding/README.md .
git add .
git commit -m "Initial commit"
git push
Configure environment (optional):
- Go to your Space settings
- Add the VOYAGE_API_KEY secret if using Voyage AI
Option 2: Manual Upload
- Create a new Docker Space on Hugging Face
- Upload these files: Dockerfile, api.py, requirements.txt, README.md
- Add environment variables in Settings if needed
API Usage
Health Check
curl http://localhost:7860/health
Response:
{
  "status": "healthy",
  "models_loaded": ["jobbertv2", "jina"],
  "voyage_available": false
}
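A client can use this payload to decide whether a model is ready before sending texts. The following is a minimal sketch: `model_is_ready` is a hypothetical helper, and it assumes that Voyage AI is reported only through `voyage_available` rather than in `models_loaded`, which the sample response suggests but does not confirm.

```python
def model_is_ready(health, model):
    """Decide from a parsed /health payload whether `model` can serve requests."""
    if health.get("status") != "healthy":
        return False
    if model == "voyage":
        # Assumption: Voyage AI availability is signalled by its own flag.
        return bool(health.get("voyage_available"))
    return model in health.get("models_loaded", [])


# The sample response shown above, as a Python dict.
sample = {
    "status": "healthy",
    "models_loaded": ["jobbertv2", "jina"],
    "voyage_available": False,
}
print(model_is_ready(sample, "jina"))    # True
print(model_is_ready(sample, "voyage"))  # False
```

In practice the dict would come from `requests.get("http://localhost:7860/health").json()`.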
Generate Embeddings
JobBERT v2 (Job Titles)
curl -X POST http://localhost:7860/embed \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Software Engineer", "Data Scientist", "Product Manager"],
    "model": "jobbertv2"
  }'
JobBERT v3 (Latest, Recommended)
curl -X POST http://localhost:7860/embed \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Software Engineer", "Data Scientist", "Product Manager"],
    "model": "jobbertv3"
  }'
Jina AI (with task specification)
curl -X POST http://localhost:7860/embed \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["What is machine learning?", "How does AI work?"],
    "model": "jina",
    "task": "retrieval.query"
  }'
Jina AI Tasks:
- retrieval.query: For search queries
- retrieval.passage: For documents
- text-matching: For similarity (default)
- classification: For classification
- separation: For clustering
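For retrieval, queries and documents are embedded with different tasks: `retrieval.query` for the search string and `retrieval.passage` for the texts being searched. A small payload builder can catch a mistyped task name before the request is sent. This is a sketch only; `JINA_TASKS` and `jina_payload` are hypothetical names, and the task list is copied from the list above.

```python
# Task names as listed above; "text-matching" is the documented default.
JINA_TASKS = {
    "retrieval.query",
    "retrieval.passage",
    "text-matching",
    "classification",
    "separation",
}


def jina_payload(texts, task="text-matching"):
    """Build the JSON body for POST /embed with the Jina AI model."""
    if task not in JINA_TASKS:
        raise ValueError(f"unknown Jina task: {task!r}")
    return {"texts": list(texts), "model": "jina", "task": task}


query_body = jina_payload(["What is machine learning?"], task="retrieval.query")
docs_body = jina_payload(
    ["Machine learning is a field of AI.", "Paris is in France."],
    task="retrieval.passage",
)
print(query_body)
```

Each body can be passed as `json=` to `requests.post` against the `/embed` endpoint.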
Voyage AI (requires API key)
curl -X POST http://localhost:7860/embed \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["This is a document to embed"],
    "model": "voyage",
    "input_type": "document"
  }'
Voyage AI Input Types:
- document: For documents/passages
- query: For search queries
Response Format
{
  "embeddings": [
    [0.123, -0.456, 0.789, ...],
    [0.234, -0.567, 0.890, ...]
  ],
  "model": "jobbertv2",
  "dimension": 768,
  "num_texts": 2
}
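Once a response is parsed, the embeddings are plain lists of floats that can be compared directly. The sketch below checks that a response is internally consistent and computes cosine similarity between two vectors. It assumes `num_texts` and `dimension` mean what their names suggest; `check_response` and `cosine_similarity` are hypothetical helpers, not part of the API.

```python
import math


def check_response(result):
    """Verify that the counts in a parsed /embed response agree with its vectors."""
    embeddings = result["embeddings"]
    if len(embeddings) != result["num_texts"]:
        raise ValueError("num_texts does not match the number of embeddings")
    for vector in embeddings:
        if len(vector) != result["dimension"]:
            raise ValueError("embedding length does not match dimension")
    return embeddings


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional response for illustration; real vectors have 768 or 1024 entries.
toy = {
    "embeddings": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "model": "jobbertv2",
    "dimension": 3,
    "num_texts": 2,
}
first, second = check_response(toy)
print(cosine_similarity(first, second))
```

Values near 1 indicate similar texts; orthogonal vectors, as in the toy data, give 0.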
List Available Models
curl http://localhost:7860/models
Python Client Example
import requests

url = "http://localhost:7860/embed"

# JobBERT v3 (recommended)
response = requests.post(url, json={
    "texts": ["Software Engineer", "Data Scientist"],
    "model": "jobbertv3"
})
response.raise_for_status()  # fail early on HTTP errors
result = response.json()
embeddings = result["embeddings"]
print(f"Got {len(embeddings)} embeddings of dimension {result['dimension']}")

# JobBERT v2
response = requests.post(url, json={
    "texts": ["Product Manager"],
    "model": "jobbertv2"
})

# Jina AI with task
response = requests.post(url, json={
    "texts": ["What is Python?"],
    "model": "jina",
    "task": "retrieval.query"
})

# Voyage AI
response = requests.post(url, json={
    "texts": ["Document text here"],
    "model": "voyage",
    "input_type": "document"
})
Environment Variables
- PORT: Server port (default: 7860)
- VOYAGE_API_KEY: Voyage AI API key (optional; required only for Voyage embeddings)
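A script that launches or calls the service can resolve these two variables the same way. This is a sketch of one reasonable approach, not necessarily how api.py reads them; `load_settings` and the keys of the returned dict are hypothetical.

```python
import os


def load_settings(env=None):
    """Resolve PORT and VOYAGE_API_KEY, applying the documented default port."""
    env = os.environ if env is None else env
    port = int(env.get("PORT", "7860"))
    voyage_key = env.get("VOYAGE_API_KEY") or None  # treat an empty string as unset
    return {
        "port": port,
        "voyage_api_key": voyage_key,
        "voyage_enabled": voyage_key is not None,
    }


print(load_settings({"PORT": "8000"}))
```

Passing a dict instead of relying on `os.environ` keeps the function easy to test.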
Interactive Documentation
Once the API is running, visit:
- Swagger UI: http://localhost:7860/docs
- ReDoc: http://localhost:7860/redoc
Notes
- Models are downloaded automatically on first startup (~2-3GB total)
- Voyage AI requires an API key from https://www.voyageai.com/
- First request to each model may be slower due to model loading
- Use batch processing for better performance (send multiple texts at once)
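The batching advice can be sketched as a helper that splits a long list into fixed-size requests. Everything here is illustrative: `chunk_texts` and `embed_all` are hypothetical names, the default batch size of 32 is an arbitrary assumption, and `embed_batch` stands for any callable that sends one batch to `/embed` and returns its embeddings.

```python
def chunk_texts(texts, batch_size):
    """Yield consecutive slices of `texts`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_all(texts, embed_batch, batch_size=32):
    """Embed every text, one request per batch, preserving input order."""
    vectors = []
    for batch in chunk_texts(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


# Stand-in for a real request: returns one fake 1-dimensional vector per text.
def fake_embed_batch(batch):
    return [[float(len(text))] for text in batch]


titles = ["Software Engineer", "Data Scientist", "Product Manager", "Designer", "Analyst"]
print(list(chunk_texts(titles, 2)))
print(embed_all(titles, fake_embed_batch, batch_size=2))
```

A real `embed_batch` could wrap `requests.post(url, json={"texts": batch, "model": "jobbertv3"})` and return the `embeddings` field of the parsed response. Lowering `batch_size` trades throughput for a smaller memory footprint per request.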
Troubleshooting
Models not loading
- Check available disk space (need ~3GB)
- Ensure internet connection for model download
- Check logs for specific error messages
Voyage AI not working
- Verify that VOYAGE_API_KEY is set correctly
- Check that the API key has sufficient credits
- Ensure the voyageai package is installed
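Two of these checks can be automated from Python inside the container. The sketch below is a hypothetical diagnostic, not part of the service: it only tests whether the variable is set and whether the `voyageai` package can be imported, and it cannot verify that the key is valid or has credits.

```python
import importlib.util
import os


def voyage_problems(env=None):
    """Return human-readable problems that would prevent Voyage AI from working."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("VOYAGE_API_KEY"):
        problems.append("VOYAGE_API_KEY is not set")
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec("voyageai") is None:
        problems.append("voyageai package is not installed")
    return problems


for problem in voyage_problems():
    print("Problem:", problem)
```

An empty list means both local prerequisites are met; remaining failures then point to the key itself or to the network.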
Out of memory
- Reduce batch size (process fewer texts per request)
- Use smaller models (JobBERT v2 instead of Jina)
- Increase container memory limits
License
This API uses models with different licenses:
- JobBERT v2/v3: Apache 2.0
- Jina AI: Apache 2.0
- Voyage AI: Subject to Voyage AI terms of service