---
title: PansGPT Qwen3 Embedding API
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
pinned: false
license: mit
app_port: 7860
short_description: Embedding model
---
# PansGPT Qwen3 Embedding API
A stable, Docker-based API for generating text embeddings using the Qwen3-Embedding-0.6B model. This space provides a reliable service for the PansGPT application.
## Features
- **Single Text Embedding**: Generate embeddings for individual texts
- **Batch Processing**: Process multiple texts efficiently
- **Similarity Calculation**: Compute cosine similarity between embeddings
- **Docker-based**: Stable deployment with containerization
- **Health Monitoring**: Built-in health check endpoints
- **Fallback Support**: Automatic fallback to sentence-transformers if needed
## API Endpoints
### 1. Single Text Embedding
```http
POST /api/predict
Content-Type: application/json

{
  "data": ["Your text here"]
}
```
### 2. Batch Text Embedding
```http
POST /api/predict
Content-Type: application/json

{
  "data": [["Text 1", "Text 2", "Text 3"]]
}
```
### 3. Health Check
```http
GET /health
```
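You can wrap the health check in a small helper before wiring up monitoring. The `"status"`/`"healthy"` keys below are an assumed response shape; check `app.py` for the fields `/health` actually returns:

```python
def is_healthy(payload: dict) -> bool:
    """Return True if the health payload reports a healthy service.

    The "status"/"healthy" keys are an assumed response shape; the
    actual fields are defined by app.py.
    """
    return isinstance(payload, dict) and payload.get("status") == "healthy"

# Live check (requires network access and the requests package):
# import requests
# resp = requests.get("https://ojochegbeng-pansgpt.hf.space/health", timeout=10)
# print(is_healthy(resp.json()))
```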
## Usage Examples
### Python
```python
import requests

# Single text embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": ["Hello, world!"]}
)
embedding = response.json()["data"][0]

# Batch embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": [["Text 1", "Text 2", "Text 3"]]}
)
embeddings = response.json()["data"][0]
```
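The similarity feature listed above can also be reproduced client-side from two returned vectors. A minimal pure-Python sketch of cosine similarity, with no external dependencies:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```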
### JavaScript
```javascript
// Single text embedding
const singleResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: ["Hello, world!"] })
});
const embedding = (await singleResponse.json()).data[0];

// Batch embedding
const batchResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: [["Text 1", "Text 2", "Text 3"]] })
});
const embeddings = (await batchResponse.json()).data[0];
```
## Model Information
- **Base Model**: Qwen3-Embedding-0.6B
- **Embedding Dimension**: 1024 (Qwen3) or 384 (fallback)
- **Max Input Length**: 512 tokens
- **Device**: Auto-detects CUDA/CPU
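Inputs beyond the 512-token limit are typically truncated by the tokenizer. If you need full coverage of long texts, you can split them client-side before embedding. A rough sketch using a word count as a proxy for tokens (the real limit is measured in model tokens, so the conservative `max_words` margin here is an assumption):

```python
def chunk_text(text: str, max_words: int = 350):
    """Split text into word-based chunks as a rough proxy for the
    512-token limit; actual token counts depend on the tokenizer."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)] or [""]

chunks = chunk_text("lorem " * 1000)
print(len(chunks))  # 3 chunks, each at most 350 words
```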
## Docker Configuration
This space uses Docker for stable deployment:
- **Base Image**: Python 3.11-slim
- **Port**: 7860
- **Health Check**: Built-in monitoring
- **Non-root User**: Security best practices
## Performance
- **Single Text**: ~100-500ms (depending on hardware)
- **Batch Processing**: Optimized for multiple texts
- **Memory Usage**: ~2-4GB RAM
- **Concurrent Requests**: Supports multiple simultaneous requests
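Client-side concurrency can be sketched with a thread pool. The `embed` function below is a hypothetical wrapper around the `/api/predict` call shown earlier, stubbed here so the sketch runs offline:

```python
from concurrent.futures import ThreadPoolExecutor

def embed(text: str):
    """Hypothetical wrapper: POST {"data": [text]} to /api/predict and
    return the embedding. Stubbed with a placeholder vector here."""
    return [float(len(text))]

texts = ["first", "second", "third"]
# Issue the requests concurrently; results keep the input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    embeddings = list(pool.map(embed, texts))
print(len(embeddings))  # 3
```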
## Integration with PansGPT
This API is specifically designed for the PansGPT application:
1. **Stable Connection**: Docker-based deployment eliminates connection issues
2. **Consistent Performance**: Reliable response times
3. **Error Handling**: Comprehensive error handling and fallbacks
4. **Monitoring**: Built-in health checks for monitoring
## Support
For issues or questions:
- Check the health endpoint first: `/health`
- Review the logs for error details
- Ensure your input format matches the expected structure
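The expected structure can be checked before sending. A small validator covering the two payload shapes documented above (a single string, or one list of strings, under `"data"`):

```python
def valid_payload(payload) -> bool:
    """Accepts {"data": ["text"]} or {"data": [["t1", "t2", ...]]}."""
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    if not isinstance(data, list) or len(data) != 1:
        return False
    item = data[0]
    if isinstance(item, str):
        return True
    return (isinstance(item, list) and len(item) > 0
            and all(isinstance(t, str) for t in item))

print(valid_payload({"data": ["Hello"]}))     # True
print(valid_payload({"data": [["a", "b"]]}))  # True
print(valid_payload({"data": "Hello"}))       # False
```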
---
**Note**: This space is optimized for stability and reliability. The Docker-based deployment ensures consistent performance for the PansGPT application.