---
title: PansGPT Qwen3 Embedding API
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
pinned: false
license: mit
app_port: 7860
short_description: Embedding model
---
# PansGPT Qwen3 Embedding API
A stable, Docker-based API for generating text embeddings using the Qwen3-Embedding-0.6B model. This space provides a reliable service for the PansGPT application.
## Features
- **Single Text Embedding**: Generate embeddings for individual texts
- **Batch Processing**: Process multiple texts efficiently
- **Similarity Calculation**: Compute cosine similarity between embeddings
- **Docker-based**: Stable deployment with containerization
- **Health Monitoring**: Built-in health check endpoints
- **Fallback Support**: Automatic fallback to sentence-transformers if needed
## API Endpoints
### 1. Single Text Embedding
```http
POST /api/predict
Content-Type: application/json

{
  "data": ["Your text here"]
}
```
### 2. Batch Text Embedding
```http
POST /api/predict
Content-Type: application/json

{
  "data": [["Text 1", "Text 2", "Text 3"]]
}
```
### 3. Health Check
```http
GET /health
```
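You can wrap the health check in a small helper before wiring up monitoring. The `"status"`/`"healthy"` keys below are an assumed response shape; check `app.py` for the fields `/health` actually returns:

```python
def is_healthy(payload: dict) -> bool:
    """Return True if the health payload reports a healthy service.

    The "status"/"healthy" keys are an assumed response shape; the
    actual fields are defined by app.py.
    """
    return isinstance(payload, dict) and payload.get("status") == "healthy"

# Live check (requires network access and the requests package):
# import requests
# resp = requests.get("https://ojochegbeng-pansgpt.hf.space/health", timeout=10)
# print(is_healthy(resp.json()))
```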
## Usage Examples
### Python
```python
import requests

# Single text embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": ["Hello, world!"]}
)
embedding = response.json()["data"][0]

# Batch embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": [["Text 1", "Text 2", "Text 3"]]}
)
embeddings = response.json()["data"][0]
```
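The similarity feature listed above can also be reproduced client-side from two returned vectors. A minimal pure-Python sketch of cosine similarity, with no external dependencies:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```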
### JavaScript
```javascript
// Single text embedding
const singleResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: ["Hello, world!"] })
});
const embedding = (await singleResponse.json()).data[0];

// Batch embedding
const batchResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: [["Text 1", "Text 2", "Text 3"]] })
});
const embeddings = (await batchResponse.json()).data[0];
```
## Model Information
- **Base Model**: Qwen3-Embedding-0.6B
- **Embedding Dimension**: 1024 (Qwen3) or 384 (fallback)
- **Max Input Length**: 512 tokens
- **Device**: Auto-detects CUDA/CPU
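Inputs beyond the 512-token limit are typically truncated by the tokenizer. If you need full coverage of long texts, you can split them client-side before embedding. A rough sketch using a word count as a proxy for tokens (the real limit is measured in model tokens, so the conservative `max_words` margin here is an assumption):

```python
def chunk_text(text: str, max_words: int = 350):
    """Split text into word-based chunks as a rough proxy for the
    512-token limit; actual token counts depend on the tokenizer."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)] or [""]

chunks = chunk_text("lorem " * 1000)
print(len(chunks))  # 3 chunks, each at most 350 words
```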
## Docker Configuration
This space uses Docker for stable deployment:
- **Base Image**: Python 3.11-slim
- **Port**: 7860
- **Health Check**: Built-in monitoring
- **Non-root User**: Security best practices
## Performance
- **Single Text**: ~100-500ms (depending on hardware)
- **Batch Processing**: Optimized for multiple texts
- **Memory Usage**: ~2-4GB RAM
- **Concurrent Requests**: Supports multiple simultaneous requests
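Client-side concurrency can be sketched with a thread pool. The `embed` function below is a hypothetical wrapper around the `/api/predict` call shown earlier, stubbed here so the sketch runs offline:

```python
from concurrent.futures import ThreadPoolExecutor

def embed(text: str):
    """Hypothetical wrapper: POST {"data": [text]} to /api/predict and
    return the embedding. Stubbed with a placeholder vector here."""
    return [float(len(text))]

texts = ["first", "second", "third"]
# Issue the requests concurrently; results keep the input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    embeddings = list(pool.map(embed, texts))
print(len(embeddings))  # 3
```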
## Integration with PansGPT
This API is specifically designed for the PansGPT application:
1. **Stable Connection**: Docker-based deployment eliminates connection issues
2. **Consistent Performance**: Reliable response times
3. **Error Handling**: Comprehensive error handling and fallbacks
4. **Monitoring**: Built-in health checks for monitoring
## Support
For issues or questions:
- Check the health endpoint first: `/health`
- Review the logs for error details
- Ensure your input format matches the expected structure
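The expected structure can be checked before sending. A small validator covering the two payload shapes documented above (a single string, or one list of strings, under `"data"`):

```python
def valid_payload(payload) -> bool:
    """Accepts {"data": ["text"]} or {"data": [["t1", "t2", ...]]}."""
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    if not isinstance(data, list) or len(data) != 1:
        return False
    item = data[0]
    if isinstance(item, str):
        return True
    return (isinstance(item, list) and len(item) > 0
            and all(isinstance(t, str) for t in item))

print(valid_payload({"data": ["Hello"]}))     # True
print(valid_payload({"data": [["a", "b"]]}))  # True
print(valid_payload({"data": "Hello"}))       # False
```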
---
**Note**: This space is optimized for stability and reliability. The Docker-based deployment ensures consistent performance for the PansGPT application.