---
title: PansGPT Qwen3 Embedding API
emoji: πŸš€
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
app_port: 7860
short_description: Embedding model
---

# PansGPT Qwen3 Embedding API

A stable, Docker-based API for generating text embeddings using the Qwen3-Embedding-0.6B model. This space provides a reliable service for the PansGPT application.

## Features

- **Single Text Embedding**: Generate embeddings for individual texts
- **Batch Processing**: Process multiple texts efficiently
- **Similarity Calculation**: Compute cosine similarity between embeddings
- **Docker-based**: Stable deployment with containerization
- **Health Monitoring**: Built-in health check endpoints
- **Fallback Support**: Automatic fallback to sentence-transformers if needed

## API Endpoints

### 1. Single Text Embedding
```bash
POST /api/predict
Content-Type: application/json

{
    "data": ["Your text here"]
}
```

### 2. Batch Text Embedding
```bash
POST /api/predict
Content-Type: application/json

{
    "data": [["Text 1", "Text 2", "Text 3"]]
}
```

### 3. Health Check
```bash
GET /health
```

## Usage Examples

### Python
```python
import requests
import json

# Single text embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": ["Hello, world!"]}
)
embedding = response.json()["data"][0]

# Batch embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": [["Text 1", "Text 2", "Text 3"]]}
)
embeddings = response.json()["data"][0]
```
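The similarity feature listed above can also be reproduced client-side from the returned vectors. A minimal sketch using NumPy (the function name is illustrative, not part of this API):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors for illustration; in practice, pass embeddings returned by the API.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical vectors -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors -> 0.0
```

Values close to 1.0 indicate semantically similar texts; values near 0.0 indicate unrelated ones.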

### JavaScript
```javascript
// Single text embedding
const singleResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ data: ["Hello, world!"] })
});
const embedding = (await singleResponse.json()).data[0];

// Batch embedding
const batchResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ data: [["Text 1", "Text 2", "Text 3"]] })
});
const embeddings = (await batchResponse.json()).data[0];
```

## Model Information

- **Base Model**: Qwen3-Embedding-0.6B
- **Embedding Dimension**: 1024 (Qwen3) or 384 (fallback)
- **Max Input Length**: 512 tokens
- **Device**: Auto-detects CUDA/CPU
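Because the primary model and the fallback return different dimensions (1024 vs. 384), a client can infer which backend served a response by inspecting the vector length. A hypothetical helper (not part of this API):

```python
def detect_model(embedding):
    """Guess which backend produced an embedding from its dimension.

    1024 -> Qwen3-Embedding-0.6B, 384 -> sentence-transformers fallback.
    """
    dims = {1024: "qwen3", 384: "fallback"}
    return dims.get(len(embedding), "unknown")

print(detect_model([0.0] * 1024))  # -> "qwen3"
print(detect_model([0.0] * 384))   # -> "fallback"
```

This can be useful for logging which model is active, since the fallback's vectors are not comparable with Qwen3's.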

## Docker Configuration

This space uses Docker for stable deployment:

- **Base Image**: Python 3.11-slim
- **Port**: 7860
- **Health Check**: Built-in monitoring
- **Non-root User**: Security best practices

## Performance

- **Single Text**: ~100-500ms (depending on hardware)
- **Batch Processing**: Optimized for multiple texts
- **Memory Usage**: ~2-4GB RAM
- **Concurrent Requests**: Supports multiple simultaneous requests

## Integration with PansGPT

This API is specifically designed for the PansGPT application:

1. **Stable Connection**: Docker-based deployment minimizes connection issues
2. **Consistent Performance**: Reliable response times
3. **Error Handling**: Comprehensive error handling and fallbacks
4. **Monitoring**: Built-in health checks for monitoring

## Support

For issues or questions:
- Check the health endpoint first: `/health`
- Review the logs for error details
- Ensure your input format matches the expected structure
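Malformed request bodies are the most common client-side issue. A hypothetical validator (not part of this API) that mirrors the two request shapes shown in the endpoint examples above:

```python
def validate_payload(payload):
    """Return True if payload matches the API's expected request shape:
    {"data": ["text"]} for a single text, or
    {"data": [["t1", "t2", ...]]} for a batch of texts.
    """
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, list) or len(data) != 1:
        return False
    item = data[0]
    if isinstance(item, str):
        return True  # single-text request
    # batch request: a list of strings
    return isinstance(item, list) and all(isinstance(t, str) for t in item)

print(validate_payload({"data": ["Hello"]}))          # True  (single text)
print(validate_payload({"data": [["a", "b", "c"]]}))  # True  (batch)
print(validate_payload({"data": "Hello"}))            # False (not wrapped in a list)
```

Checking the payload before sending makes format errors fail fast locally instead of surfacing as opaque server errors.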

---

**Note**: This space is optimized for stability and reliability. The Docker-based deployment ensures consistent performance for the PansGPT application.