Ojochegbeng committed
Commit 2f5c196 · verified · 1 Parent(s): 7b0539d

Create README.md

Files changed (1)
  1. README.md +137 -0
README.md ADDED
@@ -0,0 +1,137 @@
+ ---
+ title: PansGPT Qwen3 Embedding API
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ sdk_version: 4.44.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ app_port: 7860
+ short_description: Embedding model
+ ---
+
+ # PansGPT Qwen3 Embedding API
+
+ A stable, Docker-based API for generating text embeddings using the Qwen3-Embedding-0.6B model. This space provides a reliable service for the PansGPT application.
+
+ ## Features
+
+ - **Single Text Embedding**: Generate embeddings for individual texts
+ - **Batch Processing**: Process multiple texts efficiently
+ - **Similarity Calculation**: Compute cosine similarity between embeddings
+ - **Docker-based**: Stable deployment with containerization
+ - **Health Monitoring**: Built-in health check endpoints
+ - **Fallback Support**: Automatic fallback to sentence-transformers if needed
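
The similarity feature listed above refers to cosine similarity between two embedding vectors. This README does not document an endpoint for it, so the following is only a client-side sketch of the computation in plain Python. The function name `cosine_similarity` is illustrative and not part of this API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a zero vector as an error rather than returning 0.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate texts with similar embeddings; values near 0 indicate unrelated ones.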
+
+ ## API Endpoints
+
+ ### 1. Single Text Embedding
+ ```http
+ POST /api/predict
+ Content-Type: application/json
+
+ {
+   "data": ["Your text here"]
+ }
+ ```
+
+ ### 2. Batch Text Embedding
+ ```http
+ POST /api/predict
+ Content-Type: application/json
+
+ {
+   "data": [["Text 1", "Text 2", "Text 3"]]
+ }
+ ```
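
The two request bodies above differ only in nesting: a single text sits directly inside `data`, while a batch is a list wrapped in one extra list. A small helper can make that distinction explicit on the client side. This is a sketch; `build_payload` is a hypothetical name, not something the API provides.

```python
import json


def build_payload(texts):
    """Build the JSON body for /api/predict from one string or a list of strings."""
    if isinstance(texts, str):
        # Single text: {"data": ["..."]}
        return {"data": [texts]}
    items = list(texts)
    if not items or not all(isinstance(t, str) for t in items):
        raise TypeError("expected a string or a non-empty list of strings")
    # Batch: the list of texts is itself the first element of "data".
    return {"data": [items]}


print(json.dumps(build_payload("Your text here")))
print(json.dumps(build_payload(["Text 1", "Text 2", "Text 3"])))
```

Validating the input before sending also covers the most common failure mode, a body whose shape does not match what the endpoint expects.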
+
+ ### 3. Health Check
+ ```http
+ GET /health
+ ```
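
A client may want to wait until the Space reports healthy before sending embedding requests, for example right after a restart. The sketch below polls `/health` using only the Python standard library. This README does not specify the response body, so the check treats any HTTP 200 as healthy; that rule, the retry count, and the delay are all assumptions.

```python
import time
import urllib.error
import urllib.request

BASE_URL = "https://ojochegbeng-pansgpt.hf.space"


def is_healthy(base_url=BASE_URL, timeout=10):
    """Return True if GET /health answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and non-2xx responses all count as unhealthy.
        return False


def wait_until_healthy(check=is_healthy, attempts=5, delay=2.0):
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the check as a parameter keeps the retry loop testable without network access.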
+
+ ## Usage Examples
+
+ ### Python
+ ```python
+ import requests
+
+ # Single text embedding
+ response = requests.post(
+     "https://ojochegbeng-pansgpt.hf.space/api/predict",
+     json={"data": ["Hello, world!"]},
+     timeout=30,
+ )
+ response.raise_for_status()  # raise on HTTP error status codes
+ embedding = response.json()["data"][0]
+
+ # Batch embedding
+ response = requests.post(
+     "https://ojochegbeng-pansgpt.hf.space/api/predict",
+     json={"data": [["Text 1", "Text 2", "Text 3"]]},
+     timeout=30,
+ )
+ response.raise_for_status()
+ embeddings = response.json()["data"][0]
+ ```
+
+ ### JavaScript
+ ```javascript
+ // Single text embedding
+ const singleResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
+   method: "POST",
+   headers: { "Content-Type": "application/json" },
+   body: JSON.stringify({ data: ["Hello, world!"] })
+ });
+ const embedding = (await singleResponse.json()).data[0];
+
+ // Batch embedding (separate variable name: `const` cannot be redeclared in the same scope)
+ const batchResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
+   method: "POST",
+   headers: { "Content-Type": "application/json" },
+   body: JSON.stringify({ data: [["Text 1", "Text 2", "Text 3"]] })
+ });
+ const embeddings = (await batchResponse.json()).data[0];
+ ```
+
+ ## Model Information
+
+ - **Base Model**: Qwen3-Embedding-0.6B
+ - **Embedding Dimension**: 1024 (Qwen3) or 384 (fallback)
+ - **Max Input Length**: 512 tokens
+ - **Device**: Auto-detects CUDA/CPU
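
Because the fallback model returns 384-dimensional vectors while Qwen3 returns 1024-dimensional ones, vectors from the two cannot be compared with each other or stored in the same index. A client can detect which model answered by checking the vector length. A minimal sketch, where `embedding_source` and its labels are hypothetical names:

```python
# Dimensions taken from the list above; the labels are illustrative.
EXPECTED_DIMS = {1024: "qwen3", 384: "fallback"}


def embedding_source(embedding):
    """Map an embedding's length to the model assumed to have produced it."""
    try:
        return EXPECTED_DIMS[len(embedding)]
    except KeyError:
        raise ValueError(f"unexpected embedding dimension: {len(embedding)}") from None
```

Rejecting any other length early tends to be easier to debug than a dimension mismatch surfacing later in a vector store.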
+
+ ## Docker Configuration
+
+ This space uses Docker for stable deployment:
+
+ - **Base Image**: Python 3.11-slim
+ - **Port**: 7860
+ - **Health Check**: Built-in monitoring
+ - **Non-root User**: Security best practices
+
+ ## Performance
+
+ - **Single Text**: ~100-500 ms (depending on hardware)
+ - **Batch Processing**: Optimized for multiple texts
+ - **Memory Usage**: ~2-4 GB RAM
+ - **Concurrent Requests**: Supports multiple simultaneous requests
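
For large inputs, a client can split its texts into fixed-size chunks and send one batch request per chunk. This README does not state a maximum batch size, so the default of 32 below is purely an assumption to tune against the memory figure above; `chunked` is an illustrative helper.

```python
def chunked(texts, size=32):
    """Yield consecutive slices of `texts` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


batches = list(chunked(["a", "b", "c", "d", "e"], size=2))
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Each chunk can then be sent as the inner list of a batch request body.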
+
+ ## Integration with PansGPT
+
+ This API is specifically designed for the PansGPT application:
+
+ 1. **Stable Connection**: Docker-based deployment eliminates connection issues
+ 2. **Consistent Performance**: Reliable response times
+ 3. **Error Handling**: Comprehensive error handling and fallbacks
+ 4. **Monitoring**: Built-in health checks for monitoring
+
+ ## Support
+
+ For issues or questions:
+ - Check the health endpoint first: `/health`
+ - Review the logs for error details
+ - Ensure your input format matches the expected structure
+
+ ---
+
+ **Note**: This space is optimized for stability and reliability. The Docker-based deployment ensures consistent performance for the PansGPT application.