Ojochegbeng committed
Commit 0a0b8f9 · verified · 1 Parent(s): 5c88d49

Delete README.md

Files changed (1)
  1. README.md +0 -170
README.md DELETED
@@ -1,170 +0,0 @@
- # Qwen3 Docker Deployment for PansGPT
-
- This folder contains all the files needed to deploy a stable, Docker-based Qwen3 embedding API to Hugging Face Spaces for your PansGPT application.
-
- ## 📁 Files Overview
-
- ### Core Application Files
- - **`app.py`** - Main FastAPI application with Qwen3-Embedding-0.6B model
- - **`Dockerfile`** - Optimized Docker configuration for Hugging Face Spaces
- - **`requirements.txt`** - Python dependencies for the application
-
- ### Integration Files
- - **`qwen-embedding-service-docker.ts`** - TypeScript service for your PansGPT app
- - **`test-pansgpt-api.js`** - Test script to verify the deployed API
-
- ### Deployment Files
- - **`deploy-to-hf.sh`** - Automated deployment script for Hugging Face Spaces
-
- ## 🚀 Quick Start
-
- ### 1. Deploy to Hugging Face Spaces
-
- ```bash
- # Make sure you're logged in to Hugging Face
- huggingface-cli login --token YOUR_TOKEN
-
- # Deploy using the script
- ./deploy-to-hf.sh
- ```
-
- ### 2. Manual Deployment
-
- ```bash
- # Clone your space
- git clone https://YOUR_TOKEN@huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
-
- # Copy files to the space directory
- cp app.py Dockerfile requirements.txt README.md YOUR_SPACE_NAME/
-
- # Commit and push
- cd YOUR_SPACE_NAME
- git add .
- git commit -m "Add Qwen3 embedding API"
- git push
- ```
-
- ### 3. Test the Deployment
-
- ```bash
- # Test the deployed API
- node test-pansgpt-api.js
- ```
-
- ## 🔧 Integration with PansGPT
-
- ### Update Your .env File
- ```env
- QWEN_API_URL=https://your-username-your-space-name.hf.space/api/predict
- ```
-
- ### Replace Your Embedding Service
- 1. Copy `qwen-embedding-service-docker.ts` to `src/lib/`
- 2. Update your imports to use the new service
- 3. The new service uses direct HTTP calls instead of Gradio client
-
- ### Example Usage
- ```typescript
- import { generateEmbeddings } from './qwen-embedding-service-docker';
-
- // Generate embeddings
- const embeddings = await generateEmbeddings(["Your text here"]);
- ```
-
- ## 📊 API Endpoints
-
- - **Main API**: `POST /api/predict`
- - **Health Check**: `GET /health`
- - **Web Interface**: Available at your space URL
-
- ### API Usage Examples
-
- #### Single Text Embedding
- ```bash
- curl -X POST "https://your-space.hf.space/api/predict" \
-   -H "Content-Type: application/json" \
-   -d '{"data": ["Your text here"]}'
- ```
-
- #### Batch Text Embedding
- ```bash
- curl -X POST "https://your-space.hf.space/api/predict" \
-   -H "Content-Type: application/json" \
-   -d '{"data": [["Text 1", "Text 2", "Text 3"]]}'
- ```
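
The two curl requests above differ only in the shape of `data`: a single string for one text, and a nested array for a batch. A minimal TypeScript sketch of the same calls follows. It is not taken from `qwen-embedding-service-docker.ts`; the names `singleTextBody`, `batchTextBody`, `postPredict`, and `PostFn` are hypothetical, and because the response schema is not documented here, the parsed JSON is returned untyped.

```typescript
// Request bodies mirroring the two curl examples (shapes copied from them).
type PredictBody = { data: [string] | [string[]] };

// Single text: {"data": ["Your text here"]}
function singleTextBody(text: string): PredictBody {
  return { data: [text] };
}

// Batch: {"data": [["Text 1", "Text 2", "Text 3"]]}
function batchTextBody(texts: string[]): PredictBody {
  return { data: [texts] };
}

// Minimal fetch-like signature (hypothetical) so the sketch can be exercised
// without a network connection.
type PostFn = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POST a body to the /api/predict URL and return the parsed JSON response.
async function postPredict(
  apiUrl: string,
  body: PredictBody,
  post: PostFn,
): Promise<unknown> {
  const response = await post(apiUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed with HTTP ${response.status}`);
  }
  // The response schema is an unknown here, so no fields are assumed.
  return response.json();
}
```

In a runtime that provides a global `fetch`, it can likely be passed as the `post` argument, with `apiUrl` read from `QWEN_API_URL`.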
-
- ## 🎯 Model Information
-
- - **Model**: Qwen3-Embedding-0.6B
- - **Dimensions**: 1024
- - **Context Length**: 32K tokens
- - **Languages**: 100+ languages supported
- - **Performance**: State-of-the-art on MTEB benchmark
-
- ## 🔍 Troubleshooting
-
- ### Common Issues
-
- 1. **Space Not Building**
-    - Check the space logs in Hugging Face
-    - Ensure all files are properly uploaded
-    - Verify Dockerfile syntax
-
- 2. **API Not Responding**
-    - Wait 2-5 minutes for the space to fully start
-    - Check the health endpoint: `/health`
-    - Verify the space is running (not sleeping)
-
- 3. **Embedding Errors**
-    - Check model loading in the logs
-    - Verify input text format
-    - Ensure text is not too long (max 512 tokens)
-
- ### Health Check
- ```bash
- curl https://your-space.hf.space/health
- ```
-
- Expected response:
- ```json
- {
-   "status": "healthy",
-   "model_loaded": true
- }
- ```
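
Since a space can take a few minutes to start, a client may want to poll `/health` until the expected response above appears. The sketch below is one way to do that, under stated assumptions: `isHealthy` and `waitUntilHealthy` are hypothetical names, the only fields checked are the two shown in the expected response, and the default of 10 attempts spaced 30 seconds apart is an arbitrary choice that roughly covers the 2-5 minute startup window mentioned under Common Issues.

```typescript
// Shape of the documented /health response.
interface HealthResponse {
  status: string;
  model_loaded: boolean;
}

// True only when the payload matches the documented healthy response.
function isHealthy(payload: unknown): payload is HealthResponse {
  if (typeof payload !== "object" || payload === null) {
    return false;
  }
  const fields = payload as Record<string, unknown>;
  return fields["status"] === "healthy" && fields["model_loaded"] === true;
}

// Calls `check` (for example, a function that GETs /health and parses the
// JSON) until it reports healthy or the attempts run out.
async function waitUntilHealthy(
  check: () => Promise<unknown>,
  attempts = 10,
  delayMs = 30_000,
): Promise<boolean> {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      if (isHealthy(await check())) {
        return true;
      }
    } catch {
      // The space may still be building or waking up; try again.
    }
    if (attempt < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

Keeping `isHealthy` separate from the polling loop lets the response check be tested without a running space.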
-
- ## 📈 Performance
-
- - **Response Time**: 100-500ms per request
- - **Memory Usage**: 2-4GB RAM
- - **Concurrent Requests**: Multiple simultaneous requests supported
- - **Uptime**: Much more stable than Gradio client connections
-
- ## 🔄 Updates
-
- To update your deployed space:
-
- 1. Make changes to the files in this folder
- 2. Upload the updated files to your Hugging Face Space
- 3. The space will automatically rebuild with the new changes
-
- ## 📝 Notes
-
- - This Docker-based deployment is much more stable than the previous Gradio client approach
- - The Qwen3 model provides better embeddings than the previous Qwen2.5 model
- - All files are optimized for Hugging Face Spaces deployment
- - The service includes comprehensive error handling and fallback mechanisms
-
- ## 🆘 Support
-
- If you encounter issues:
- 1. Check the space logs in Hugging Face
- 2. Verify your API URL is correct
- 3. Ensure the space is running and not sleeping
- 4. Test with the provided test script
-
- ---
-
- **Deployment Status**: ✅ Ready for production use
- **Last Updated**: September 2025
- **Model Version**: Qwen3-Embedding-0.6B