---
title: PansGPT Qwen3 Embedding API
emoji: πŸš€
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
app_port: 7860
short_description: Embedding model
---
# PansGPT Qwen3 Embedding API
A stable, Docker-based API for generating text embeddings using the Qwen3-Embedding-0.6B model. This space provides a reliable service for the PansGPT application.
## Features
- **Single Text Embedding**: Generate embeddings for individual texts
- **Batch Processing**: Process multiple texts efficiently
- **Similarity Calculation**: Compute cosine similarity between embeddings
- **Docker-based**: Stable deployment with containerization
- **Health Monitoring**: Built-in health check endpoints
- **Fallback Support**: Automatic fallback to sentence-transformers if needed
## API Endpoints
### 1. Single Text Embedding
```bash
curl -X POST https://ojochegbeng-pansgpt.hf.space/api/predict \
  -H "Content-Type: application/json" \
  -d '{"data": ["Your text here"]}'
```
### 2. Batch Text Embedding
```bash
curl -X POST https://ojochegbeng-pansgpt.hf.space/api/predict \
  -H "Content-Type: application/json" \
  -d '{"data": [["Text 1", "Text 2", "Text 3"]]}'
```
### 3. Health Check
```bash
curl https://ojochegbeng-pansgpt.hf.space/health
```
## Usage Examples
### Python
```python
import requests

BASE_URL = "https://ojochegbeng-pansgpt.hf.space"

# Single text embedding
response = requests.post(
    f"{BASE_URL}/api/predict",
    json={"data": ["Hello, world!"]},
)
response.raise_for_status()
embedding = response.json()["data"][0]

# Batch embedding
response = requests.post(
    f"{BASE_URL}/api/predict",
    json={"data": [["Text 1", "Text 2", "Text 3"]]},
)
response.raise_for_status()
embeddings = response.json()["data"][0]
```
### JavaScript
```javascript
const baseUrl = "https://ojochegbeng-pansgpt.hf.space";

// Single text embedding
const singleResponse = await fetch(`${baseUrl}/api/predict`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: ["Hello, world!"] }),
});
const embedding = (await singleResponse.json()).data[0];

// Batch embedding
const batchResponse = await fetch(`${baseUrl}/api/predict`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: [["Text 1", "Text 2", "Text 3"]] }),
});
const embeddings = (await batchResponse.json()).data[0];
```
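The Similarity Calculation feature listed above can also be reproduced client-side once embeddings are returned: cosine similarity is just the dot product divided by the product of the vector norms. A minimal, dependency-free sketch (the sample vectors below are illustrative, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative vectors; in practice pass embeddings returned by the API.
v1 = [0.1, 0.2, 0.3]
v2 = [0.3, 0.2, 0.1]
print(cosine_similarity(v1, v1))  # identical vectors give 1.0
print(cosine_similarity(v1, v2))
```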
## Model Information
- **Base Model**: Qwen3-Embedding-0.6B
- **Embedding Dimension**: 1024 (Qwen3) or 384 (fallback)
- **Max Input Length**: 512 tokens
- **Device**: Auto-detects CUDA/CPU
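Inputs longer than the 512-token limit will be truncated, so long documents are best split before sending. A rough client-side guard, using word count as a proxy for tokens (the true limit depends on the Qwen3 tokenizer, which is not bundled here, so the 400-word chunk size is an assumed safety margin):

```python
def chunk_text(text, max_words=400):
    """Split text into chunks of at most max_words words.

    Word count only approximates token count; the real cutoff is set
    by the Qwen3 tokenizer's 512-token limit.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

chunks = chunk_text("word " * 1000)
print(len(chunks))  # 1000 words at 400 words per chunk -> 3 chunks
```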
## Docker Configuration
This space uses Docker for stable deployment:
- **Base Image**: Python 3.11-slim
- **Port**: 7860
- **Health Check**: Built-in monitoring
- **Non-root User**: Security best practices
## Performance
- **Single Text**: ~100-500ms (depending on hardware)
- **Batch Processing**: Optimized for multiple texts
- **Memory Usage**: ~2-4GB RAM
- **Concurrent Requests**: Supports multiple simultaneous requests
## Integration with PansGPT
This API is specifically designed for the PansGPT application:
1. **Stable Connection**: Docker-based deployment eliminates connection issues
2. **Consistent Performance**: Reliable response times
3. **Error Handling**: Comprehensive error handling and fallbacks
4. **Monitoring**: Built-in health checks for monitoring
## Support
For issues or questions:
- Check the health endpoint first: `/health`
- Review the logs for error details
- Ensure your input format matches the expected structure
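As a pre-flight check for the last point, a request body can be validated against the documented shape before sending: `{"data": ["text"]}` for a single text, `{"data": [["t1", "t2"]]}` for a batch. A hypothetical helper (not part of the API itself):

```python
def is_valid_payload(payload):
    """Return True if payload matches the documented request shape."""
    if not isinstance(payload, dict) or "data" not in payload:
        return False
    data = payload["data"]
    if not isinstance(data, list) or len(data) != 1:
        return False
    item = data[0]
    if isinstance(item, str):
        return True  # single-text request
    # batch request: one list of strings
    return isinstance(item, list) and all(isinstance(t, str) for t in item)

print(is_valid_payload({"data": ["Hello"]}))          # True  (single)
print(is_valid_payload({"data": [["a", "b", "c"]]}))  # True  (batch)
print(is_valid_payload({"data": "Hello"}))            # False (wrong shape)
```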
---
**Note**: This space is optimized for stability and reliability. The Docker-based deployment ensures consistent performance for the PansGPT application.