Qwen3-VL-Embedding-8B - Inference Endpoint Ready
This is a deployment-ready version of Qwen/Qwen3-VL-Embedding-8B with a custom handler for Hugging Face Inference Endpoints.
Usage
Deploy this model on Hugging Face Inference Endpoints; the endpoint uses the custom handler automatically.
API Examples
Text embeddings:
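import requests

# Replace the placeholders with your endpoint URL and access token.
API_URL = "https://your-endpoint-url.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

# A plain string under "inputs" requests a text embedding.
response = requests.post(API_URL, headers=headers, json={
    "inputs": "Hello, world!"
})
print(response.json())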
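The request above neither sets a timeout nor checks the HTTP status. A minimal sketch of a wrapper that does both follows. The function name `embed`, the injected `post` parameter, and the 60-second default timeout are illustrative assumptions, not part of the handler. The parsed JSON is returned as-is, because the handler's response schema is not documented here.

```python
from typing import Any, Callable


def embed(
    post: Callable[..., Any],
    api_url: str,
    token: str,
    inputs: Any,
    timeout: float = 60.0,  # assumed default; tune for your endpoint
) -> Any:
    """POST `inputs` to the endpoint and return the parsed JSON body.

    `post` is any callable with the same signature as `requests.post`.
    Injecting it is a hypothetical convenience that allows testing
    without a network connection.
    """
    headers = {"Authorization": f"Bearer {token}"}
    response = post(api_url, headers=headers, json={"inputs": inputs}, timeout=timeout)
    # Raise on 4xx/5xx responses instead of silently returning an error body.
    response.raise_for_status()
    return response.json()
```

With `requests` installed, a call might look like `embed(requests.post, API_URL, "YOUR_HF_TOKEN", "Hello, world!")`.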
Image + Text embeddings:
# Reuses API_URL and headers from the text example.
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "text": "A photo of a cat",
        "image": "https://example.com/cat.jpg"
    }
})
print(response.json())
Batch embeddings:
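# Reuses API_URL and headers from the text example.
# Each list item may carry text only or text plus an image URL.
response = requests.post(API_URL, headers=headers, json={
    "inputs": [
        {"text": "First text"},
        {"text": "Second text", "image": "https://example.com/image.jpg"}
    ]
})
print(response.json())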
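Batch payloads like the one above can be assembled programmatically, and the resulting embeddings are typically compared with cosine similarity. The sketch below shows both steps under stated assumptions: `make_item` and `cosine_similarity` are hypothetical helper names, image-only items are allowed here even though the examples only show text and text plus image, and the vectors passed to `cosine_similarity` are assumed to be plain lists of floats that you have already extracted from the response (the response schema is not documented here).

```python
import math
from typing import Dict, Optional, Sequence


def make_item(text: Optional[str] = None, image: Optional[str] = None) -> Dict[str, str]:
    """Build one batch item using the "text" / "image" keys from the examples."""
    if text is None and image is None:
        raise ValueError("an item needs text, an image URL, or both")
    item: Dict[str, str] = {}
    if text is not None:
        item["text"] = text
    if image is not None:
        item["image"] = image
    return item


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Same payload as the batch example, built from the helper.
payload = {
    "inputs": [
        make_item("First text"),
        make_item("Second text", image="https://example.com/image.jpg"),
    ]
}
```

The `payload` dictionary can be passed directly as the `json=` argument of `requests.post`.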
Model Details
- Base Model: Qwen/Qwen3-VL-Embedding-8B
- Parameters: 8.1B
- License: Apache 2.0
- Task: Multimodal embeddings (text + vision)