Qwen3-VL-Embedding-8B - Inference Endpoint Ready

This is a deployment-ready version of Qwen/Qwen3-VL-Embedding-8B with a custom handler for Hugging Face Inference Endpoints.

Usage

Deploy this model on Hugging Face Inference Endpoints; the endpoint loads the custom handler from this repository automatically.
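For orientation, Inference Endpoints looks for a `handler.py` file exposing an `EndpointHandler` class with `__init__(self, path)` and `__call__(self, data)`. The sketch below shows that interface together with one possible way to normalize the request shapes used in this README's API examples into a uniform list. It is not this repository's actual handler: `normalize_inputs` and `_embed` are hypothetical names, and the model loading and embedding steps are deliberately left out.

```python
from typing import Any, Dict, List, Union

Item = Dict[str, str]


def normalize_inputs(inputs: Union[str, Dict[str, Any], List[Any]]) -> List[Item]:
    """Turn a string, a dict, or a list of either into a list of item dicts."""
    if isinstance(inputs, (str, dict)):
        inputs = [inputs]
    if not isinstance(inputs, list):
        raise ValueError("'inputs' must be a string, an object, or a list")
    items: List[Item] = []
    for entry in inputs:
        if isinstance(entry, str):
            items.append({"text": entry})
        elif isinstance(entry, dict) and ("text" in entry or "image" in entry):
            # Keep only the two keys used in this README's examples.
            items.append({k: entry[k] for k in ("text", "image") if k in entry})
        else:
            raise ValueError(f"unsupported input entry: {entry!r}")
    return items


class EndpointHandler:
    def __init__(self, path: str = "") -> None:
        # A real handler would load the model and processor from `path` here.
        self.path = path

    def __call__(self, data: Dict[str, Any]) -> Any:
        # The request body carries the payload under the "inputs" key.
        items = normalize_inputs(data.get("inputs"))
        return self._embed(items)

    def _embed(self, items: List[Item]) -> Any:
        # Hypothetical placeholder: the embedding step is model-specific.
        raise NotImplementedError("embedding logic is not part of this sketch")
```

Normalizing up front means the embedding step only ever sees a list of items, so single and batch requests share one code path.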

API Examples

Text embeddings:

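import requests

# Replace with your endpoint URL and a token that has access to it.
API_URL = "https://your-endpoint-url.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

response = requests.post(API_URL, headers=headers, json={
    "inputs": "Hello, world!"
})
response.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(response.json())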

Image + Text embeddings:

# Reuses API_URL and headers from the text example above.
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "text": "A photo of a cat",
        "image": "https://example.com/cat.jpg"
    }
})
print(response.json())

Batch embeddings:

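# Reuses API_URL and headers from the text example above.
# Each list entry is embedded separately; entries may be text-only or text + image.
response = requests.post(API_URL, headers=headers, json={
    "inputs": [
        {"text": "First text"},
        {"text": "Second text", "image": "https://example.com/image.jpg"}
    ]
})
print(response.json())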
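
Embeddings like these are usually compared with cosine similarity. The following is a minimal sketch that assumes each embedding has already been extracted from the response as a flat list of floats; the response schema is not documented here, so that extraction step is left to you. `cosine_similarity` is a helper written for illustration, and the vectors are toy values, not real model output.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two embeddings from a batch response.
text_vec = [1.0, 0.0, 1.0]
image_text_vec = [1.0, 1.0, 0.0]
score = cosine_similarity(text_vec, image_text_vec)
print(f"similarity: {score:.3f}")
```

Scores close to 1 indicate vectors pointing in nearly the same direction, and scores near 0 indicate unrelated ones. For large batches, a vectorized implementation (for example with NumPy) would be the more practical choice.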

Model Details

  • Base Model: Qwen/Qwen3-VL-Embedding-8B
  • Parameters: 8.1B
  • License: Apache 2.0
  • Task: Multimodal embeddings (text + vision)