Qwen3-VL-Embedding-8B - Inference Endpoint Ready

This is a deployment-ready version of Qwen/Qwen3-VL-Embedding-8B with a custom handler for Hugging Face Inference Endpoints.

Usage

Deploy this model on Hugging Face Inference Endpoints; the endpoint loads the custom handler automatically.

API Examples

Text embeddings:

import requests

API_URL = "https://your-endpoint-url.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

response = requests.post(API_URL, headers=headers, json={
    "inputs": "Hello, world!"
})
print(response.json())
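
Embeddings are usually compared with cosine similarity. The sketch below assumes you have already pulled two embedding vectors out of the endpoint responses as flat lists of floats; the exact response layout depends on the custom handler and is not described here, so that extraction step is left out. `cosine_similarity` is a hypothetical helper name, and the vectors are toy values rather than real model output.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (illustrative values only).
query_vec = [1.0, 0.0, 1.0]
doc_vec = [0.5, 0.5, 1.0]
print(cosine_similarity(query_vec, doc_vec))
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.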

Image + Text embeddings:

# Reuses API_URL and headers from the text example above.
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "text": "A photo of a cat",
        "image": "https://example.com/cat.jpg"
    }
})
print(response.json())

Batch embeddings:

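# Reuses API_URL and headers from the text example above.
# Each list item is either text-only or text plus an image URL.
response = requests.post(API_URL, headers=headers, json={
    "inputs": [
        {"text": "First text"},
        {"text": "Second text", "image": "https://example.com/image.jpg"}
    ]
})
print(response.json())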
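
The three request shapes above (a bare string, a single dict, and a list of dicts) can be unified on the client so that calling code always sends the batch form. The following is a minimal sketch, not the handler's own logic: `to_batch` is a hypothetical helper, and it assumes that a bare string is equivalent to `{"text": ...}` and that every item carries at least a `"text"` or an `"image"` key.

```python
from typing import Any, Dict, List, Union

InputItem = Union[str, Dict[str, Any]]


def to_batch(inputs: Union[InputItem, List[InputItem]]) -> List[Dict[str, Any]]:
    """Normalize a string, a dict, or a list of either into a list of dicts."""
    items = inputs if isinstance(inputs, list) else [inputs]
    batch: List[Dict[str, Any]] = []
    for item in items:
        if isinstance(item, str):
            # Assumption: a bare string means a text-only input.
            batch.append({"text": item})
        elif isinstance(item, dict) and ("text" in item or "image" in item):
            batch.append(dict(item))  # copy so the caller's dict is not shared
        else:
            raise TypeError(f"unsupported input item: {item!r}")
    return batch


payload = {
    "inputs": to_batch([
        "First text",
        {"text": "Second text", "image": "https://example.com/image.jpg"},
    ])
}
print(payload)
```

The resulting `payload` can be passed as the `json=` argument of the `requests.post` calls shown above.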

Model Details

Base Model: Qwen/Qwen3-VL-Embedding-8B
Parameters: 8.1B
License: Apache 2.0
Task: Multimodal embeddings (text + vision)

Model tree for lloydchristmasx/Qwen3-VL-Embedding-8B

Base model: Qwen/Qwen3-VL-8B-Instruct
Finetuned: Qwen/Qwen3-VL-Embedding-8B
Finetuned: this model