Configuration Parsing Warning: In config.json: "quantization_config.bits" must be an integer

Qwen3-VL-32B-Thinking-EXL3-3.5bpw

ExLlamaV3 quantization of Qwen/Qwen3-VL-32B-Thinking - A vision-language model with enhanced reasoning capabilities.

Quantization Details

Parameter Value
Bits per Weight 3.5 bpw
Head Bits 6 bpw
Calibration Rows 128
Calibration Context 4096 tokens
Format ExLlamaV3 (EXL3)
Size ~17 GB

Model Capabilities

  • Vision + Reasoning: Process images with chain-of-thought analysis
  • Thinking Mode: <think>...</think> tags for complex visual reasoning
  • Context Window: 32K tokens
  • Image Support: Single/multiple images, various resolutions
  • Video Support: Frame-by-frame analysis

Hardware Requirements

GPU VRAM Notes
RTX 4090 24 GB Fits with moderate context + images
RTX 3090 24 GB Works, may need lower context with large images
A100 40GB 40 GB Comfortable for all use cases

Use Cases

  • Screenshot Analysis: Understand UI, extract information
  • Document OCR: Read and interpret documents with reasoning
  • Visual Q&A: Answer questions about images with explanations
  • Code from Screenshots: Analyze and explain code in images

Usage with TabbyAPI

# config.yml
model:
  model_dir: models
  model_name: Qwen3-VL-32B-Thinking-EXL3-3.5bpw

network:
  host: 0.0.0.0
  port: 5000

model_defaults:
  max_seq_len: 16384
  cache_mode: Q4

Recommended Settings

Visual Reasoning (detailed analysis):

  • Temperature: 0.6
  • Top-P: 0.95
  • Enable thinking mode

Quick Visual Tasks (fast responses):

  • Temperature: 0.7
  • Top-P: 0.8
  • Disable thinking mode

Original Model

This is a quantization of Qwen/Qwen3-VL-32B-Thinking. All credit for the base model goes to the Qwen team at Alibaba.

License

Apache 2.0 (inherited from base model)

Downloads last month
8
Safetensors
Model size
9B params
Tensor type
F16
I16
BF16
Inference Providers NEW
This model isn't deployed by any Inference Provider. 馃檵 Ask for provider support

Model tree for nullrunner/Qwen3-VL-32B-Thinking-EXL3-3.5bpw

Quantized
(26)
this model