Qwen2.5-VL-7B-Instruct โ€” Ternary Quantized (tritplane3)

Ternary-quantized version of Qwen/Qwen2.5-VL-7B-Instruct using ternary-quant.

Model Specifications

Property Value
Base Model Qwen/Qwen2.5-VL-7B-Instruct
Parameters 7.6B
Architecture VLM (image + text input, text output)
Quantization tritplane3 (196 layers quantized)
Vision Encoder FP16 (preserved)
License Apache 2.0

Size Comparison

Method Size VLM Support
FP16 (original) 12.7 GB Yes
Ternary tritplane3 7.2 GB Yes (vision+text)
Compression 1.8x

Few quantized alternatives exist for Qwen2.5-VL. GGUF does not support this VLM architecture.

Quality Verification

Tested with text generation โ€” produces correct, detailed output:

Prompt: "What are the three laws of thermodynamics?"

Output: The three laws of thermodynamics are fundamental principles... 1. First Law (Conservation of Energy): Energy cannot be created or destroyed... ฮ”U = Q + W...

Produces structured, accurate, well-formatted responses matching FP16 quality.

Memory Requirements

Runtime Min Memory Hardware
cached (CPU) ~10 GB RAM Any
metal (Apple Silicon) ~8 GB unified M1+
triton_memory (CUDA) ~6 GB VRAM Any NVIDIA GPU

Quickstart

pip install ternary-quant
from ternary_quant.inference import load_ternary_model

model, processor = load_ternary_model(
    "AsadIsmail/Qwen2.5-VL-7B-Instruct-ternary",
    runtime_mode="cached", device="auto"
)

inputs = processor(text="What is shown in this image?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(outputs[0], skip_special_tokens=True))

Collection

Part of ternary-models.

GitHub: github.com/Asad-Ismail/ternary-models | Library: github.com/Asad-Ismail/ternary-quant

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for AsadIsmail/Qwen2.5-VL-7B-Instruct-ternary

Finetuned
(1039)
this model

Space using AsadIsmail/Qwen2.5-VL-7B-Instruct-ternary 1

Collection including AsadIsmail/Qwen2.5-VL-7B-Instruct-ternary