SmolVLM2-2.2B-Instruct β€” Ternary Quantized (tritplane3)

Ternary-quantized version of HuggingFaceTB/SmolVLM2-2.2B-Instruct using ternary-quant.

Compact VLM designed for edge deployment, now even smaller with ternary quantization.

Model Specifications

Property Value
Base Model HuggingFaceTB/SmolVLM2-2.2B-Instruct
Parameters 2.2B
Architecture VLM (image + text)
Quantization tritplane3 (169 layers, 10.92 effective bits)
Vision Encoder FP16 (preserved)
Compression 1.47x
Avg Reconstruction Error 0.1236
License Apache 2.0

Size Comparison

Method Size VLM Support
FP16 (original) ~4.4 GB Yes
Ternary tritplane3 1.8 GB Yes

No GGUF alternative exists for SmolVLM2.

Quality Verification

Validated during quantization (collapse score: 0.009 β€” excellent):

Test Output
Image description (demo) "A yellow circle with a diagonal line through it" (correct)
"What is machine learning?" Correct, detailed explanation of ML, algorithms, training
"Explain gravity" Accurate one-sentence explanation

Memory Requirements

Runtime Min Memory Hardware
cached (CPU) ~4 GB RAM Any
metal (Apple Silicon) ~3 GB unified M1+
cached (CUDA) ~3 GB VRAM Any NVIDIA GPU

Ideal for edge deployment β€” runs on devices with 4 GB RAM.

Quickstart

pip install ternary-quant
from ternary_quant.inference import load_ternary_model

model, processor = load_ternary_model(
    "AsadIsmail/SmolVLM2-2.2B-Instruct-ternary",
    runtime_mode="cached", device="auto"
)

inputs = processor(text="Describe this image", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(outputs[0], skip_special_tokens=True))

Collection

Part of ternary-models.

GitHub: github.com/Asad-Ismail/ternary-models | Library: github.com/Asad-Ismail/ternary-quant

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for AsadIsmail/SmolVLM2-2.2B-Instruct-ternary

Space using AsadIsmail/SmolVLM2-2.2B-Instruct-ternary 1

Collection including AsadIsmail/SmolVLM2-2.2B-Instruct-ternary