Youtu-VL-4B-Instruct-GGUF

This repository contains GGUF format model files for Tencent's Youtu-VL-4B-Instruct.

These files are compatible with llama.cpp, LM Studio, and other tools that support GGUF.

⚠️ CRITICAL: How to Use (Vision Capabilities)

This is a Vision-Language Model. To use its "eyes" (image recognition), you MUST load two files:

  1. The Model: A quantized file (e.g., Youtu-VL-4B-Instruct-Q4_K_M.gguf)
  2. The Vision Projector: The file named mmproj-Youtu-VL-4b-Instruct-BF16.gguf (included in this repo).
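If you are scripting the download, here is a minimal sketch using the `huggingface_hub` library. The repo id `Abiray/Youtu-VL-4B-Instruct-GGUF` and the Q4_K_M filename are taken from this page; the assumption that other quantizations follow the same naming pattern is mine.

```python
# Sketch: resolve the pair of GGUF files needed for vision inference.
# Assumes quantized files follow the "Youtu-VL-4B-Instruct-<QUANT>.gguf"
# naming shown above; the mmproj filename is fixed.

REPO_ID = "Abiray/Youtu-VL-4B-Instruct-GGUF"
MMPROJ = "mmproj-Youtu-VL-4b-Instruct-BF16.gguf"

def required_files(quant: str = "Q4_K_M") -> tuple[str, str]:
    """Return (model_file, mmproj_file) for a given quantization tag."""
    return (f"Youtu-VL-4B-Instruct-{quant}.gguf", MMPROJ)

def fetch(quant: str = "Q4_K_M") -> tuple[str, str]:
    """Download both files and return their local paths.

    Requires `pip install huggingface_hub` and network access.
    """
    from huggingface_hub import hf_hub_download  # imported lazily
    model, mmproj = required_files(quant)
    return (
        hf_hub_download(repo_id=REPO_ID, filename=model),
        hf_hub_download(repo_id=REPO_ID, filename=mmproj),
    )

# Both files must be handed to the runtime together:
print(required_files())
```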

πŸš€ Usage Instructions

1. llama.cpp (CLI)

You must pass the --mmproj flag so the vision projector is loaded alongside the model. Note: in recent llama.cpp builds, image input is handled by the dedicated multimodal binary llama-mtmd-cli; if your build's llama-cli does not accept --image, use llama-mtmd-cli with the same flags.

./llama-cli -m Youtu-VL-4B-Instruct-Q4_K_M.gguf \
  --mmproj mmproj-Youtu-VL-4b-Instruct-BF16.gguf \
  --image your_image.jpg \
  -p "Describe this image in detail." \
  -n 512 \
  --temp 0.1
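Alternatively, llama-server can load the same pair of files (-m plus --mmproj) and recent builds expose an OpenAI-style chat endpoint that accepts images as base64 data URLs. Below is a minimal sketch of building such a request in Python; the endpoint URL and port are assumptions, and only payload construction is shown.

```python
import base64
import json

def image_to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

def build_chat_payload(prompt: str, image_bytes: bytes) -> dict:
    """OpenAI-style chat payload with one text part and one image part."""
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": image_to_data_url(image_bytes)}},
            ],
        }],
        "temperature": 0.1,
        "max_tokens": 512,
    }

# POST this JSON to the (assumed) local server address,
# e.g. http://localhost:8080/v1/chat/completions
payload = build_chat_payload("Describe this image in detail.", b"\xff\xd8\xff")
print(json.dumps(payload)[:60])
```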
Model Details

Format: GGUF
Model size: 5B params
Architecture: deepseek2
Available quantizations: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
