Youtu-VL-4B-Instruct-GGUF

This repository contains GGUF format model files for Tencent's Youtu-VL-4B-Instruct.

These files are compatible with llama.cpp, LM Studio, and other tools that support GGUF.

⚠️ CRITICAL: How to Use (Vision Capabilities)

This is a Vision-Language Model. To use its "eyes" (image recognition), you MUST load two files:

  1. The Model: A quantized file (e.g., Youtu-VL-4B-Instruct-Q4_K_M.gguf)
  2. The Vision Projector: The file named mmproj-Youtu-VL-4b-Instruct-BF16.gguf (included in this repo).
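If you are scripting the download, here is a minimal sketch using the `huggingface_hub` library. The repo id `Abiray/Youtu-VL-4B-Instruct-GGUF` and the Q4_K_M filename are taken from this page; the assumption that other quantizations follow the same naming pattern is mine.

```python
# Sketch: resolve the pair of GGUF files needed for vision inference.
# Assumes quantized files follow the "Youtu-VL-4B-Instruct-<QUANT>.gguf"
# naming shown above; the mmproj filename is fixed.

REPO_ID = "Abiray/Youtu-VL-4B-Instruct-GGUF"
MMPROJ = "mmproj-Youtu-VL-4b-Instruct-BF16.gguf"

def required_files(quant: str = "Q4_K_M") -> tuple[str, str]:
    """Return (model_file, mmproj_file) for a given quantization tag."""
    return (f"Youtu-VL-4B-Instruct-{quant}.gguf", MMPROJ)

def fetch(quant: str = "Q4_K_M") -> tuple[str, str]:
    """Download both files and return their local paths.

    Requires `pip install huggingface_hub` and network access.
    """
    from huggingface_hub import hf_hub_download  # imported lazily
    model, mmproj = required_files(quant)
    return (
        hf_hub_download(repo_id=REPO_ID, filename=model),
        hf_hub_download(repo_id=REPO_ID, filename=mmproj),
    )

# Both files must be handed to the runtime together:
print(required_files())
```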

πŸš€ Usage Instructions

1. llama.cpp (CLI)

You must pass the --mmproj flag so the vision projector is loaded alongside the model. Note: in recent llama.cpp builds, image input is handled by the dedicated multimodal binary llama-mtmd-cli; if your build's llama-cli does not accept --image, use llama-mtmd-cli with the same flags.

./llama-cli -m Youtu-VL-4B-Instruct-Q4_K_M.gguf \
  --mmproj mmproj-Youtu-VL-4b-Instruct-BF16.gguf \
  --image your_image.jpg \
  -p "Describe this image in detail." \
  -n 512 \
  --temp 0.1
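Alternatively, llama-server can load the same pair of files (-m plus --mmproj) and recent builds expose an OpenAI-style chat endpoint that accepts images as base64 data URLs. Below is a minimal sketch of building such a request in Python; the endpoint URL and port are assumptions, and only payload construction is shown.

```python
import base64
import json

def image_to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

def build_chat_payload(prompt: str, image_bytes: bytes) -> dict:
    """OpenAI-style chat payload with one text part and one image part."""
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": image_to_data_url(image_bytes)}},
            ],
        }],
        "temperature": 0.1,
        "max_tokens": 512,
    }

# POST this JSON to the (assumed) local server address,
# e.g. http://localhost:8080/v1/chat/completions
payload = build_chat_payload("Describe this image in detail.", b"\xff\xd8\xff")
print(json.dumps(payload)[:60])
```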
Model Details

Format: GGUF
Model size: 5B params
Architecture: deepseek2
Available quantizations: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
