Vintern-1B-v3.5 GGUF

Description

This repository contains GGUF quantized weights for Vintern-1B-v3.5, developed by 5CD-AI.

Vintern-1B-v3.5 is a state-of-the-art Multimodal Large Language Model (MLLM) optimized for the Vietnamese language. Despite its compact size of 1 billion parameters, it demonstrates exceptional performance in document understanding, OCR, and detailed image description, making it ideal for edge computing and local deployment.

Key Improvements in v3.5

  • Superior Vietnamese Support: Fine-tuned to understand Vietnamese cultural nuances and complex linguistic structures.
  • Efficient Architecture: Built on a lightweight Qwen2-based language backbone, offering fast inference without compromising visual reasoning.
  • GGUF Compatibility: Optimized for local execution via llama.cpp, LM Studio, and other GGUF-supported ecosystems.

Available Quantization Methods

File                          Quantization  Description
vintern-1b-v3.5-q4_k_m.gguf   Q4_K_M        Recommended. Balanced performance and accuracy.
vintern-1b-v3.5-q8_0.gguf     Q8_0          High precision, near-original quality but larger file size.
vintern-1b-v3.5-q5_k_m.gguf   Q5_K_M        Better accuracy than Q4 with a slight increase in RAM usage.
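
The file names above follow one pattern, so a small helper can map a quantization label to its file. This is only a sketch: the function name `vintern_quant_file` is hypothetical, and it knows just the three files listed in the table.

```shell
# Hypothetical helper: map a quantization label from the table to its file name.
vintern_quant_file() {
  case "$1" in
    Q4_K_M) echo "vintern-1b-v3.5-q4_k_m.gguf" ;;
    Q5_K_M) echo "vintern-1b-v3.5-q5_k_m.gguf" ;;
    Q8_0)   echo "vintern-1b-v3.5-q8_0.gguf" ;;
    # Anything not listed in the table is rejected rather than guessed.
    *)      echo "unknown quantization: $1" >&2; return 1 ;;
  esac
}
```

For example, `MODEL="$(vintern_quant_file Q4_K_M)"` stores the recommended file name in a variable for later use.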

How to Use

1. Requirements

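To run this vision-language model, you need two files: the main model GGUF file and the multimodal projector (mmproj) GGUF file. llama.cpp uses the mmproj file to turn the image into embeddings for the language model, so image input will not work without it.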
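
A quick check that both files are present can save a confusing failure later. The sketch below assumes the files sit in the current directory; the function name `check_vintern_files` is hypothetical and not part of llama.cpp.

```shell
# Hypothetical helper: verify that the model and mmproj files both exist.
# Prints each missing path to stderr and returns 1 if any file is absent.
check_vintern_files() {
  vintern_missing=0
  for vintern_file in "$1" "$2"; do
    if [ ! -f "$vintern_file" ]; then
      echo "missing: $vintern_file" >&2
      vintern_missing=1
    fi
  done
  return "$vintern_missing"
}
```

Usage, with the file names used elsewhere in this README: `check_vintern_files vintern-1b-v3.5-q4_k_m.gguf vintern-1b-v3.5-mmproj-f16.gguf && echo "ready"`.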

2. Using llama.cpp

Download the latest version of llama.cpp by following the instructions on the llama.cpp GitHub page, then run:

./llama-cli -m vintern-1b-v3.5-q4_k_m.gguf \
  --mmproj vintern-1b-v3.5-mmproj-f16.gguf \
  --image path/to/your/image.jpg \
  -p "Describe this image in detail in Vietnamese."
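
To process a whole folder, the same command can be wrapped in a loop. This is a minimal sketch, not a tested workflow: `LLAMA_CLI`, `MODEL`, `MMPROJ`, `DRY_RUN`, and `describe_images` are illustrative names rather than llama.cpp settings, and it assumes the command exits after answering (some llama.cpp builds may start an interactive session instead).

```shell
# Illustrative defaults; override them in the environment if your paths differ.
LLAMA_CLI="${LLAMA_CLI:-./llama-cli}"
MODEL="${MODEL:-vintern-1b-v3.5-q4_k_m.gguf}"
MMPROJ="${MMPROJ:-vintern-1b-v3.5-mmproj-f16.gguf}"

# Run the README command once per .jpg file in the given directory.
describe_images() {
  for img in "$1"/*.jpg; do
    [ -e "$img" ] || continue   # skip when the glob matches nothing
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: print the command instead of executing it.
      echo "$LLAMA_CLI -m $MODEL --mmproj $MMPROJ --image $img"
    else
      "$LLAMA_CLI" -m "$MODEL" --mmproj "$MMPROJ" --image "$img" \
        -p "Describe this image in detail in Vietnamese."
    fi
  done
}
```

Running `DRY_RUN=1 describe_images path/to/images` first prints the commands without loading the model, which is a cheap way to confirm the paths before a long run.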