# MiniCPM-V-4 OpenVINO INT4 (Preserved FP16 Vision)
This repository contains the OpenVINO IR format of the openbmb/MiniCPM-V-4 model, heavily optimized for CPU inference.
## Optimization Details
To ensure optimal performance without sacrificing multimodal capabilities, a selective quantization strategy was applied:
- **Language Model (LLM):** Quantized to INT4 (asymmetric, group size 128) for maximum memory efficiency and CPU inference speed.
- **Vision Encoder & Resampler:** Preserved in FP16. This selective approach keeps the model's visual acuity (including text recognition, UI elements, and fine details) identical to the original unquantized model.
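A selective scheme like the one above can typically be produced with the `optimum-cli` OpenVINO exporter, which applies INT4 weight compression to the language model while leaving the vision components in higher precision. This is a hedged sketch, not the exact command used for this repository; flag defaults (e.g., asymmetric INT4) vary by optimum-intel version, and the output directory name is illustrative:

```shell
# Sketch: export MiniCPM-V-4 to OpenVINO IR with INT4 LLM weights.
# Requires: pip install "optimum[openvino]"
# --weight-format int4 with --group-size 128 targets the language model's
# weight compression; check your optimum-intel version's docs for how the
# vision encoder precision is handled.
optimum-cli export openvino \
  --model openbmb/MiniCPM-V-4 \
  --weight-format int4 \
  --group-size 128 \
  MiniCPM-V-4-OpenVINO-INT4
```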
## Hardware Targeting
This format is designed to run efficiently on standard CPUs, taking full advantage of the AVX-512 and VNNI instruction sets (e.g., Intel Cascade Lake / Ice Lake and newer) without requiring dedicated GPU hardware.
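For CPU inference, the OpenVINO GenAI `VLMPipeline` can load the IR directory directly. A minimal sketch, assuming the repository has been cloned locally to `MiniCPM-V-4-OpenVINO-INT4` and that `example.jpg` is your own input image (both names are placeholders, not files shipped with this repo):

```python
# Sketch: multimodal inference on CPU with openvino-genai.
# Requires: pip install openvino-genai pillow numpy
import numpy as np
import openvino as ov
import openvino_genai as ov_genai
from PIL import Image

model_dir = "MiniCPM-V-4-OpenVINO-INT4"  # local clone of this repository (assumption)
pipe = ov_genai.VLMPipeline(model_dir, "CPU")

# Load the image as a uint8 HWC tensor, as expected by VLMPipeline.
image = Image.open("example.jpg").convert("RGB")
image_tensor = ov.Tensor(np.array(image, dtype=np.uint8))

result = pipe.generate(
    "Describe this image.",
    image=image_tensor,
    max_new_tokens=128,
)
print(result)
```

Because the vision encoder runs in FP16, image understanding matches the original model; only the INT4 language model trades a small amount of precision for speed and memory.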