MiniCPM-V-4 OpenVINO INT4 (Preserved FP16 Vision)

This repository contains an OpenVINO IR (Intermediate Representation) export of the openbmb/MiniCPM-V-4 model, optimized for CPU inference.
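
As a usage reference, the snippet below sketches how this export could be loaded through optimum-intel's OVModelForVisualCausalLM. The image path and prompt are illustrative, and the exact preprocess_inputs call may differ between optimum-intel versions, so treat this as a starting point rather than a verified recipe:

```python
# Minimal inference sketch (assumes the optimum-intel OpenVINO integration).
from PIL import Image
from transformers import AutoProcessor, AutoTokenizer
from optimum.intel import OVModelForVisualCausalLM

model_id = "HyX3/MiniCPM-V-4-OpenVINO-INT4"

model = OVModelForVisualCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg")  # illustrative path
question = "What is shown in this image?"

# Build model-specific inputs (prompt template + pixel values); the exact
# signature of preprocess_inputs may vary across optimum-intel versions.
inputs = model.preprocess_inputs(
    text=question, image=image, processor=processor, tokenizer=tokenizer
)
output_ids = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```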

Optimization Details

To ensure optimal performance without sacrificing multimodal capabilities, a selective quantization strategy was applied:

  • Language Model (LLM): Quantized to INT4 (Asymmetric, Group Size 128) for maximum memory efficiency and CPU inference speed.
  • Vision Encoder & Resampler: Preserved in FP16. This surgical approach keeps the model's visual acuity (text recognition, UI elements, and fine details) identical to that of the original unquantized model; a reproduction sketch follows this list.
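
For readers who want to reproduce a similar split, the sketch below shows how NNCF's compress_weights could apply INT4 asymmetric quantization with group size 128 to the language-model IR alone, leaving the vision encoder and resampler IRs untouched. The filenames are hypothetical and not taken from this repository:

```python
# Hedged sketch of the selective scheme described above: NNCF weight
# compression is applied to the language-model IR only.
import nncf
import openvino as ov

core = ov.Core()
language_model = core.read_model("openvino_language_model.xml")  # hypothetical filename

# INT4 asymmetric weights, group size 128 -- matching the settings above.
compressed = nncf.compress_weights(
    language_model,
    mode=nncf.CompressWeightsMode.INT4_ASYM,
    group_size=128,
)
ov.save_model(compressed, "openvino_language_model_int4.xml")

# The vision encoder and resampler IRs are simply never passed through
# compress_weights, so they retain their original FP16 weights.
```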

Hardware Targeting

This format is designed to run efficiently on standard CPUs, taking advantage of the AVX-512 and VNNI instruction sets where available (e.g., Intel Ice Lake and newer) without requiring dedicated GPU hardware.
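
To verify what the local CPU plugin reports, a quick check through the OpenVINO runtime might look like the following. Note that the capability list is coarse-grained (precisions such as FP16/INT8 rather than ISA flags), so for exact AVX-512/VNNI support consult /proc/cpuinfo or your OS equivalent:

```python
# Query the OpenVINO CPU plugin for the host processor name and its
# reported optimization capabilities.
import openvino as ov

core = ov.Core()
print(core.get_property("CPU", "FULL_DEVICE_NAME"))
print(core.get_property("CPU", "OPTIMIZATION_CAPABILITIES"))
```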
