# Celeste Imperia | Qwen2-VL-2B-Instruct (OpenVINO INT4)
This is a hardware-optimized port of Alibaba's Qwen2-VL-2B-Instruct, compressed using Intel's NNCF (Neural Network Compression Framework) for low-latency execution on CPUs, GPUs, and NPUs.
## Optimization Details
- Precision: Mixed INT4/INT8 (Asymmetric)
- Framework: OpenVINO 2024.x
- Task: image-text-to-text (Multimodal Reasoning)
- Validation Rig: Intel i5-11400 / 40GB RAM
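The asymmetric INT4 scheme listed above maps each group of weights onto 16 levels using a per-group scale and zero point. A minimal pure-Python sketch of the idea (illustrative only; the function names are invented here, and NNCF's actual implementation operates on tensors with per-group scales and more careful rounding):

```python
# Illustrative sketch of asymmetric INT4 weight quantization.
# Helper names are assumptions for this example, not NNCF APIs.

def quantize_int4_asym(weights):
    """Quantize a list of floats to unsigned 4-bit codes (0..15)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0  # 16 levels; guard against a constant group
    zero_point = round(-lo / scale)
    q = [max(0, min(15, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map 4-bit codes back to approximate float weights."""
    return [(v - zero_point) * scale for v in q]
```

Because the zero point shifts the grid, the full range between the group's minimum and maximum is covered even when the weights are not centered on zero, which is what makes the asymmetric variant a better fit for skewed weight distributions than a symmetric INT4 grid.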
## Key Features
- Native support for high-resolution image reasoning.
- Optimized for Snapdragon X Elite (via OpenVINO ARM64) and Intel Core Ultra.
- Drastically reduced memory footprint (~1.8GB) compared to the FP16 original.
## Usage
Designed for use with the `optimum-intel` library.
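A minimal loading sketch with `optimum-intel` is below. The repo id, image path, and device are placeholders, and `OVModelForVisualCausalLM` should be verified against your installed `optimum-intel` version; treat this as a starting point rather than the card's official snippet:

```python
# Hypothetical usage sketch for this OpenVINO export with optimum-intel.
# Repo id, image path, and device string are placeholder assumptions.

def build_messages(question: str) -> list:
    """Qwen2-VL chat format: one image plus a text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": question},
            ],
        }
    ]

def run_demo(model_id: str, image_path: str) -> str:
    # Imports deferred so the helper above stays dependency-free.
    from optimum.intel import OVModelForVisualCausalLM
    from transformers import AutoProcessor
    from PIL import Image

    processor = AutoProcessor.from_pretrained(model_id)
    model = OVModelForVisualCausalLM.from_pretrained(model_id, device="CPU")

    prompt = processor.apply_chat_template(
        build_messages("Describe this image."), add_generation_prompt=True
    )
    inputs = processor(
        images=[Image.open(image_path)], text=prompt, return_tensors="pt"
    )
    out = model.generate(**inputs, max_new_tokens=128)
    return processor.decode(out[0], skip_special_tokens=True)
```

Calling `run_demo("<repo-id>", "photo.jpg")` runs inference on CPU; OpenVINO also accepts `device="GPU"` or `device="NPU"` to target the other device classes this card mentions.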