Celeste Imperia | Qwen2-VL-2B-Instruct (OpenVINO INT4)

This is a hardware-optimized port of Alibaba's Qwen2-VL-2B-Instruct, compressed using Intel's NNCF (Neural Network Compression Framework) for low-latency execution on CPUs, GPUs, and NPUs.

πŸ› οΈ Optimization Details

  • Precision: Mixed INT4/INT8 (Asymmetric)
  • Framework: OpenVINO 2024.x
  • Task: image-text-to-text (Multimodal Reasoning)
  • Validation Rig: Intel Core i5-11400 / 40 GB RAM

πŸš€ Key Features

  • Native support for high-resolution image reasoning.
  • Optimized for Snapdragon X Elite (via OpenVINO ARM64) and Intel Core Ultra.
  • Drastically reduced memory footprint (~1.8GB) compared to the FP16 original.

πŸ’» Usage

This model is designed for use with the `optimum-intel` library.
