NPU - OpenVINO
Collection
A collection of leading models optimized for OpenVINO on Intel NPUs.
24 items
llama-3.2-3b-instruct-npu-ov is an OpenVINO int4-quantized, NPU-optimized version of Llama 3.2 3B Instruct. It provides a very small, very fast inference implementation aimed at AI PCs with an Intel NPU.
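"Int4 quantized" means the model's weights are compressed to 4-bit integers, shrinking the model and speeding up inference at a small accuracy cost. As a rough illustration of the general idea only (a symmetric scheme; not necessarily the exact method OpenVINO uses for this model):

```python
# Minimal sketch of symmetric int4 weight quantization: floats are
# mapped to signed 4-bit integers in [-8, 7] plus one float scale.
def quantize_int4(weights):
    """Return (quantized ints, scale) for a list of float weights."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from ints and scale."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.07]          # example weights (illustrative)
q, s = quantize_int4(w)               # 4-bit codes plus scale
w_hat = dequantize(q, s)              # approximate reconstruction
```

Each reconstructed weight differs from the original by at most about half the scale, which is why int4 works well for large weight tensors.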
Llama 3.2 3B Instruct is a 3B-parameter instruction-tuned chat foundation model from Meta.
Base model
meta-llama/Llama-3.2-3B-Instruct
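Running a model from this collection on the NPU might look like the following sketch using the openvino-genai package; the local model directory name and generation parameters here are assumptions for illustration, not taken from the collection page. The snippet is guarded so it degrades gracefully where openvino-genai is not installed:

```python
# Hypothetical usage sketch: run an NPU-optimized OpenVINO model with
# openvino-genai's LLMPipeline. Paths and parameters are assumptions.
try:
    import openvino_genai as ov_genai
    have_ov = True
except ImportError:  # openvino-genai not installed on this machine
    have_ov = False

# Local directory holding the exported OpenVINO model files (assumed name)
model_dir = "llama-3.2-3b-instruct-npu-ov"

if have_ov:
    # Target the Intel NPU; use "CPU" instead on machines without one
    pipe = ov_genai.LLMPipeline(model_dir, "NPU")
    print(pipe.generate("What is an AI PC?", max_new_tokens=64))
```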