NPU - QNN
Collection
Leading models optimized for NPU deployment on Qualcomm Snapdragon.
qwen2.5-7b-instruct-onnx-qnn is an ONNX QNN int4-quantized version of Qwen2.5-7B-Instruct, providing very fast inference on AI PCs with a Qualcomm NPU.
It belongs to Qwen's latest release series.
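As a minimal sketch of how such a model could be loaded, the snippet below configures ONNX Runtime's QNN execution provider (with a CPU fallback) for a Snapdragon AI PC. The model path, the `QnnHtp.dll` backend location, and the `make_session` helper are assumptions for illustration, not part of this release.

```python
# Sketch: configuring onnxruntime with the QNN execution provider.
# Assumptions: local model path and QnnHtp.dll on the default search path.
providers = [
    # QNN EP offloads supported ops to the Snapdragon NPU (HTP backend).
    ("QNNExecutionProvider", {"backend_path": "QnnHtp.dll"}),
    "CPUExecutionProvider",  # fallback for unsupported operators
]

model_path = "qwen2.5-7b-instruct-onnx-qnn/model.onnx"  # hypothetical path


def make_session(path):
    """Create an inference session; requires the onnxruntime-qnn package."""
    import onnxruntime as ort
    return ort.InferenceSession(path, providers=providers)


# Show which provider is preferred without needing the hardware present.
print(providers[0][0])
```

Running `make_session(model_path)` on a machine with the QNN runtime installed would dispatch the quantized graph to the NPU where possible.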