qualcomm_meta-llama-3.1-8B-Instruct

This repository contains the Meta-Llama-3.1-8B-Instruct model optimized for Qualcomm hardware using Qualcomm® AI Engine Direct (QNN).

It is designed for high-performance inference on edge devices powered by Qualcomm Snapdragon platforms, enabling efficient on-device AI capabilities with low latency and reduced power consumption.
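Since this is an instruct-tuned chat model, inputs should follow Meta's Llama 3.1 chat template. A minimal single-turn prompt builder, assuming the standard Meta special tokens (the function name and example strings here are illustrative, not part of this repository):

```python
def build_llama31_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3.1 chat prompt using Meta's
    special tokens: <|begin_of_text|>, role headers, and <|eot_id|>.
    The trailing assistant header cues the model to generate a reply."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Hypothetical usage; the actual on-device runtime consumes token IDs
# produced by the Llama 3.1 tokenizer from a string like this.
prompt = build_llama31_prompt("You are a helpful assistant.", "What is QNN?")
```

The exact tokenization and generation calls depend on the QNN runtime integration; consult the Qualcomm AI Engine Direct documentation for the execution API.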

Model Details

  • Developed by: Advantech-EIOT / Meta
  • Architecture: Llama 3.1
  • Task: Text Generation (Chat/Instruction)
  • Precision: Quantized (w4a16) for NPU optimization
  • Optimization: Qualcomm® AI Stack / QNN SDK
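The w4a16 scheme above means weights are stored in 4-bit integers while activations stay in 16-bit floating point. A minimal NumPy sketch of symmetric per-channel 4-bit weight quantization, to illustrate the idea (this is a toy simulation, not the QNN SDK's actual quantizer):

```python
import numpy as np

def quantize_w4(w: np.ndarray):
    """Symmetric per-output-channel 4-bit quantization.
    int4 range is [-8, 7]; each output channel (row) gets its own scale."""
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)  # stored as 4-bit on device
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover approximate fp weights for the matmul."""
    return q.astype(np.float32) * scales

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 16)).astype(np.float32)  # toy weight matrix
q, s = quantize_w4(w)
w_hat = dequantize(q, s)

x = rng.standard_normal((16,)).astype(np.float16)  # 16-bit activations ("a16")
y = w_hat @ x.astype(np.float32)                   # compute with dequantized weights
```

Per-channel scaling keeps the rounding error bounded by half a quantization step per channel, which is why 4-bit weights remain usable for an 8B-parameter model on an NPU.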

Hardware Compatibility

This model is highly optimized for Advantech Edge AI platforms powered by Qualcomm processors:

  • Windows: Snapdragon® X Elite (e.g., Snapdragon®-based Microsoft Surface Pro)
  • Linux: Dragonwing® Platforms (e.g., Dragonwing® IQ-9075)

Limitations and Disclaimer

Llama 3.1 is a powerful language model but may exhibit hallucinations or generate inaccurate information.

  • Accuracy: Users should validate outputs for critical applications.
  • Usage: Please refer to the Meta Llama 3.1 Community License for usage restrictions and acceptable use policies.
  • Edge Optimization: Inference performance may vary depending on the specific hardware configuration and thermal constraints of the edge device.