Supported Models on Ascend NPU
This section lists the models supported on the Ascend NPU, grouped into large language models, multimodal language models, embedding models, reward models, and rerank models. The mainstream DeepSeek, Qwen, and GLM series are included. You are welcome to enable other models to suit your business requirements.
Large Language Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| DeepSeek V3/V3.1 | DeepSeek | β | β |
| DeepSeek-V3.2-Exp-W8A8 | DeepSeek | β | β |
| DeepSeek-R1-0528-W8A8 | DeepSeek | β | β |
| DeepSeek-V2-Lite-W8A8 | DeepSeek | β | β |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen | β | β |
| Qwen/Qwen3-32B | Qwen | β | β |
| Qwen/Qwen3-0.6B | Qwen | β | β |
| Qwen3-235B-A22B-W8A8 | Qwen | β | β |
| Qwen/Qwen3-Next-80B-A3B-Instruct | Qwen | β | β |
| Qwen3-Coder-480B-A35B-Instruct-w8a8-QuaRot | Qwen | β | β |
| Qwen/Qwen2.5-7B-Instruct | Qwen | β | β |
| QWQ-32B-W8A8 | Qwen | β | β |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | Llama | β | β |
| AI-ModelScope/Llama-3.1-8B-Instruct | Llama | β | β |
| LLM-Research/llama-2-7b | Llama | β | β |
| LLM-Research/Llama-3.2-1B-Instruct | Llama | β | β |
| mistralai/Mistral-7B-Instruct-v0.2 | Mistral | β | β |
| google/gemma-3-4b-it | Gemma | β | β |
| microsoft/Phi-4-multimodal-instruct | Phi | β | β |
| allenai/OLMoE-1B-7B-0924 | OLMoE | β | β |
| stabilityai/stablelm-2-1_6b | StableLM | β | β |
| CohereForAI/c4ai-command-r-v01 | Command-R | β | β |
| huihui-ai/grok-2 | Grok | β | β |
| ZhipuAI/chatglm2-6b | ChatGLM | β | β |
| Shanghai_AI_Laboratory/internlm2-7b | InternLM 2 | β | β |
| LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct | EXAONE 3 | β | β |
| xverse/XVERSE-MoE-A36B | XVERSE | β | β |
| HuggingFaceTB/SmolLM-1.7B | SmolLM | β | β |
| ZhipuAI/glm-4-9b-chat | GLM-4 | β | β |
| XiaomiMiMo/MiMo-7B-RL | MiMo | β | β |
| arcee-ai/AFM-4.5B-Base | Arcee AFM-4.5B | β | β |
| Howeee/persimmon-8b-chat | Persimmon | β | β |
| inclusionAI/Ling-lite | Ling | β | β |
| ibm-granite/granite-3.1-8b-instruct | Granite | β | β |
| ibm-granite/granite-3.0-3b-a800m-instruct | Granite MoE | β | β |
| AI-ModelScope/dbrx-instruct | DBRX (Databricks) | β | β |
| baichuan-inc/Baichuan2-13B-Chat | Baichuan 2 (7B, 13B) | β | β |
| baidu/ERNIE-4.5-21B-A3B-PT | ERNIE-4.5 (4.5, 4.5MoE series) | β | β |
| OpenBMB/MiniCPM3-4B | MiniCPM (v3, 4B) | β | β |
| Kimi/Kimi-K2-Thinking | Kimi | β | β |
| openai/gpt-oss-120b | GPT-OSS | β | β |
| allenai/OLMo-2-1124-7B-Instruct | OLMo | β | β |
| minimax/MiniMax-M2 | MiniMax-M2 | β | β |
| upstage/SOLAR-10.7B-Instruct-v1.0 | Solar | β | β |
| bigcode/starcoder2-7b | StarCoder2 | β | β |
| arcee-ai/Trinity-Mini | Trinity (Nano, Mini) | β | β |
Multimodal Language Models
| Models | Model Family (Variants) | A2 Supported | A3 Supported |
|---|---|---|---|
| Qwen/Qwen2.5-VL-3B-Instruct | Qwen-VL | β | β |
| Qwen/Qwen2.5-VL-72B-Instruct | Qwen-VL | β | β |
| Qwen/Qwen3-VL-30B-A3B-Instruct | Qwen-VL | β | β |
| Qwen/Qwen3-VL-8B-Instruct | Qwen-VL | β | β |
| Qwen/Qwen3-VL-4B-Instruct | Qwen-VL | β | β |
| Qwen/Qwen3-VL-235B-A22B-Instruct | Qwen-VL | β | β |
| deepseek-ai/deepseek-vl2 | DeepSeek-VL2 | β | β |
| deepseek-ai/Janus-Pro-1B | Janus-Pro (1B, 7B) | β | β |
| deepseek-ai/Janus-Pro-7B | Janus-Pro (1B, 7B) | β | β |
| openbmb/MiniCPM-V-2_6 | MiniCPM-V / MiniCPM-o | β | β |
| openbmb/MiniCPM-o-2_6 | MiniCPM-V / MiniCPM-o | β | β |
| google/gemma-3-4b-it | Gemma 3 (Multimodal) | β | β |
| mistralai/Mistral-Small-3.1-24B-Instruct-2503 | Mistral-Small-3.1-24B | β | β |
| microsoft/Phi-4-multimodal-instruct | Phi-4-multimodal-instruct | β | β |
| XiaomiMiMo/MiMo-VL-7B-RL | MiMo-VL (7B) | β | β |
| AI-ModelScope/llava-v1.6-34b | LLaVA (v1.5 & v1.6) | β | β |
| lmms-lab/llava-next-72b | LLaVA-NeXT (8B, 72B) | β | β |
| lmms-lab/llava-onevision-qwen2-7b-ov | LLaVA-OneVision | β | β |
| Kimi/Kimi-VL-A3B-Instruct | Kimi-VL (A3B) | β | β |
| ZhipuAI/GLM-4.5V | GLM-4.5V (106B) | β | β |
| LLM-Research/Llama-3.2-11B-Vision-Instruct | Llama 3.2 Vision (11B) | β | β |
| rednote-hilab/dots.ocr | DotsVLM-OCR | β | β |
Embedding Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| intfloat/e5-mistral-7b-instruct | E5 (Llama/Mistral based) | β | β |
| iic/gte_Qwen2-1.5B-instruct | GTE-Qwen2 | β | β |
| Qwen/Qwen3-Embedding-8B | Qwen3-Embedding | β | β |
| Alibaba-NLP/gme-Qwen2-VL-2B-Instruct | GME (Multimodal) | β | β |
| AI-ModelScope/clip-vit-large-patch14-336 | CLIP | β | β |
| BAAI/bge-large-en-v1.5 | BGE | β | β |
Reward Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| Skywork/Skywork-Reward-Llama-3.1-8B-v0.2 | Llama3.1 Reward | β | β |
| Shanghai_AI_Laboratory/internlm2-7b-reward | InternLM 2 Reward | β | β |
| Qwen/Qwen2.5-Math-RM-72B | Qwen2.5 Reward - Math | β | β |
| Howeee/Qwen2.5-1.5B-apeach | Qwen2.5 Reward - Sequence | β | β |
| AI-ModelScope/Skywork-Reward-Gemma-2-27B-v0.2 | Gemma 2-27B Reward | β | β |
Rerank Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| BAAI/bge-reranker-v2-m3 | BGE-Reranker | β | β |