File size: 3,953 Bytes
a227c91 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 | # MindSpore Models
## Introduction
MindSpore is a high-performance AI framework optimized for Ascend NPUs. This doc guides users to run MindSpore models in SGLang.
## Requirements
MindSpore currently only supports Ascend NPU devices. Users need to first install Ascend CANN software packages.
The CANN software packages can be downloaded from the [Ascend Official Website](https://www.hiascend.com). The recommended version is 8.3.RC2.
## Supported Models
Currently, the following models are supported:
- **Qwen3**: Dense and MoE models
- **DeepSeek V3/R1**
- *More models coming soon...*
## Installation
> **Note**: Currently, MindSpore models are provided by an independent package `sgl-mindspore`. Support for MindSpore is built upon current SGLang support for Ascend NPU platform. Please first [install SGLang for Ascend NPU](ascend_npu.md) and then install `sgl-mindspore`:
```shell
git clone https://github.com/mindspore-lab/sgl-mindspore.git
cd sgl-mindspore
pip install -e .
```
## Run Model
Current SGLang-MindSpore supports Qwen3 and DeepSeek V3/R1 models. This doc uses Qwen3-8B as an example.
### Offline infer
Use the following script for offline infer:
```python
import sglang as sgl
# Initialize the engine with MindSpore backend
llm = sgl.Engine(
model_path="/path/to/your/model", # Local model path
device="npu", # Use NPU device
model_impl="mindspore", # MindSpore implementation
attention_backend="ascend", # Attention backend
tp_size=1, # Tensor parallelism size
dp_size=1 # Data parallelism size
)
# Generate text
prompts = [
"Hello, my name is",
"The capital of France is",
"The future of AI is"
]
sampling_params = {"temperature": 0, "top_p": 0.9}
outputs = llm.generate(prompts, sampling_params)
for prompt, output in zip(prompts, outputs):
print(f"Prompt: {prompt}")
print(f"Generated: {output['text']}")
print("---")
```
### Start server
Launch a server with MindSpore backend:
```bash
# Basic server startup
python3 -m sglang.launch_server \
--model-path /path/to/your/model \
--host 0.0.0.0 \
--device npu \
--model-impl mindspore \
--attention-backend ascend \
--tp-size 1 \
--dp-size 1
```
For distributed server with multiple nodes:
```bash
# Multi-node distributed server
python3 -m sglang.launch_server \
--model-path /path/to/your/model \
--host 0.0.0.0 \
--device npu \
--model-impl mindspore \
--attention-backend ascend \
--dist-init-addr 127.0.0.1:29500 \
--nnodes 2 \
--node-rank 0 \
--tp-size 4 \
--dp-size 2
```
## Troubleshooting
#### Debug Mode
Enable sglang debug logging by log-level argument.
```bash
python3 -m sglang.launch_server \
--model-path /path/to/your/model \
--host 0.0.0.0 \
--device npu \
--model-impl mindspore \
--attention-backend ascend \
--log-level DEBUG
```
Enable mindspore info and debug logging by setting environments.
```bash
export GLOG_v=1 # INFO
export GLOG_v=0 # DEBUG
```
#### Explicitly select devices
Use the following environment variable to explicitly select the devices to use.
```shell
export ASCEND_RT_VISIBLE_DEVICES=4,5,6,7 # to set device
```
#### Some communication environment issues
In case of some environment with special communication environment, users need set some environment variables.
```shell
export MS_ENABLE_LCCL=off # current not support LCCL communication mode in SGLang-MindSpore
```
#### Some dependencies of protobuf
In case of some environment with special protobuf version, users need set some environment variables to avoid binary version mismatch.
```shell
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python # to avoid protobuf binary version mismatch
```
## Support
For MindSpore-specific issues:
- Refer to the [MindSpore documentation](https://www.mindspore.cn/)
|