[Docs] Add LightLLM deployment example

#57
by FubaoSu - opened
Files changed (1)
  1. README.md +23 -0
README.md CHANGED
@@ -156,6 +156,29 @@ python3 -m sglang.launch_server \
 ```
 For Blackwell GPUs, include `--attention-backend triton --speculative-draft-attention-backend triton` in your SGLang launch command.
 
+### LightLLM
+```shell
+# Install LightLLM (using the Docker image is recommended).
+docker pull jyily/lightllm:cu129-78cc66a
+
+# Or install from source: https://github.com/ModelTC/LightLLM
+pip install git+https://github.com/ModelTC/LightLLM
+
+LIGHTLLM_TRITON_AUTOTUNE_LEVEL=1 LOADWORKER=18 \
+python3 -m lightllm.server.api_server \
+--model_dir /path/to/GLM-4.7-Flash/ \
+--tp 1 \
+--max_req_total_len 202752 \
+--chunked_prefill_size 8192 \
+--llm_prefill_att_backend fa3 \
+--llm_decode_att_backend fa3 \
+--graph_max_batch_size 512 \
+--tool_call_parser glm47 \
+--reasoning_parser glm45 \
+--host 0.0.0.0 \
+--port 8000
+```
+
 ## Citation
 
 If you find our work useful in your research, please consider citing the following paper:
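LightLLM's `api_server` exposes an OpenAI-compatible `/v1/chat/completions` endpoint on the configured `--host`/`--port`. As a quick sanity check after launching the server above, a request can be sent with only the standard library. This is a minimal sketch, not part of the PR: the base URL `http://localhost:8000` and the `model` field value are illustrative assumptions, and the actual sending step is commented out because it requires a running server.

```python
import json
from urllib import request

def build_chat_request(base_url, prompt, model="GLM-4.7-Flash"):
    """Build an OpenAI-style chat-completions request.

    base_url and model are assumptions for illustration; adjust them
    to match your deployment.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the server from the diff running on port 8000:
# resp = request.urlopen(build_chat_request("http://localhost:8000", "Hello"))
# print(json.loads(resp.read())["choices"][0]["message"]["content"])
```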