Commit 7acd44d (verified) by RangiLyu · 1 parent: 15c08d2

Update README.md

Files changed (1)
  1. README.md +12 -18
README.md CHANGED
@@ -201,38 +201,32 @@ You can utilize one of the following LLM inference frameworks to create an OpenA

  #### [lmdeploy(>=0.9.2)](https://github.com/InternLM/lmdeploy)

- ```bash
- lmdeploy serve api_server internlm/Intern-S1 --reasoning-parser intern-s1 --tool-call-parser intern-s1 --tp 8
+ ```
+ lmdeploy serve api_server internlm/Intern-S1-FP8 --reasoning-parser intern-s1 --tool-call-parser intern-s1 --tp 4
  ```

  #### [vllm](https://github.com/vllm-project/vllm)

  ```bash
- vllm serve internlm/Intern-S1 --tensor-parallel-size 8 --trust-remote-code
+ vllm serve internlm/Intern-S1-FP8 --tensor-parallel-size 4 --trust-remote-code
  ```

  #### [sglang](https://github.com/sgl-project/sglang)

+ Supporting Intern-S1 with SGLang is still in progress. Please refer to this [PR](https://github.com/sgl-project/sglang/pull/8350).
+
  ```bash
- python3 -m sglang.launch_server \
- --model-path internlm/Intern-S1 \
+ CUDA_VISIBLE_DEVICES=0,1,2,3 \
+ python3 -m sglang.launch_server \
+ --model-path internlm/Intern-S1-FP8 \
  --trust-remote-code \
- --tp 8 \
+ --tp 4 \
+ --port 8001 \
+ --mem-fraction-static 0.85 \
+ --enable-multimodal \
  --grammar-backend none
  ```

- #### ollama for local deployment:
-
- ```bash
- # install ollama
- curl -fsSL https://ollama.com/install.sh | sh
- # fetch model
- ollama pull internlm/interns1
- # run model
- ollama run internlm/interns1
- # then use openai client to call on http://localhost:11434/v1
- ```
-
  ## Advanced Usage

  ### Tool Calling