vibestudio-HQ committed on
Commit 57c0d1e · verified · 1 parent: d5e8c7a

Update README.md

Files changed (1): README.md (+60, -0)
README.md CHANGED

@@ -160,7 +160,67 @@ We, over-caffeinated researchers at VibeStud.io wanted to create a 50% pruned ver

| Clinical Knowledge | 92.83% | 85.66% | -7.17% | ⚠️ Moderate Drop |

---
## **Deployment with Python**

It is recommended to use a virtual environment (such as **venv**, **conda**, or **uv**) to avoid dependency conflicts.

We recommend installing SGLang in a fresh Python environment:

```shell
git clone -b v0.5.4.post1 https://github.com/sgl-project/sglang.git
cd sglang

# Install the Python packages
pip install --upgrade pip
pip install -e "python"
```

Run the following command to start the SGLang server. SGLang will automatically download and cache the MiniMax-M2 model from Hugging Face.

**4-GPU deployment command:**

```shell
python -m sglang.launch_server \
  --model-path MiniMaxAI/MiniMax-M2 \
  --tp-size 4 \
  --tool-call-parser minimax-m2 \
  --reasoning-parser minimax-append-think \
  --host 0.0.0.0 \
  --trust-remote-code \
  --port 8000 \
  --mem-fraction-static 0.85
```
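
Once the server is up, a quick sanity check is the OpenAI-compatible `GET /v1/models` endpoint, which lists the model(s) being served. The sketch below parses such a listing; the embedded JSON is an illustrative example of the response shape, not actual server output, and the `served_model_ids` helper is ours, not part of SGLang.

```python
import json

# Illustrative /v1/models response (assumed OpenAI-compatible shape;
# real output comes from: curl http://localhost:8000/v1/models).
sample_response = """
{
  "object": "list",
  "data": [
    {"id": "MiniMaxAI/MiniMax-M2", "object": "model", "owned_by": "sglang"}
  ]
}
"""

def served_model_ids(payload: str) -> list[str]:
    """Extract model ids from an OpenAI-style model-list response."""
    body = json.loads(payload)
    return [entry["id"] for entry in body.get("data", [])]

print(served_model_ids(sample_response))  # ['MiniMaxAI/MiniMax-M2']
```

The `id` returned here is the value to pass as `"model"` in chat-completion requests.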

**8-GPU deployment command:**

```shell
python -m sglang.launch_server \
  --model-path MiniMaxAI/MiniMax-M2 \
  --tp-size 8 \
  --ep-size 8 \
  --tool-call-parser minimax-m2 \
  --reasoning-parser minimax-append-think \
  --host 0.0.0.0 \
  --trust-remote-code \
  --port 8000 \
  --mem-fraction-static 0.85
```
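
Comparing the two commands, the only substantive difference is that the 8-GPU variant raises `--tp-size` and adds `--ep-size 8` (expert-parallel sharding). A small sketch to make that delta explicit; the `build_launch_cmd` helper is illustrative, not part of SGLang:

```python
def build_launch_cmd(num_gpus: int) -> list[str]:
    """Assemble the launch_server argv used above for a given GPU count."""
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", "MiniMaxAI/MiniMax-M2",
        "--tp-size", str(num_gpus),
    ]
    if num_gpus >= 8:
        # The 8-GPU command additionally shards experts across GPUs.
        cmd += ["--ep-size", str(num_gpus)]
    cmd += [
        "--tool-call-parser", "minimax-m2",
        "--reasoning-parser", "minimax-append-think",
        "--host", "0.0.0.0",
        "--trust-remote-code",
        "--port", "8000",
        "--mem-fraction-static", "0.85",
    ]
    return cmd

assert "--ep-size" not in build_launch_cmd(4)
assert "--ep-size" in build_launch_cmd(8)
```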

## **Testing the Deployment**

After startup, you can test the SGLang OpenAI-compatible API with the following command:

```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniMaxAI/MiniMax-M2",
    "messages": [
      {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
      {"role": "user", "content": [{"type": "text", "text": "Who won the world series in 2020?"}]}
    ]
  }'
```
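
The same request can be issued from Python with only the standard library. This is a sketch, not an official client: the payload mirrors the curl call above, and the network call is guarded so the snippet degrades gracefully when no server is listening on `localhost:8000`.

```python
import json
import urllib.error
import urllib.request

# Payload mirrors the curl example above.
payload = {
    "model": "MiniMaxAI/MiniMax-M2",
    "messages": [
        {"role": "system",
         "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {"role": "user",
         "content": [{"type": "text", "text": "Who won the world series in 2020?"}]},
    ],
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.load(resp)
        # Standard OpenAI-style response: first choice's message content.
        print(reply["choices"][0]["message"]["content"])
except (urllib.error.URLError, OSError) as exc:
    print(f"Server not reachable: {exc}")
```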

## Benchmarks

Coming soon.