Hyper-AI commited on
Commit
10dc55d
·
verified ·
1 Parent(s): 38a7581

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +15 -1
README.md CHANGED
@@ -16,8 +16,22 @@ tags:
16
  **59G -> 32G memory decrease**
17
 
18
  **speedup 30%**
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
19
 
20
- **vllm serve can run**
21
 
22
  <div align="center">
23
  <img src=https://ai.google.dev/gemma/images/gemma4_banner.png>
 
16
  **59G -> 32G memory decrease**
17
 
18
  **speedup 30%**
19
+
20
+ **Start the vLLM server**
21
+
22
+ vllm serve Hyper-AI/gemma-4-31B-it-fp8 --max-model-len 32768
23
+
24
+ **To enable thinking/reasoning and tool calling:**
25
+
26
+ vllm serve Hyper-AI/gemma-4-31B-it-fp8 \
27
+ --max-model-len 32768 \
28
+ --reasoning-parser gemma4 \
29
+ --tool-call-parser gemma4 \
30
+ --enable-auto-tool-choice
31
+
32
+
33
+
34
 
 
35
 
36
  <div align="center">
37
  <img src=https://ai.google.dev/gemma/images/gemma4_banner.png>