WonsukYangTL committed on
Commit 4249bed · verified · 1 Parent(s): 63b023e

Update README.md

Files changed (1):
  1. README.md +2 -29
README.md CHANGED

````diff
@@ -86,36 +86,9 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
 ```
 
-### vLLM Deployment
-
-```bash
-vllm serve trillionlabs/Tri-21B-Think \
-  --dtype bfloat16 \
-  --max-model-len 32768 \
-  --tensor-parallel-size 8 \
-  --reasoning-parser qwen3 \
-  --enable-auto-tool-choice \
-  --tool-call-parser hermes
-```
-
-#### Long Context (up to 262K) with YaRN
-
-```bash
-vllm serve trillionlabs/Tri-21B-Think \
-  --dtype bfloat16 \
-  --max-model-len 262144 \
-  --tensor-parallel-size 8 \
-  --reasoning-parser qwen3 \
-  --enable-auto-tool-choice \
-  --tool-call-parser hermes \
-  --hf-overrides '{"rope_scaling": {"rope_type":"yarn","factor":8.0,"original_max_position_embeddings":32768}}'
-```
-
-### SGLang Deployment
-
-```bash
-python3 -m sglang.launch_server --model-path trillionlabs/Tri-21B-Think --dtype bfloat16 --context-length 32768
-```
+### vLLM & SGLang Deployment
+
+vLLM and SGLang support for Trillion Model is on the way. Stay tuned!
 
 
 ## Fine-tuning Notes
````
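A note on the removed long-context command: its `--max-model-len 262144` follows directly from the YaRN override it passes, since the scaled context window is the original position-embedding limit multiplied by the scaling factor. A minimal sketch of that arithmetic, with the values copied from the removed `--hf-overrides` JSON:

```python
# YaRN rope scaling values from the removed vLLM command. The effective
# context window is original_max_position_embeddings * factor.
rope_scaling = {
    "rope_type": "yarn",
    "factor": 8.0,
    "original_max_position_embeddings": 32768,
}

effective_len = int(
    rope_scaling["original_max_position_embeddings"] * rope_scaling["factor"]
)
print(effective_len)  # 262144, matching the removed --max-model-len value
```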