Update README.md
README.md

@@ -86,36 +86,9 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 print(response)
 ```
 
-### vLLM Deployment
-
-```bash
-vllm serve trillionlabs/Tri-21B-Think \
-  --dtype bfloat16 \
-  --max-model-len 32768 \
-  --tensor-parallel-size 8 \
-  --reasoning-parser qwen3 \
-  --enable-auto-tool-choice \
-  --tool-call-parser hermes
-```
-
-#### Long Context (up to 262K) with YaRN
-
-```bash
-vllm serve trillionlabs/Tri-21B-Think \
-  --dtype bfloat16 \
-  --max-model-len 262144 \
-  --tensor-parallel-size 8 \
-  --reasoning-parser qwen3 \
-  --enable-auto-tool-choice \
-  --tool-call-parser hermes \
-  --hf-overrides '{"rope_scaling": {"rope_type":"yarn","factor":8.0,"original_max_position_embeddings":32768}}'
-```
+### vLLM & SGLang Deployment
 
-
-
-```bash
-python3 -m sglang.launch_server --model-path trillionlabs/Tri-21B-Think --dtype bfloat16 --context-length 32768
-```
+vLLM and SGLang support for Trillion Model is on the way. Stay tuned!
 
 
 ## Fine-tuning Notes
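As context for the removed long-context command: the advertised 262K window follows directly from the YaRN `rope_scaling` parameters in the `--hf-overrides` flag, since the scaling factor multiplies the original position-embedding length. A minimal sketch of that arithmetic (plain Python, not part of the README itself):

```python
# rope_scaling override taken from the removed `--hf-overrides` flag
override = {"rope_scaling": {"rope_type": "yarn", "factor": 8.0,
                             "original_max_position_embeddings": 32768}}

scaled = override["rope_scaling"]
# YaRN stretches the usable context window by `factor` over the
# model's original pretraining length: 8.0 * 32768 = 262144
max_len = int(scaled["factor"] * scaled["original_max_position_embeddings"])
print(max_len)  # 262144, matching --max-model-len in the removed command
```

This is why the long-context invocation pairs `--max-model-len 262144` with a factor of 8.0 over a 32768-token base.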