BubbleQ committed on
Commit 78ab107 · verified · 1 parent: a9196a0

Update README.md

Files changed (1): README.md (+5 −7)
README.md CHANGED
@@ -13,9 +13,7 @@ library_name: transformers
 <div align="center">
 <img src="figures/klear-logo-02.png" width="500"/>
 <p>
-🤗 <a href="https://huggingface.co/Kwai-Klear">Hugging Face</a> | 📑 <a href="">Technique Report</a>
-<br>
-🖥️ <a href="https://kml-dtmachine-15498-prod-1.kmlhb2az1l3-2.corp.kuaishou.com">Chat with Klear</a> | 💬 <a href="https://github.com/Kwai-Klear">Issues & Discussions</a>
+🤗 <a href="https://huggingface.co/Kwai-Klear">Hugging Face</a> | 📑 <a href="">Technical Report</a> | 💬 <a href="https://github.com/Kwai-Klear">Issues & Discussions</a>
 </p>
 </div>

@@ -181,13 +179,13 @@ print(result)

 ### Inference with vLLM

-[vLLM](https://github.com/vllm-project/vllm) is a high-speed and memery-efficicent inference framework. We provide our own forked version of [vLLM](https://github.com/vllm-project/vllm) here.
+[vLLM](https://github.com/vllm-project/vllm) is a high-speed, memory-efficient inference framework. We provide our own fork of [vLLM](https://github.com/Kwai-Klear/vllm).

 ```shell
-git clone
+git clone https://github.com/Kwai-Klear/vllm.git
 cd vllm
-pip install
-vllm serve Klear-46B-A2.5B-inst --port 8000 --tensor-parallel-size 8 --trust-remote-code
+VLLM_USE_PRECOMPILED=1 pip install --editable .
+vllm serve /path/to/Klear-Inst --port 8000 --tensor-parallel-size 8 --trust-remote-code
 ```

 An OpenAI-compatible API will be available at `http://localhost:8000/v1`.
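Since the `vllm serve` command above exposes the standard OpenAI chat-completions protocol, any OpenAI-compatible client can talk to it. A minimal standard-library sketch (the model path, URL, and sampling parameters are placeholder assumptions — the `model` field must match the path you passed to `vllm serve`):

```python
import json
from urllib import request

# Endpoint exposed by `vllm serve` (assumption: default host/port from above).
API_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """JSON body for a single-turn /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def chat(model: str, prompt: str) -> str:
    """POST the request to the running server and return the reply text."""
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]


# With the server running:
#     reply = chat("/path/to/Klear-Inst", "Hello!")
```
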