BubbleQ committed on
Commit 78ab107 · verified · 1 parent: a9196a0

Update README.md

Files changed (1): README.md (+5 −7)
README.md CHANGED
@@ -13,9 +13,7 @@ library_name: transformers
 <div align="center">
 <img src="figures/klear-logo-02.png" width="500"/>
 <p>
-🤗 <a href="https://huggingface.co/Kwai-Klear">Hugging Face</a> | 📑 <a href="">Technique Report</a>
-<br>
-🖥️ <a href="https://kml-dtmachine-15498-prod-1.kmlhb2az1l3-2.corp.kuaishou.com">Chat with Klear</a> | 💬 <a href="https://github.com/Kwai-Klear">Issues & Discussions</a>
+🤗 <a href="https://huggingface.co/Kwai-Klear">Hugging Face</a> | 📑 <a href="">Technical Report</a> | 💬 <a href="https://github.com/Kwai-Klear">Issues & Discussions</a>
 </p>
 </div>

@@ -181,13 +179,13 @@ print(result)

 ### Inference with vLLM

-[vLLM](https://github.com/vllm-project/vllm) is a high-speed and memery-efficicent inference framework. We provide our own forked version of [vLLM](https://github.com/vllm-project/vllm) here.
+[vLLM](https://github.com/vllm-project/vllm) is a high-speed, memory-efficient inference framework. We provide our own fork of [vLLM](https://github.com/Kwai-Klear/vllm).

 ```shell
-git clone
+git clone https://github.com/Kwai-Klear/vllm.git
 cd vllm
-pip install
-vllm serve Klear-46B-A2.5B-inst --port 8000 --tensor-parallel-size 8 --trust-remote-code
+VLLM_USE_PRECOMPILED=1 pip install --editable .
+vllm serve /path/to/Klear-Inst --port 8000 --tensor-parallel-size 8 --trust-remote-code
 ```

 An OpenAI-compatible API will be available at `http://localhost:8000/v1`.
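Since the `vllm serve` command above exposes the standard OpenAI chat-completions protocol, any OpenAI-compatible client can talk to it. A minimal standard-library sketch (the model path, URL, and sampling parameters are placeholder assumptions — the `model` field must match the path you passed to `vllm serve`):

```python
import json
from urllib import request

# Endpoint exposed by `vllm serve` (assumption: default host/port from above).
API_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """JSON body for a single-turn /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def chat(model: str, prompt: str) -> str:
    """POST the request to the running server and return the reply text."""
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]


# With the server running:
#     reply = chat("/path/to/Klear-Inst", "Hello!")
```
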