---
license: apache-2.0
---
## chatglm3-ggml
This repo contains GGML format model files for chatglm3-6B.
### Example code
#### Install packages
```bash
pip install "xinference[ggml]>=0.4.3"
```
If you want to run with GPU acceleration, refer to [installation](https://github.com/xorbitsai/inference#installation).
#### Start a local instance of Xinference
```bash
xinference -p 9997
```
#### Launch and inference
```python
from xinference.client import Client

client = Client("http://localhost:9997")
model_uid = client.launch_model(
    model_name="chatglm3",
    model_format="ggmlv3",
    model_size_in_billions=6,
    quantization="q4_0",
)
model = client.get_model(model_uid)

chat_history = []
prompt = "最大的动物是什么?"  # "What is the largest animal?"
model.chat(
    prompt,
    chat_history,
    generate_config={"max_tokens": 1024}
)
```
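The `chat` call returns a completion in the OpenAI-style dictionary format, so the assistant's reply can be read out of `choices` and appended to `chat_history` for the next turn. A minimal sketch, assuming that response shape (the payload below is illustrative, not real model output):

```python
# Hypothetical OpenAI-style response payload; actual model output will differ.
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "The blue whale."}}
    ]
}

# Extract the assistant's reply from the first choice.
reply = response["choices"][0]["message"]["content"]

# Carry the exchange forward so the next chat() call sees the context.
chat_history = []
prompt = "What is the largest animal?"
chat_history.append({"role": "user", "content": prompt})
chat_history.append({"role": "assistant", "content": reply})
print(reply)  # The blue whale.
```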
### More information
[Xinference](https://github.com/xorbitsai/inference) lets you replace OpenAI GPT with another LLM in your app by changing a single line of code, giving you the freedom to use any LLM you need. With Xinference, you can run inference with any open-source language, speech recognition, or multimodal model, whether in the cloud, on-premises, or even on your laptop.
<i><a href="https://join.slack.com/t/xorbitsio/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA">👉 Join our Slack community!</a></i>