Update README.md
README.md
CHANGED
@@ -95,6 +95,10 @@ for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
 
 ## Deployment
 
+### llama.cpp
+
+[internlm/internlm2_5-20b-chat-gguf](https://huggingface.co/internlm/internlm2_5-20b-chat-gguf) offers `internlm2_5-20b-chat` models in GGUF format in both half precision and various low-bit quantized versions, including `q5_0`, `q5_k_m`, `q6_k`, and `q8_0`.
+
 ### LMDeploy
 
 LMDeploy is a toolkit for compressing, deploying, and serving LLM, developed by the MMRazor and MMDeploy teams.
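
The added llama.cpp paragraph lists several GGUF variants without saying how to choose between them. As a rough aid, the sketch below estimates the weight size of each variant and picks the highest-precision one that fits a memory budget. It is only an approximation under stated assumptions: the parameter count is taken as 20e9 from the "20b" in the model name, the `q5_k_m` figure is an assumed average because that variant mixes tensor types, and the helper names `estimated_gib` and `largest_fitting` are hypothetical, not part of llama.cpp or the model repository. Actual file sizes should be checked on the repository page.

```python
from typing import Optional

# Approximate bits per weight for the formats named in the README change.
# fp16, q8_0, q5_0 and q6_k follow from the ggml block layouts; the q5_k_m
# value is an assumed average, since that variant mixes tensor types.
BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
    "q6_k": 6.5625,  # 210 bytes per 256-weight super-block
    "q5_k_m": 5.7,   # assumption: mostly q5_K with some q6_K tensors
    "q5_0": 5.5,     # 32 5-bit weights plus one fp16 scale per block
}

# Assumption: parameter count read off the "20b" in the model name.
ASSUMED_PARAMS = 20e9


def estimated_gib(quant: str, n_params: float = ASSUMED_PARAMS) -> float:
    """Rough weight size in GiB, ignoring KV cache and runtime overhead."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 2**30


def largest_fitting(budget_gib: float) -> Optional[str]:
    """Return the highest-precision variant whose weights fit the budget."""
    by_precision = sorted(BITS_PER_WEIGHT, key=BITS_PER_WEIGHT.get, reverse=True)
    for quant in by_precision:
        if estimated_gib(quant) <= budget_gib:
            return quant
    return None  # nothing fits; a smaller model or offloading is needed
```

Calling `largest_fitting(16)` walks the variants from `fp16` downward and returns the first whose estimated weights fit in 16 GiB. The estimate covers weights only, so some headroom should be left for the context cache and runtime buffers.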