Update readme, support llama.cpp
README.md
@@ -383,6 +383,11 @@ print(res)
 
 Please look at [GitHub](https://github.com/OpenBMB/MiniCPM-V) for more details about usage.
 
+
+## Inference with llama.cpp<a id="llamacpp"></a>
+MiniCPM-Llama3-V 2.5 can now run with llama.cpp! See our fork of [llama.cpp](https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv) for more details.
+
+
 ## Int4 quantized version
 Download the int4 quantized version for lower GPU memory (8GB) usage: [MiniCPM-Llama3-V-2_5-int4](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-int4).
 
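For readers following the llama.cpp pointer, a minimal sketch of what an invocation might look like. This is an assumption-laden illustration, not the fork's documented interface: the binary name `minicpmv-cli` and the `-m` / `--mmproj` / `--image` / `-p` flags are guessed from llama.cpp's llava-cli convention, and the GGUF filenames are placeholders — consult the fork's `examples/minicpmv` README for the actual commands.

```shell
# Hypothetical invocation sketch; binary name, flags, and filenames are
# assumptions modeled on llama.cpp's llava-cli pattern, not confirmed
# against the OpenBMB fork.
MODEL=ggml-model-Q4_K_M.gguf      # placeholder: quantized LLM weights
MMPROJ=mmproj-model-f16.gguf      # placeholder: vision projector weights
CMD="./minicpmv-cli -m $MODEL --mmproj $MMPROJ --image demo.jpg -p 'What is in the image?'"
# Print the command instead of executing it, since running it requires a
# checkout and build of the minicpm-v2.5 branch of the fork:
echo "$CMD"
```

The two-file split (`-m` plus `--mmproj`) mirrors how llama.cpp handles other vision-language models, where the language model and the vision projector are exported as separate GGUF files.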