## Run with LlamaEdge

- LlamaEdge version: [v0.8.2](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.8.2) and above

- Context size: `384`

- Run as LlamaEdge service

  ```bash
  wasmedge --dir .:. --nn-preload default:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
    llama-api-server.wasm \
    --prompt-template llama-2-chat \
    --ctx-size 384 \
    --model-name all-MiniLM-L6-v2
  ```
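- Query the running service (a minimal sketch; assumes the server listens on the LlamaEdge default address `localhost:8080` and exposes the OpenAI-compatible `/v1/embeddings` endpoint — adjust the address if your server is configured differently)

  ```bash
  # Request body: the model name matches the --model-name passed to
  # llama-api-server.wasm above.
  REQUEST_BODY='{"model": "all-MiniLM-L6-v2", "input": ["Hello, world!"]}'

  # POST to the embeddings endpoint; the response is a JSON object
  # containing the embedding vector for each input string.
  curl -s -X POST http://localhost:8080/v1/embeddings \
    -H "Content-Type: application/json" \
    -d "$REQUEST_BODY"
  ```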
## Quantized GGUF Models

| Name | Quant method | Bits | Size | Use case |