Xin Liu committed
Commit 6bdae24 · 1 Parent(s): 16fdeb9

Signed-off-by: Xin Liu <sam@secondstate.io>

Files changed (1): README.md (+34 −1)
README.md CHANGED

@@ -33,6 +33,8 @@ tags:
 
 - Prompt template
 
+  - Prompt type: `octopus`
+
   - Prompt string
 
    ```console
@@ -41,6 +43,35 @@ tags:
 
 - Context size: `2048`
 
+- Run as LlamaEdge service
+
+  ```bash
+  wasmedge --dir .:. --nn-preload default:GGML:AUTO:Octopus-v2-Q5_K_M.gguf \
+    llama-api-server.wasm \
+    --prompt-template octopus \
+    --ctx-size 2048 \
+    --model-name octopus-v2
+  ```
+
+  Example of a user request in JSON format:
+
+  ```json
+  {
+    "messages": [
+      {
+        "role": "system",
+        "content": "Below is the query from the users, please call the correct function and generate the parameters to call the function."
+      },
+      {
+        "role": "user",
+        "content": "Take a selfie for me with front camera"
+      }
+    ],
+    "model": "octopus-v2",
+    "stream": false
+  }
+  ```
+
 ## Quantized GGUF Models
 
 | Name | Quant method | Bits | Size | Use case |
@@ -57,4 +88,6 @@ tags:
 | [Octopus-v2-Q5_K_S.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q5_K_S.gguf) | Q5_K_S | 5 | 1.8 GB | large, low quality loss - recommended |
 | [Octopus-v2-Q6_K.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q6_K.gguf) | Q6_K | 6 | 2.06 GB | very large, extremely low quality loss |
 | [Octopus-v2-Q8_0.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q8_0.gguf) | Q8_0 | 8 | 2.67 GB | very large, extremely low quality loss - not recommended |
-| [Octopus-v2-f16.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-f16.gguf) | f16 | 16 | 5.02 GB | |
+| [Octopus-v2-f16.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-f16.gguf) | f16 | 16 | 10 GB | |
+
+*Quantized with llama.cpp b2589*
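As a sanity check on the request format in the diff, here is a minimal Python sketch that builds the same chat payload and posts it to the running server. The `http://localhost:8080/v1/chat/completions` URL is an assumption (the OpenAI-compatible route that LlamaEdge's llama-api-server typically serves), not part of this commit; adjust it to match your deployment.

```python
import json
import urllib.request

def build_request(user_query: str) -> dict:
    """Build the chat payload shown in the README diff."""
    return {
        "messages": [
            {
                "role": "system",
                "content": (
                    "Below is the query from the users, please call the correct "
                    "function and generate the parameters to call the function."
                ),
            },
            {"role": "user", "content": user_query},
        ],
        "model": "octopus-v2",
        "stream": False,
    }

def post_chat(payload: dict,
              url: str = "http://localhost:8080/v1/chat/completions") -> dict:
    # Assumed endpoint: llama-api-server's OpenAI-compatible chat route.
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Print the payload; call post_chat(payload) once the service is running.
    payload = build_request("Take a selfie for me with front camera")
    print(json.dumps(payload, indent=2))
```

Note that `"stream": false` in the JSON example maps to Python's `False`; the payload above serializes to exactly the request body shown in the diff.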