second-state/Octopus-v2-GGUF

Text Generation

function calling

on-device language model


Xin Liu committed on Apr 11, 2024

Commit 6bdae24 · 1 Parent(s): 16fdeb9

Update

Signed-off-by: Xin Liu <sam@secondstate.io>

Files changed (1)

README.md +34 -1


@@ -33,6 +33,8 @@ tags:
   - Prompt template
     - Prompt string
       ```console
@@ -41,6 +43,35 @@ tags:
 - Context size: `2048`
 ## Quantized GGUF Models
 | Name | Quant method | Bits | Size | Use case |
@@ -57,4 +88,6 @@ tags:
 | [Octopus-v2-Q5_K_S.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q5_K_S.gguf) | Q5_K_S | 5 | 1.8 GB| large, low quality loss - recommended |
 | [Octopus-v2-Q6_K.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q6_K.gguf)     | Q6_K   | 6 | 2.06 GB| very large, extremely low quality loss |
 | [Octopus-v2-Q8_0.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q8_0.gguf)     | Q8_0   | 8 | 2.67 GB| very large, extremely low quality loss - not recommended |
-| [Octopus-v2-f16.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-f16.gguf)     | f16   | 16 | 5.02 GB|  |

   - Prompt template
+    - Prompt type: `octopus`
     - Prompt string
       ```console
 - Context size: `2048`
+- Run as LlamaEdge service
+  ```bash
+  wasmedge --dir .:. --nn-preload default:GGML:AUTO:Octopus-v2-Q5_K_M.gguf \
+    llama-api-server.wasm \
+    --prompt-template octopus \
+    --ctx-size 2048 \
+    --model-name octopus-v2
+  ```
+  Example of a user request in json format:
+  ```json
+  {
+      "messages": [
+          {
+              "role": "system",
+              "content": "Below is the query from the users, please call the correct function and generate the parameters to call the function."
+          },
+          {
+              "role": "user",
+              "content": "Take a selfie for me with front camera"
+          }
+      ],
+      "model": "octopus-v2",
+      "stream": false
+  }
+  ```
 ## Quantized GGUF Models
 | Name | Quant method | Bits | Size | Use case |
 | [Octopus-v2-Q5_K_S.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q5_K_S.gguf) | Q5_K_S | 5 | 1.8 GB| large, low quality loss - recommended |
 | [Octopus-v2-Q6_K.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q6_K.gguf)     | Q6_K   | 6 | 2.06 GB| very large, extremely low quality loss |
 | [Octopus-v2-Q8_0.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-Q8_0.gguf)     | Q8_0   | 8 | 2.67 GB| very large, extremely low quality loss - not recommended |
+| [Octopus-v2-f16.gguf](https://huggingface.co/second-state/Octopus-v2-GGUF/blob/main/Octopus-v2-f16.gguf)     | f16   | 16 | 10 GB|  |
+*Quantized with llama.cpp b2589*
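
The JSON request body added to the README in this commit could be sent to the running LlamaEdge service roughly as follows. This is a minimal sketch, not part of the commit: the URL (`http://localhost:8080/v1/chat/completions`) is an assumption, since the README diff does not state the host, port, or path, and `send_request` is a hypothetical helper name.

```bash
# Assumed endpoint: the README diff does not give host, port, or path.
# Override by exporting API_URL before sourcing this snippet.
API_URL="${API_URL:-http://localhost:8080/v1/chat/completions}"

# Request body copied from the README example in this commit.
PAYLOAD='{
  "messages": [
    {
      "role": "system",
      "content": "Below is the query from the users, please call the correct function and generate the parameters to call the function."
    },
    {
      "role": "user",
      "content": "Take a selfie for me with front camera"
    }
  ],
  "model": "octopus-v2",
  "stream": false
}'

# Hypothetical helper: POST the payload as JSON and print the raw response.
send_request() {
  curl -sS -X POST "$API_URL" \
    -H 'Content-Type: application/json' \
    -d "$PAYLOAD"
}
```

Call `send_request` once the `wasmedge` command from the README is running. The `model` field matches the `--model-name octopus-v2` flag passed to the server.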