---
base_model: ByteDance-Seed/Seed-OSS-36B-Instruct
model_creator: ByteDance-Seed
model_name: Seed-OSS-36B-Instruct
quantized_by: Second State Inc.
pipeline_tag: text-generation
library_name: transformers
---

<!-- header start -->
<!-- 200823 -->
<div style="width: auto; margin-left: auto; margin-right: auto">
<img src="https://github.com/LlamaEdge/LlamaEdge/raw/dev/assets/logo.svg" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>
<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
<!-- header end -->

# Seed-OSS-36B-Instruct-GGUF

## Original Model

[ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)

## Run with LlamaEdge

- LlamaEdge version: coming soon

<!-- - LlamaEdge version: [v0.25.1](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.25.1) and above -->

- Prompt template

  - Prompt type:
 
    - `seed-oss-think` for thinking mode
    - `seed-oss-no-think` for non-thinking mode

  - Prompt string
    - `Thinking` mode

      ```text
      <seed:bos>system
      You are Doubao, a helpful AI assistant.
      <seed:eos>

      <seed:bos>user
      {user_message_1}
      <seed:eos>

      <seed:bos>assistant
      <seed:think>{thinking_content}</seed:think>
      {assistant_message_1}
      <seed:eos>

      <seed:bos>user
      {user_message_2}
      <seed:eos>

      <seed:bos>assistant
      ```

    - `No-thinking` mode

      ```text
      <seed:bos>system
      You are Doubao, a helpful AI assistant.
      <seed:eos>

      <seed:bos>system
      You are an intelligent assistant that can answer questions in one step without the need for reasoning and thinking, that is, your thinking budget is 0. Next, please skip the thinking process and directly start answering the user's questions.
      <seed:eos>

      <seed:bos>user
      {user_message_1}
      <seed:eos>

      <seed:bos>assistant
      {assistant_message_1}
      <seed:eos>

      <seed:bos>user
      {user_message_2}
      <seed:eos>

      <seed:bos>assistant
      ```

- Context size: `512000`

- Run as LlamaEdge service

  ```bash
  wasmedge --dir .:. \
    --nn-preload default:GGML:AUTO:Seed-OSS-36B-Instruct-Q5_K_M.gguf \
    llama-api-server.wasm \
    --prompt-template seed-oss-no-think \
    --ctx-size 512000 \
    --model-name seed-oss
  ```
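- Building the prompt manually

  If you are not using LlamaEdge's built-in `seed-oss-think` / `seed-oss-no-think` templates, the prompt strings above can be assembled programmatically. The following Python sketch (`build_seed_oss_prompt` is a hypothetical helper written for this card, not part of LlamaEdge or the model's official chat template) mirrors the templates shown above:

  ```python
  def build_seed_oss_prompt(messages, thinking=True):
      """Assemble a Seed-OSS prompt string from OpenAI-style messages.

      `messages` is a list of {"role": ..., "content": ...} dicts,
      alternating "user" and "assistant". This is a sketch of the
      templates in this card, not the tokenizer's canonical template.
      """
      parts = [
          "<seed:bos>system\n"
          "You are Doubao, a helpful AI assistant.\n"
          "<seed:eos>"
      ]
      if not thinking:
          # Non-thinking mode adds a second system message that sets
          # the thinking budget to 0 (see the template above).
          parts.append(
              "<seed:bos>system\n"
              "You are an intelligent assistant that can answer questions "
              "in one step without the need for reasoning and thinking, "
              "that is, your thinking budget is 0. Next, please skip the "
              "thinking process and directly start answering the user's "
              "questions.\n"
              "<seed:eos>"
          )
      for msg in messages:
          parts.append(f"<seed:bos>{msg['role']}\n{msg['content']}\n<seed:eos>")
      # The prompt ends with an open assistant turn for the model to complete.
      parts.append("<seed:bos>assistant")
      return "\n\n".join(parts)

  prompt = build_seed_oss_prompt(
      [{"role": "user", "content": "Hello!"}], thinking=False
  )
  # The result ends with "<seed:bos>assistant", leaving generation to the model.
  ```

  In thinking mode, note that prior assistant turns in the template wrap their reasoning in `<seed:think>...</seed:think>` before the visible answer.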

## Quantized GGUF Models

| Name | Quant method | Bits | Size | Use case |
| ---- | ---- | ---- | ---- | ----- |
| [Seed-OSS-36B-Instruct-Q2_K.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q2_K.gguf)     | Q2_K   | 2 | 13.6 GB| smallest, significant quality loss - not recommended for most purposes |
| [Seed-OSS-36B-Instruct-Q3_K_L.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q3_K_L.gguf) | Q3_K_L | 3 | 19.1 GB| small, substantial quality loss |
| [Seed-OSS-36B-Instruct-Q3_K_M.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q3_K_M.gguf) | Q3_K_M | 3 | 17.6 GB| very small, high quality loss |
| [Seed-OSS-36B-Instruct-Q3_K_S.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q3_K_S.gguf) | Q3_K_S | 3 | 15.9 GB| very small, high quality loss |
| [Seed-OSS-36B-Instruct-Q4_0.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q4_0.gguf)     | Q4_0   | 4 | 20.6 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
| [Seed-OSS-36B-Instruct-Q4_K_M.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q4_K_M.gguf) | Q4_K_M | 4 | 21.8 GB| medium, balanced quality - recommended |
| [Seed-OSS-36B-Instruct-Q4_K_S.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q4_K_S.gguf) | Q4_K_S | 4 | 20.7 GB| small, greater quality loss |
| [Seed-OSS-36B-Instruct-Q5_0.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q5_0.gguf)     | Q5_0   | 5 | 25.0 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
| [Seed-OSS-36B-Instruct-Q5_K_M.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q5_K_M.gguf) | Q5_K_M | 5 | 25.6 GB| large, very low quality loss - recommended |
| [Seed-OSS-36B-Instruct-Q5_K_S.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q5_K_S.gguf) | Q5_K_S | 5 | 25.0 GB| large, low quality loss - recommended |
| [Seed-OSS-36B-Instruct-Q6_K.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q6_K.gguf)     | Q6_K   | 6 | 29.7 GB| very large, extremely low quality loss |
| [Seed-OSS-36B-Instruct-Q8_0.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-Q8_0.gguf)     | Q8_0   | 8 | 38.4 GB| very large, extremely low quality loss - not recommended |
| [Seed-OSS-36B-Instruct-f16-00001-of-00003.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-f16-00001-of-00003.gguf)       | f16   | 16 | 30.0 GB|  |
| [Seed-OSS-36B-Instruct-f16-00002-of-00003.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-f16-00002-of-00003.gguf)       | f16   | 16 | 30.0 GB|  |
| [Seed-OSS-36B-Instruct-f16-00003-of-00003.gguf](https://huggingface.co/second-state/Seed-OSS-36B-Instruct-GGUF/blob/main/Seed-OSS-36B-Instruct-f16-00003-of-00003.gguf)       | f16   | 16 | 12.4 GB|  |

*Quantized with llama.cpp b6301.*