---
tags:
- gguf
- llama.cpp
- quantized
- deepseek
- stheno
---

# DeepSeek Sunfall Merged - GGUF Quantized Models

This repository contains multiple **quantized GGUF variants** of the merged DeepSeek + Sunfall model, compatible with `llama.cpp`.

## 🧠 Available Quantized Formats

| Format | File Name | Description |
|--------|-----------|-------------|
| Q3_K_M | `deepseek_sunfall_merged_Model.Q3_K_M.gguf` | Smallest size, fastest inference |
| Q4_K_M | `deepseek_sunfall_merged_Model.Q4_K_M.gguf` | Balanced speed & performance |
| Q5_K_M | `deepseek_sunfall_merged_Model.Q5_K_M.gguf` | Better quality, slower |
| Q6_K   | `deepseek_sunfall_merged_Model.Q6_K.gguf`   | Near full precision |
| Q8_0   | `deepseek_sunfall_merged_Model.Q8_0.gguf`   | Almost no compression loss |
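
The format-to-file mapping in the table can also be expressed directly in code; a minimal sketch (the `pick_quant` helper and its preference labels are illustrative, not part of this repository):

```python
# File names from the table above, keyed by quantization format.
QUANT_FILES = {
    "Q3_K_M": "deepseek_sunfall_merged_Model.Q3_K_M.gguf",  # smallest, fastest
    "Q4_K_M": "deepseek_sunfall_merged_Model.Q4_K_M.gguf",  # balanced
    "Q5_K_M": "deepseek_sunfall_merged_Model.Q5_K_M.gguf",  # better quality, slower
    "Q6_K":   "deepseek_sunfall_merged_Model.Q6_K.gguf",    # near full precision
    "Q8_0":   "deepseek_sunfall_merged_Model.Q8_0.gguf",    # almost lossless
}

def pick_quant(preference: str = "balanced") -> str:
    """Map a rough size/quality preference onto one of the files above."""
    # Hypothetical preference labels, chosen to mirror the table's descriptions.
    by_preference = {
        "smallest": "Q3_K_M",
        "balanced": "Q4_K_M",
        "quality":  "Q6_K",
        "lossless": "Q8_0",
    }
    return QUANT_FILES[by_preference[preference]]

print(pick_quant("balanced"))  # → deepseek_sunfall_merged_Model.Q4_K_M.gguf
```

As a rule of thumb, start with Q4_K_M and move up the table only if quality is lacking, or down if memory is tight.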

## 🔧 Usage (Python)

Install `llama-cpp-python`:

```bash
pip install llama-cpp-python
```

Then load a quantized variant and run inference:

```python
from llama_cpp import Llama

model = Llama(model_path="deepseek_sunfall_merged_Model.Q4_K_M.gguf")  # or Q3_K_M, etc.
output = model("Tell me a story about stars.")
print(output)
```
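
Note that `output` above is not a plain string: `llama-cpp-python` returns an OpenAI-style completion dict. A minimal sketch of pulling the generated text out (the `sample_output` dict is illustrative — its structure is assumed to follow the library's completion format, and only the fields used here are shown):

```python
# Illustrative completion dict, shaped like llama-cpp-python's output
# (an assumption; real responses carry additional fields such as "id" and "model").
sample_output = {
    "choices": [{"text": " Once upon a time, the stars whispered...", "finish_reason": "length"}],
    "usage": {"prompt_tokens": 7, "completion_tokens": 10},
}

def extract_text(output: dict) -> str:
    """Pull the generated text out of a completion-style dict."""
    return output["choices"][0]["text"].strip()

print(extract_text(sample_output))  # → Once upon a time, the stars whispered...
```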