Add Q8_0 GGUF quantization
- .gitattributes +1 -1
- README.md +11 -11
- shenwen-coderV2-Q8_0.gguf +3 -0
.gitattributes
CHANGED
@@ -34,4 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
-shenwen-coderV2-
+shenwen-coderV2-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -1,12 +1,12 @@
 ---
-quantization:
+quantization: Q8_0
 AIGC:
 ContentProducer: Minimax Agent AI
 ContentPropagator: shenwenAI
 Label: AIGC
 ---

-# shenwen-coderV2-
+# shenwen-coderV2-Q8_0-GGUF

 <p align="center">
 <img src="https://huggingface.co/front/assets/huggingface_logo.svg" alt="Hugging Face" width="50" height="50">
@@ -15,7 +15,7 @@ AIGC:
 <div align="center">

 [](https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF)
-[]()
+[]()
 []()
 []()

@@ -23,17 +23,17 @@ AIGC:

 ## Model Overview

-**shenwen-coderV2-
+**shenwen-coderV2-Q8_0-GGUF** is a quantized GGUF version of [shenwen-coderV2-Instruct](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct), optimized for efficient inference with llama.cpp and compatible tools.

 ## Quantization Details

 | Attribute | Value |
 |-----------|-------|
 | **Format** | GGUF |
-| **Quantization** |
-| **File Size** | ~
+| **Quantization** | Q8_0 |
+| **File Size** | ~507MB |
 | **Original Size** | ~949MB |
-| **Compression** | ~
+| **Compression** | ~53% of original |

 ## Usage with llama.cpp

@@ -51,20 +51,20 @@ cmake .. && make -j

 ```bash
 # Download model
-wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/
+wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/q8_0/shenwen-coderV2-Q8_0.gguf

 # Run inference
-./build/bin/llama-cli -m shenwen-coderV2-
+./build/bin/llama-cli -m shenwen-coderV2-Q8_0.gguf -n 512 -p "Write a Python function to calculate factorial:"
 ```

 ## Usage with Ollama

 ```bash
 # Pull the model
-ollama pull shenwenai/shenwen-coderV2:
+ollama pull shenwenai/shenwen-coderV2:q8_0

 # Run inference
-ollama run shenwenai/shenwen-coderV2:
+ollama run shenwenai/shenwen-coderV2:q8_0 "Write a hello world in Python"
 ```

 ## Model Source
shenwen-coderV2-Q8_0.gguf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:77dce6534ce09a8026093ebc7a6e7bf19719d87ef956574f8db3b04943006d15
+size 531067744
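
The Git LFS pointer added above records the SHA-256 digest and byte size of the real GGUF file, so a downloaded copy can be checked against it. The following is a minimal Python sketch of that check, not part of the commit. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the diff.

```python
import hashlib
from pathlib import Path

# Pointer text copied verbatim from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:77dce6534ce09a8026093ebc7a6e7bf19719d87ef956574f8db3b04943006d15
size 531067744
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's byte size and digest both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    # Read in chunks so a file of several hundred megabytes is never fully in memory.
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
# Show the size in both binary (MiB) and decimal (MB) units.
print(f"{pointer['size'] / 2**20:.1f} MiB, {pointer['size'] / 10**6:.1f} MB")
# prints "506.5 MiB, 531.1 MB"
```

After downloading, `matches_pointer("shenwen-coderV2-Q8_0.gguf", pointer)` should return `True` for an intact file. The README's "~507MB" appears to be the MiB figure, and 507 / 949 ≈ 0.53 is consistent with the stated "~53% of original". The diff adds the file at the repository root, so the `q8_0/` path segment in the README's `wget` URL may be worth double-checking.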