Add Q8_0 GGUF quantization

Files changed (3)

.gitattributes +1 -1
README.md +11 -11
shenwen-coderV2-Q8_0.gguf +3 -0

.gitattributes CHANGED

@@ -34,4 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
-shenwen-coderV2-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+shenwen-coderV2-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,12 +1,12 @@
 ---
-quantization: Q5_K_M
+quantization: Q8_0
 AIGC:
     ContentProducer: Minimax Agent AI
     ContentPropagator: shenwenAI
     Label: AIGC
 ---
-# shenwen-coderV2-Q5_K_M-GGUF
+# shenwen-coderV2-Q8_0-GGUF
 <p align="center">
   <img src="https://huggingface.co/front/assets/huggingface_logo.svg" alt="Hugging Face" width="50" height="50">
@@ -15,7 +15,7 @@ AIGC:
 <div align="center">
 [![GGUF Model](https://img.shields.io/badge/Model-shenwen--coderV2--GGUF-blue.svg)](https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF)
-[![Quantization](https://img.shields.io/badge/Quantization-Q5_K_M-yellow.svg)]()
+[![Quantization](https://img.shields.io/badge/Quantization-Q8_0-yellow.svg)]()
 [![Format](https://img.shields.io/badge/Format-GGUF-green.svg)]()
 [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)]()
@@ -23,17 +23,17 @@ AIGC:
 ## Model Overview
-**shenwen-coderV2-Q5_K_M-GGUF** is a quantized GGUF version of [shenwen-coderV2-Instruct](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct), optimized for efficient inference with llama.cpp and compatible tools.
+**shenwen-coderV2-Q8_0-GGUF** is a quantized GGUF version of [shenwen-coderV2-Instruct](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct), optimized for efficient inference with llama.cpp and compatible tools.
 ## Quantization Details
 | Attribute | Value |
 |-----------|-------|
 | **Format** | GGUF |
-| **Quantization** | Q5_K_M |
-| **File Size** | ~401MB |
+| **Quantization** | Q8_0 |
+| **File Size** | ~507MB |
 | **Original Size** | ~949MB |
-| **Compression** | ~42% of original |
+| **Compression** | ~53% of original |
 ## Usage with llama.cpp
@@ -51,20 +51,20 @@ cmake .. && make -j
 ```bash
 # Download model
-wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/q5_k_m/shenwen-coderV2-Q5_K_M.gguf
+wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/q8_0/shenwen-coderV2-Q8_0.gguf
 # Run inference
-./build/bin/llama-cli -m shenwen-coderV2-Q5_K_M.gguf -n 512 -p "Write a Python function to calculate factorial:"
+./build/bin/llama-cli -m shenwen-coderV2-Q8_0.gguf -n 512 -p "Write a Python function to calculate factorial:"
 ```
 ## Usage with Ollama
 ```bash
 # Pull the model
-ollama pull shenwenai/shenwen-coderV2:q5_k
+ollama pull shenwenai/shenwen-coderV2:q8_0
 # Run inference
-ollama run shenwenai/shenwen-coderV2:q5_k "Write a hello world in Python"
+ollama run shenwenai/shenwen-coderV2:q8_0 "Write a hello world in Python"
 ```
 ## Model Source
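
Reviewer note: the size figures in the updated README table can be cross-checked against the byte count recorded in this commit's Git LFS pointer (531067744 bytes). The sketch below assumes the README's "MB" means MiB (2^20 bytes) and that "~949MB" for the original model uses the same unit. In decimal megabytes the quantized file would be about 531 MB instead. `quant_ratio` is a hypothetical helper name, not something in the repository.

```shell
# Sketch: convert a byte count to MiB and express it as a share of the original size.
# Assumption: both README figures are in MiB; quant_ratio is a hypothetical helper.
quant_ratio() {
  size_bytes="$1"      # byte count, e.g. the "size" field of the LFS pointer
  original_mib="$2"    # approximate original model size in MiB, from the README
  awk -v b="$size_bytes" -v o="$original_mib" 'BEGIN {
    mib = b / (1024 * 1024)
    printf "quantized: %.1f MiB\n", mib
    printf "ratio: %.0f%% of original\n", 100 * mib / o
  }'
}
```

Called as `quant_ratio 531067744 949`, this prints 506.5 MiB and 53%, which agrees with the table's ~507MB and ~53% under the MiB assumption.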

shenwen-coderV2-Q8_0.gguf ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:77dce6534ce09a8026093ebc7a6e7bf19719d87ef956574f8db3b04943006d15
+size 531067744
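
Reviewer note: the pointer above records the SHA-256 digest and byte size of the real GGUF file, so a downloaded copy can be checked against it. The following is a minimal sketch, assuming GNU coreutils (`sha256sum`, `wc`, `cut`) are available. `verify_lfs_object` is a hypothetical helper name, not part of Git LFS or the repository.

```shell
# Sketch: compare a local file with the oid (sha256) and size from a Git LFS pointer.
# Assumption: GNU coreutils are installed; verify_lfs_object is a hypothetical helper.
verify_lfs_object() {
  file="$1"           # path to the downloaded file
  expected_oid="$2"   # hex digest from the pointer's "oid sha256:" line
  expected_size="$3"  # byte count from the pointer's "size" line

  # Cheap check first: the byte count must match exactly.
  actual_size=$(wc -c < "$file" | tr -d ' ')
  if [ "$actual_size" != "$expected_size" ]; then
    echo "size mismatch: got $actual_size, expected $expected_size" >&2
    return 1
  fi

  # Then hash the full file and compare digests.
  actual_oid=$(sha256sum "$file" | cut -d ' ' -f 1)
  if [ "$actual_oid" != "$expected_oid" ]; then
    echo "sha256 mismatch: got $actual_oid" >&2
    return 1
  fi

  echo "OK: $file"
}
```

For the file added in this commit, the call would be `verify_lfs_object shenwen-coderV2-Q8_0.gguf 77dce6534ce09a8026093ebc7a6e7bf19719d87ef956574f8db3b04943006d15 531067744`, run after the `wget` step in the README.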