shc2012 committed on
Commit 3eae6d9 · verified · 1 parent: c3b2950

Add Q8_0 GGUF quantization

Files changed (3)
  1. .gitattributes +1 -1
  2. README.md +11 -11
  3. shenwen-coderV2-Q8_0.gguf +3 -0
.gitattributes CHANGED
@@ -34,4 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
-shenwen-coderV2-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+shenwen-coderV2-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,12 +1,12 @@
 ---
-quantization: Q5_K_M
+quantization: Q8_0
 AIGC:
 ContentProducer: Minimax Agent AI
 ContentPropagator: shenwenAI
 Label: AIGC
 ---

-# shenwen-coderV2-Q5_K_M-GGUF
+# shenwen-coderV2-Q8_0-GGUF

 <p align="center">
 <img src="https://huggingface.co/front/assets/huggingface_logo.svg" alt="Hugging Face" width="50" height="50">
@@ -15,7 +15,7 @@ AIGC:
 <div align="center">

 [![GGUF Model](https://img.shields.io/badge/Model-shenwen--coderV2--GGUF-blue.svg)](https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF)
-[![Quantization](https://img.shields.io/badge/Quantization-Q5_K_M-yellow.svg)]()
+[![Quantization](https://img.shields.io/badge/Quantization-Q8_0-yellow.svg)]()
 [![Format](https://img.shields.io/badge/Format-GGUF-green.svg)]()
 [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)]()

@@ -23,17 +23,17 @@ AIGC:

 ## Model Overview

-**shenwen-coderV2-Q5_K_M-GGUF** is a quantized GGUF version of [shenwen-coderV2-Instruct](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct), optimized for efficient inference with llama.cpp and compatible tools.
+**shenwen-coderV2-Q8_0-GGUF** is a quantized GGUF version of [shenwen-coderV2-Instruct](https://huggingface.co/shenwenAI/shenwen-coderV2-Instruct), optimized for efficient inference with llama.cpp and compatible tools.

 ## Quantization Details

 | Attribute | Value |
 |-----------|-------|
 | **Format** | GGUF |
-| **Quantization** | Q5_K_M |
-| **File Size** | ~401MB |
+| **Quantization** | Q8_0 |
+| **File Size** | ~507MB |
 | **Original Size** | ~949MB |
-| **Compression** | ~42% of original |
+| **Compression** | ~53% of original |

 ## Usage with llama.cpp

@@ -51,20 +51,20 @@ cmake .. && make -j

 ```bash
 # Download model
-wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/q5_k_m/shenwen-coderV2-Q5_K_M.gguf
+wget https://huggingface.co/shenwenAI/shenwen-coderV2-GGUF/resolve/main/q8_0/shenwen-coderV2-Q8_0.gguf

 # Run inference
-./build/bin/llama-cli -m shenwen-coderV2-Q5_K_M.gguf -n 512 -p "Write a Python function to calculate factorial:"
+./build/bin/llama-cli -m shenwen-coderV2-Q8_0.gguf -n 512 -p "Write a Python function to calculate factorial:"
 ```

 ## Usage with Ollama

 ```bash
 # Pull the model
-ollama pull shenwenai/shenwen-coderV2:q5_k
+ollama pull shenwenai/shenwen-coderV2:q8_0

 # Run inference
-ollama run shenwenai/shenwen-coderV2:q5_k "Write a hello world in Python"
+ollama run shenwenai/shenwen-coderV2:q8_0 "Write a hello world in Python"
 ```

 ## Model Source
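As a quick sanity check on the updated Quantization Details table, the "~53% of original" figure follows directly from the two rounded sizes it quotes. A minimal shell sketch (the MB values are the rounded figures from the README, not exact byte counts):

```shell
# Rough check of the "~53% of original" compression figure,
# using the rounded sizes quoted in the README table.
q8_mb=507     # Q8_0 file size, ~MB
orig_mb=949   # original model size, ~MB
pct=$(( 100 * q8_mb / orig_mb ))
echo "Q8_0 is ~${pct}% of the original size"   # ~53%
```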
shenwen-coderV2-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:77dce6534ce09a8026093ebc7a6e7bf19719d87ef956574f8db3b04943006d15
+size 531067744
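The Git LFS pointer committed here records the artifact's exact hash and byte size, so a downloaded copy can be verified locally. A minimal shell sketch (the `sha256sum -c` invocation assumes GNU coreutils, and the filename matches the `wget` example in the README):

```shell
# Values copied verbatim from the LFS pointer in this commit.
expected_sha="77dce6534ce09a8026093ebc7a6e7bf19719d87ef956574f8db3b04943006d15"
expected_bytes=531067744

# The pointer's byte count, converted to MiB, roughly matches the
# ~507MB file size quoted in the README table.
mib=$(( expected_bytes / 1024 / 1024 ))
echo "expected: ${expected_bytes} bytes (~${mib} MiB)"

# Once the file is downloaded, check its hash (GNU coreutils):
# echo "${expected_sha}  shenwen-coderV2-Q8_0.gguf" | sha256sum -c -
```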