Update README.md
README.md
CHANGED
|
@@ -1,3 +1,31 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
---
|
| 4 |
+
## Content
|
| 5 |
+
This model area holds the public parts of GGUF models converted with Skipper (T3) or Mate (M8) technology.
|
| 6 |
+
Future modes will also follow the nautical theme.
|
| 7 |
+
|
| 8 |
+
## Versions
|
| 9 |
+
| Version | Codename | File prefix | Typical bpw range | New feature |
|
| 10 |
+
| :--- | :--- | :--- | :--- | :--- |
|
| 11 |
+
| 1.0 | Skipper | T3 and T2 | 0.8 .. 2.2 | introduces a new compression method |
|
| 12 |
+
| 1.5 | Mate | M8 | 0.4 .. 2 | compression improvements |
|
| 13 |
+
| 2.0 | Cheng | Cx | 0.3 .. 2 | speed improvements |
|
| 14 |
+
| 2.5 | Cheng++ | Cy | 0.1 .. 2 | reduce compute requirements |
|
| 15 |
+
|
| 16 |
+
V1 significantly reduces model size at the same subjective quality, but compute requirements remain high.
|
| 17 |
+
|
| 18 |
+
V2 will scale down compute requirements and support low-cost NPUs.
|
| 19 |
+
|
| 20 |
+
## Expected bpw (bits per weight)
|
| 21 |
+
Actual bpw is higher for small models and lower for larger models. As with JPEG and video encoding, higher input quality opens up more opportunity for compression.
|
| 22 |
+
|
| 23 |
+
| Base | Mode | % | bpw @ 30B |
|
| 24 |
+
| :--- | :--- | :-: | :--- |
|
| 25 |
+
| Q5_K | T3UD | 95 | 2 .. 2.2 |
|
| 26 |
+
| Q4_K | T2UD | 90 | 1.4 .. 1.6 |
|
| 27 |
+
| Q2_K | T2UD2 | 75 | 1 .. 1.2 |
|
| 28 |
+
| Q2_K | T2UD1 | 60 | 0.8 |
|
| 29 |
+
| Q2_K | M8HQ | 75 | 0.8 |
|
| 30 |
+
| Q2_K | M8LQ | 60 | 0.4 .. 0.6 |
|
| 31 |
+
|
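To relate the bpw figures in the table above to file size, a rough estimate is parameters × bpw / 8 bytes. The sketch below applies that arithmetic to a 30B-parameter model. The helper name `estimated_size_gb` is hypothetical, and the estimate assumes every weight is stored at the listed bpw; real GGUF files also carry metadata and may keep some tensors at higher precision, so actual sizes will differ.

```python
def estimated_size_gb(params_billion: float, bpw: float) -> float:
    """Approximate weight storage in GB (10^9 bytes) at a given bits-per-weight.

    Assumption: all weights are stored at `bpw`; file metadata is ignored.
    """
    total_bits = params_billion * 1e9 * bpw
    return total_bits / 8 / 1e9


# bpw ranges at 30B parameters, copied from the table above
BPW_AT_30B = {
    "T3UD": (2.0, 2.2),
    "T2UD": (1.4, 1.6),
    "T2UD2": (1.0, 1.2),
    "T2UD1": (0.8, 0.8),
    "M8HQ": (0.8, 0.8),
    "M8LQ": (0.4, 0.6),
}

if __name__ == "__main__":
    for mode, (low, high) in BPW_AT_30B.items():
        low_gb = estimated_size_gb(30, low)
        high_gb = estimated_size_gb(30, high)
        print(f"{mode}: {low_gb:.2f} .. {high_gb:.2f} GB")
```

Because actual bpw is higher for small models and lower for larger ones, the 30B ranges should not be reused unchanged for other model sizes.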