Commit 49555f0 by TobDeBer · verified · 1 parent: da3bba5

Update README.md

Files changed (1)
  1. README.md +31 -3
README.md CHANGED
@@ -1,3 +1,31 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ ## Content
+ This model area holds the public parts of GGUF models converted with Skipper (T3) or Mate (M8) technology.
+ Future modes will also follow the nautical theme.
+
+ ## Versions
+ | Version | Codename | File prefix | Typical bpw range | New feature |
+ | :--- | :--- | :--- | :--- | :--- |
+ | 1.0 | Skipper | T3 and T2 | 0.8 .. 2.2 | introduce new compression method |
+ | 1.5 | Mate | M8 | 0.4 .. 2 | compression improvements |
+ | 2.0 | Cheng | Cx | 0.3 .. 2 | speed improvements |
+ | 2.5 | Cheng++ | Cy | 0.1 .. 2 | reduce compute requirements |
+
+ V1 significantly reduces model size at the same subjective quality, but compute requirements remain high.
+
+ V2 will scale down compute requirements and support inexpensive NPUs.
+
+ ## Expected bpw (bits per weight)
+ Actual bpw is higher for small models and lower for large models. As with JPEG and video encoding, higher input quality leaves more room for compression.
+
+ | Base | Mode | % | bpw@30b |
+ | :--- | :--- | :-: | :--- |
+ | Q5_K | T3UD | 95 | 2 .. 2.2 |
+ | Q4_K | T2UD | 90 | 1.4 .. 1.6 |
+ | Q2_K | T2UD2 | 75 | 1 .. 1.2 |
+ | Q2_K | T2UD1 | 60 | 0.8 |
+ | Q2_K | M8HQ | 75 | 0.8 |
+ | Q2_K | M8LQ | 60 | 0.4 .. 0.6 |
+
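
The bpw figures in the table added by this commit can be turned into rough file sizes with simple arithmetic: parameters times bits per weight, divided by 8 bits per byte. The sketch below is not part of the commit. The function and dictionary names are hypothetical, the estimate ignores GGUF metadata and any tensors stored at a different precision, and the bpw values only apply to models of about 30B parameters, since the README states that actual bpw varies with model size.

```python
# Hypothetical helper, not part of the repository: rough size estimate from bpw.

def estimated_size_gb(n_params: float, bpw: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Ignores file metadata and any tensors kept at another precision.
    """
    return n_params * bpw / 8 / 1e9


# bpw@30b ranges copied from the README table, keyed by mode: (low, high).
BPW_AT_30B = {
    "T3UD": (2.0, 2.2),
    "T2UD": (1.4, 1.6),
    "T2UD2": (1.0, 1.2),
    "T2UD1": (0.8, 0.8),
    "M8HQ": (0.8, 0.8),
    "M8LQ": (0.4, 0.6),
}


def size_range_gb(mode: str, n_params: float = 30e9) -> tuple[float, float]:
    """Return the (low, high) size estimate in GB for a mode at ~30B parameters."""
    low, high = BPW_AT_30B[mode]
    return estimated_size_gb(n_params, low), estimated_size_gb(n_params, high)


if __name__ == "__main__":
    for mode in BPW_AT_30B:
        low_gb, high_gb = size_range_gb(mode)
        print(f"{mode}: {low_gb:.2f} .. {high_gb:.2f} GB")
```

For example, a 30B-parameter model at 2.0 bpw comes to about 7.5 GB under this estimate, and at 0.4 bpw to about 1.5 GB.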