majentik committed f82c4b7 · verified · 1 Parent(s): 8c881d1

docs: Tier 2 polish — variant matrix + quant trade-off

Files changed (1)
  1. README.md +42 -10
README.md CHANGED
@@ -2,18 +2,16 @@
  base_model: openai/gpt-oss-20b
  library_name: mlx
  tags:
- - rotorquant
- - kv-cache-quantization
- - gpt-oss
- - openai
- - moe
- - quantized
- - mlx
- - 8bit
  license: apache-2.0
  pipeline_tag: text-generation
- language:
- - en
  ---

  # GPT-OSS-20B - RotorQuant MLX 8-bit
@@ -89,3 +87,37 @@ This model requires approximately 20 GB of unified memory. Recommended hardware:
  - [majentik/gpt-oss-20b-TurboQuant-MLX-8bit](https://huggingface.co/majentik/gpt-oss-20b-TurboQuant-MLX-8bit) -- TurboQuant MLX 8-bit variant
  - [RotorQuant GitHub](https://github.com/scrya-com/rotorquant)
  - [MLX Framework](https://github.com/ml-explore/mlx)
 
  base_model: openai/gpt-oss-20b
  library_name: mlx
  tags:
+ - rotorquant
+ - kv-cache-quantization
+ - gpt-oss
+ - openai
+ - moe
+ - quantized
+ - mlx
+ - 8bit
  license: apache-2.0
  pipeline_tag: text-generation
  ---

  # GPT-OSS-20B - RotorQuant MLX 8-bit

  - [majentik/gpt-oss-20b-TurboQuant-MLX-8bit](https://huggingface.co/majentik/gpt-oss-20b-TurboQuant-MLX-8bit) -- TurboQuant MLX 8-bit variant
  - [RotorQuant GitHub](https://github.com/scrya-com/rotorquant)
  - [MLX Framework](https://github.com/ml-explore/mlx)
+
+ ## Quant trade-off (MLX lane)
+
+ | Bits | Approx size | Use case | Recommendation |
+ |---|---|---|---|
+ | 2-bit | ~5.2 GB | Aggressive quantization | Very low-RAM Macs |
+ | 3-bit | ~7.2 GB | Lossy but small | Low-RAM Macs |
+ | 4-bit | ~8.4 GB | Balanced default | Recommended for most Macs |
+ | 5-bit | ~10 GB | Higher fidelity | Quality-sensitive |
+ | 6-bit | ~12 GB | Approaching FP16 quality | High-fidelity |
+ | **8-bit** | ~15 GB | Near-lossless reference | **Fidelity-critical work** |
+
+ (Current variant — **8bit** — is bolded.)
+
+ ## Variants in this family
+
+ (Showing 14 sibling variants under `majentik/gpt-oss-20b-*`. The current variant — `RotorQuant-MLX-8bit` — is **bolded**.)
+
+ | Variant | Runtime | Approx size | Use case |
+ |---|---|---|---|
+ | [RotorQuant](https://huggingface.co/majentik/gpt-oss-20b-rotorquant) | runtime modifier | n/a | KV-cache root (weight-agnostic) |
+ | [RotorQuant-GGUF-IQ4_XS](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-gguf-IQ4_XS) | llama.cpp | ~17 GB | Lossy 4-bit, low-RAM CPU/edge |
+ | [RotorQuant-GGUF-Q2_K](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-gguf-Q2_K) | llama.cpp | ~12 GB | Lossy, low-RAM CPU/edge |
+ | [RotorQuant-GGUF-Q3_K_M](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-gguf-Q3_K_M) | llama.cpp | ~16 GB | Smaller 3-bit, CPU-friendly |
+ | [RotorQuant-GGUF-Q4_K_M](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-gguf-Q4_K_M) | llama.cpp | ~22 GB | Balanced default |
+ | [RotorQuant-GGUF-Q5_K_M](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-gguf-Q5_K_M) | llama.cpp | ~26 GB | Higher fidelity, more RAM |
+ | [RotorQuant-GGUF-Q8_0](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-gguf-Q8_0) | llama.cpp | ~42 GB | Near-lossless reference |
+ | [RotorQuant-MLX-2bit](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-mlx-2bit) | mlx-lm | ~6.4 GB | Apple Silicon, smallest |
+ | [RotorQuant-MLX-4bit](https://huggingface.co/majentik/gpt-oss-20b-rotorquant-mlx-4bit) | mlx-lm | ~12 GB | Apple Silicon balanced |
+ | **RotorQuant-MLX-8bit** | mlx-lm | ~24 GB | Apple Silicon reference |
+ | [TurboQuant](https://huggingface.co/majentik/gpt-oss-20b-turboquant) | runtime modifier | n/a | KV-cache root (weight-agnostic) |
+ | [TurboQuant-MLX-2bit](https://huggingface.co/majentik/gpt-oss-20b-turboquant-mlx-2bit) | mlx-lm | ~6.4 GB | Apple Silicon, smallest |
+ | [TurboQuant-MLX-4bit](https://huggingface.co/majentik/gpt-oss-20b-turboquant-mlx-4bit) | mlx-lm | ~12 GB | Apple Silicon balanced |
+ | [TurboQuant-MLX-8bit](https://huggingface.co/majentik/gpt-oss-20b-turboquant-mlx-8bit) | mlx-lm | ~24 GB | Apple Silicon reference |