Update README.md
README.md
CHANGED
@@ -5,6 +5,7 @@ colorFrom: purple
 colorTo: pink
 sdk: gradio
 pinned: false
+sdk_version: 5.42.0
 ---
 
 # Halley AI on Hugging Face
@@ -17,12 +18,10 @@ High-quality, Apple-Silicon–optimized **MLX** builds, tools, and evals — foc
 
 ## 🚀 Featured models
 
-| Repo | Bits
+| Repo | Bits/GS | Footprint | Notes |
 |---|---:|---:|---|
 | **HalleyAI/gpt-oss-20b-MLX-4bit-gs32** | Q4 / 32 | ~13.1 GB | Best speed on 32 GB; near-baseline quality (+1.81% PPL vs 8-bit) |
 | **HalleyAI/gpt-oss-20b-MLX-6bit-gs32** | Q6 / 32 | ~18.4 GB | Near-Q8 fidelity (-0.51% PPL vs 8-bit) |
 | **Reference (8-bit)** | Q8 / 32 | — | Use upstream: `lmstudio-community/gpt-oss-20b-MLX-8bit` |
 
-> **Format:** MLX (not GGUF). For Linux/Windows or non-MLX stacks, use a GGUF build with llama.cpp.
-
-
+> **Format:** MLX (not GGUF). For Linux/Windows or non-MLX stacks, use a GGUF build with llama.cpp.
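As a rough sanity check on the Footprint column in the table above, the sketch below estimates the size of a group-quantized checkpoint from its bit width and group size. It rests on assumptions the README does not state: a total of about 20.9 billion parameters, every parameter quantized, and one 16-bit scale plus one 16-bit bias stored per group. The function names are hypothetical and used for illustration only.

```python
def effective_bits_per_weight(bits: int, group_size: int, meta_bits: int = 32) -> float:
    """Bits stored per weight: the quantized value plus its share of per-group metadata.

    meta_bits=32 assumes one 16-bit scale and one 16-bit bias per group (an assumption).
    """
    return bits + meta_bits / group_size


def estimate_footprint_gb(n_params: float, bits: int, group_size: int) -> float:
    """Rough size in GB (10^9 bytes), treating every parameter as quantized."""
    return n_params * effective_bits_per_weight(bits, group_size) / 8 / 1e9


N_PARAMS = 20.9e9  # assumed total parameter count; not stated in the README

for bits in (4, 6):
    print(f"Q{bits} / 32: ~{estimate_footprint_gb(N_PARAMS, bits, 32):.1f} GB")
```

Under these assumptions the estimates come out near 13 GB for Q4 and near 18 GB for Q6, in the same range as the table's ~13.1 GB and ~18.4 GB. Real checkpoints may keep some tensors unquantized, so exact agreement is not expected.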