sebastavar committed on
Commit 9404600 · verified · 1 Parent(s): 7cbd15f

Update README.md

Files changed (1): README.md (+3 −4)
```diff
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@ colorFrom: purple
 colorTo: pink
 sdk: gradio
 pinned: false
+sdk_version: 5.42.0
 ---
 
 # Halley AI on Hugging Face
@@ -17,12 +18,10 @@ High-quality, Apple-Silicon–optimized **MLX** builds, tools, and evals — foc
 
 ## 🚀 Featured models
 
-| Repo | Bits / GS | Footprint | Notes |
+| Repo | Bits/GS | Footprint | Notes |
 |---|---:|---:|---|
 | **HalleyAI/gpt-oss-20b-MLX-4bit-gs32** | Q4 / 32 | ~13.1 GB | Best speed on 32 GB; near-baseline quality (+1.81% PPL vs 8-bit) |
 | **HalleyAI/gpt-oss-20b-MLX-6bit-gs32** | Q6 / 32 | ~18.4 GB | Near-Q8 fidelity (-0.51% PPL vs 8-bit) |
 | **Reference (8-bit)** | Q8 / 32 | — | Use upstream: `lmstudio-community/gpt-oss-20b-MLX-8bit` |
 
-> **Format:** MLX (not GGUF). For Linux/Windows or non-MLX stacks, use a GGUF build with llama.cpp.
-
-
+> **Format:** MLX (not GGUF). For Linux/Windows or non-MLX stacks, use a GGUF build with llama.cpp.
```