FoolDev Claude Opus 4.7 committed on
Commit
4c20ab5
·
1 Parent(s): b0d8482

Add Q3_K_S as a second HF/Ollama quant

Ship Janus-27B.Q3_K_S.gguf (~12 GB) alongside the existing Q4_K_M
so users on tighter memory budgets (16 GB GPUs, 32 GB unified-memory
laptops) get a one-liner too:

ollama run hf.co/FoolDev/janus-27b # default Q4_K_M
ollama run hf.co/FoolDev/janus-27b:Q3_K_S # 12 GB option

Same Modelfile applies in both cases, so tool calling / thinking /
stop-token plumbing carries over. README TL;DR + Quick start +
"What's here" updated to surface the quant tag form.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
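
The choice between the two tags described above can be sketched as a small shell helper. This is only an illustration: the function name `pick_janus_ref` and the 24 GB default cutoff are assumptions made for the example, not something this commit defines, and the right cutoff depends on context length and what else shares the memory.

```bash
# Hypothetical helper: print the Ollama model reference for a memory budget.
# $1 = usable memory in whole GB
# $2 = cutoff in GB (assumed default of 24; not taken from the commit)
pick_janus_ref() {
    mem_gb=$1
    cutoff_gb=${2:-24}
    if [ "$mem_gb" -ge "$cutoff_gb" ]; then
        # Default tag resolves to the ~17 GB Q4_K_M GGUF.
        echo "hf.co/FoolDev/janus-27b"
    else
        # Explicit tag selects the ~12 GB Q3_K_S GGUF.
        echo "hf.co/FoolDev/janus-27b:Q3_K_S"
    fi
}
```

With the helper sourced, `ollama run "$(pick_janus_ref 16)"` would pull the `:Q3_K_S` tag; passing a second argument overrides the assumed cutoff.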

Files changed (3)
  1. CHANGELOG.md +12 -11
  2. Janus-27B.Q3_K_S.gguf +3 -0
  3. README.md +21 -11
CHANGELOG.md CHANGED
@@ -8,17 +8,18 @@ and documentation**, not the underlying base model.
  ## [Unreleased]

  ### Added
- - `Janus-27B.Q4_K_M.gguf` (~17 GB) shipped in the repo so HF's
- "Use this model" widget surfaces the Ollama snippet (`ollama run
- hf.co/FoolDev/janus-27b`) and Ollama can pull weights + Modelfile in
- one step. Q4_K_M for parity with the 35B sibling and the README's
- default. Adds `gguf` to the model-card tags so HF lists the repo
- among GGUF models. Loosens `.gitignore` from `*.gguf` to allow
- `Janus-27B.*.gguf` while still excluding the upstream
- `Qwen3.6-27B-*.gguf` that `make build` fetches locally. README's
- "does not redistribute weights" wording reworked: the wrapper
- philosophy still holds for the upstream Qwen GGUFs, but we ship a
- single tagged distribution to back the HF/Ollama UX.
+ - `Janus-27B.Q4_K_M.gguf` (~17 GB) and `Janus-27B.Q3_K_S.gguf`
+ (~12 GB) shipped in the repo so HF's "Use this model" widget
+ surfaces the Ollama snippet (`ollama run hf.co/FoolDev/janus-27b`)
+ and Ollama can pull weights + Modelfile in one step. Q4_K_M is the
+ default tag; `:Q3_K_S` is the smaller-footprint option for 16 GB
+ GPUs / unified-memory laptops. Adds `gguf` to the model-card tags
+ so HF lists the repo among GGUF models. Loosens `.gitignore` from
+ `*.gguf` to allow `Janus-27B.*.gguf` while still excluding the
+ upstream `Qwen3.6-27B-*.gguf` that `make build` fetches locally.
+ README's "does not redistribute weights" wording reworked: the
+ wrapper philosophy still holds for the upstream Qwen GGUFs, but we
+ ship two tagged distributions to back the HF/Ollama UX.

  ### Fixed
  - `Modelfile`: ship a `TEMPLATE` directive mirroring Qwen 3.6 ChatML in
Janus-27B.Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4afb4abcf0207a484b0d7e92c0421b74e8ce1c7a7250bb9d824b79288da68f20
+ size 12358727904
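
The pointer above records the SHA-256 digest and byte size of the real GGUF, so a downloaded copy can be checked against it. A minimal sketch, assuming GNU `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute); the function name `verify_lfs_object` is hypothetical and not part of this repo:

```bash
# Hypothetical helper: check a local file against a Git LFS pointer's fields.
# $1 = file path
# $2 = expected sha256 hex digest (the pointer's "oid sha256:" value)
# $3 = expected size in bytes (the pointer's "size" value)
verify_lfs_object() {
    file=$1
    want_oid=$2
    want_size=$3
    # wc -c may pad its output with spaces on some platforms; strip them.
    have_size=$(wc -c < "$file" | tr -d ' ')
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    have_oid=$(sha256sum "$file" | cut -d ' ' -f 1)
    # Succeeds (exit status 0) only when both size and digest match.
    [ "$have_size" = "$want_size" ] && [ "$have_oid" = "$want_oid" ]
}
```

For this commit the call would be `verify_lfs_object Janus-27B.Q3_K_S.gguf 4afb4abcf0207a484b0d7e92c0421b74e8ce1c7a7250bb9d824b79288da68f20 12358727904`. Comparing the size first is a cheap way to catch a truncated download before hashing roughly 12 GB.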
README.md CHANGED
@@ -61,19 +61,20 @@ A personal sibling to [`FoolDev/janus`](https://huggingface.co/FoolDev/janus). S

  ## TL;DR

- One-liner via Hugging Face (pulls the 17 GB Q4_K_M GGUF + this repo's
- Modelfile, including the tool-calling TEMPLATE):
+ One-liner via Hugging Face (pulls a GGUF + this repo's Modelfile,
+ including the tool-calling TEMPLATE):

  ```bash
- ollama run hf.co/FoolDev/janus-27b
+ ollama run hf.co/FoolDev/janus-27b # default ~17 GB Q4_K_M
+ ollama run hf.co/FoolDev/janus-27b:Q3_K_S # tighter ~12 GB quant
  ```

- Or build locally if you want a different quant:
+ Or build locally for any other quant:

  ```bash
  git clone https://huggingface.co/FoolDev/janus-27b && cd janus-27b
- make build # downloads ~17 GB Q4_K_M GGUF
- make build QUANT=Q3_K_S # smaller ~12 GB quant
+ make build # downloads Q4_K_M (default)
+ make build QUANT=Q5_K_M # any unsloth/Qwen3.6-27B-GGUF quant
  ollama run janus-27b
  ```

@@ -119,9 +120,17 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
  | `CHANGELOG.md` | Versioned tooling/docs changes |
  | `README.md` | This file |

- This repo ships **one** GGUF `Janus-27B.Q4_K_M.gguf` (~17 GB) to
- back the HF/Ollama "Use this model" widget (`ollama run
- hf.co/FoolDev/janus-27b`). For other quants or local builds, pull from
+ This repo ships two GGUFs to back the HF/Ollama "Use this model"
+ widget `Janus-27B.Q4_K_M.gguf` (~17 GB, default) and
+ `Janus-27B.Q3_K_S.gguf` (~12 GB, smaller-footprint option for 16 GB
+ GPUs / unified-memory laptops):
+
+ ```bash
+ ollama run hf.co/FoolDev/janus-27b # default Q4_K_M
+ ollama run hf.co/FoolDev/janus-27b:Q3_K_S # tighter quant
+ ```
+
+ For other quants or local builds, pull from
  [`unsloth/Qwen3.6-27B-GGUF`](https://huggingface.co/unsloth/Qwen3.6-27B-GGUF)
  and `make build QUANT=...` — the Modelfile here is the same one Ollama
  applies in either path.
@@ -150,8 +159,9 @@ If you want the safetensors for `transformers`, fetch them from [`Qwen/Qwen3.6-2
  Two paths:

  ```bash
- # A. Pull straight from HF (uses the bundled Q4_K_M + this Modelfile):
- ollama run hf.co/FoolDev/janus-27b
+ # A. Pull straight from HF (uses the bundled Q4_K_M / Q3_K_S + Modelfile):
+ ollama run hf.co/FoolDev/janus-27b # default Q4_K_M
+ ollama run hf.co/FoolDev/janus-27b:Q3_K_S # tighter quant

  # B. Build locally (lets you pick the quant):
  make build # Q4_K_M -> janus-27b