FoolDev Claude Opus 4.7 committed on
Commit b0d8482 · 1 Parent(s): f4e7fa4

Ship a Q4_K_M GGUF so HF "Use this model" surfaces an Ollama snippet

Until now this repo deliberately shipped no weights — users had to
clone, run make build, and let scripts/build.sh fetch a quant from
unsloth/Qwen3.6-27B-GGUF. That kept the repo small but meant HF's
"Use this model" widget could only show the Transformers snippet (HF
requires a GGUF in the repo to auto-generate the Ollama snippet).

Ship a single Janus-27B.Q4_K_M.gguf (~17 GB) so:
- `ollama run hf.co/FoolDev/janus-27b` works as a one-liner: Ollama
pulls this GGUF and applies the Modelfile in this repo (TEMPLATE,
PARAMETERs, SYSTEM all flow through, including the tool-calling
TEMPLATE we wired up in 80f4494).
- HF lists the repo under GGUF models (added `gguf` to tags).

Q4_K_M for parity with the 35B sibling and the README's default;
local builders who want Q3_K_S use `make build QUANT=Q3_K_S` as
before. Loosened .gitignore from `*.gguf` to `*.gguf` +
`!Janus-27B.*.gguf` so the upstream Qwen3.6-27B-*.gguf that
scripts/build.sh fetches is still ignored. README's "does not
redistribute weights" wording reworked — the wrapper philosophy still
holds for the upstream Qwen GGUFs, we just ship one tagged
distribution to back the HF/Ollama UX.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Files changed (4)
  1. .gitignore +5 -1
  2. CHANGELOG.md +13 -0
  3. Janus-27B.Q4_K_M.gguf +3 -0
  4. README.md +22 -8
.gitignore CHANGED
@@ -5,8 +5,12 @@ __pycache__/
  .venv/
  venv/

- # Local model weights (we don't redistribute these)
+ # Local model weights. We don't redistribute the upstream Qwen GGUFs
+ # here — `make build` fetches one from unsloth/Qwen3.6-27B-GGUF locally.
+ # The single Janus-27B.*.gguf we DO ship backs the HF/Ollama
+ # "Use this model" widget (ollama run hf.co/FoolDev/janus-27b).
  *.gguf
+ !Janus-27B.*.gguf
  *.safetensors
  *.bin

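The effect of the two `.gitignore` rules above can be checked in a throwaway repository. This is a minimal sketch, assuming `git` is installed; the scratch directory, the `ignore_status` helper, and the variable names are illustrative and not part of this repo.

```shell
# Scratch repository; the directory and helper names are illustrative.
demo_dir="$(mktemp -d)"
git init -q "$demo_dir"

# The two GGUF rules this commit leaves in .gitignore.
printf '%s\n' '*.gguf' '!Janus-27B.*.gguf' > "$demo_dir/.gitignore"
touch "$demo_dir/Janus-27B.Q4_K_M.gguf" "$demo_dir/Qwen3.6-27B-Q4_K_M.gguf"

# `git check-ignore -q` exits 0 if the path is ignored and 1 if it is not.
ignore_status() {
  if git -C "$demo_dir" check-ignore -q "$1"; then
    echo ignored
  else
    echo kept
  fi
}

upstream_status="$(ignore_status Qwen3.6-27B-Q4_K_M.gguf)"
shipped_status="$(ignore_status Janus-27B.Q4_K_M.gguf)"
echo "upstream quant: $upstream_status"
echo "shipped quant:  $shipped_status"
```

Order matters here: the negated pattern has to come after `*.gguf`, because the last matching rule wins. With the rules in this order, the upstream Qwen file should stay ignored while the shipped Janus file is kept.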
CHANGELOG.md CHANGED
@@ -7,6 +7,19 @@ and documentation**, not the underlying base model.

  ## [Unreleased]

+ ### Added
+ - `Janus-27B.Q4_K_M.gguf` (~17 GB) shipped in the repo so HF's
+ "Use this model" widget surfaces the Ollama snippet (`ollama run
+ hf.co/FoolDev/janus-27b`) and Ollama can pull weights + Modelfile in
+ one step. Q4_K_M for parity with the 35B sibling and the README's
+ default. Adds `gguf` to the model-card tags so HF lists the repo
+ among GGUF models. Loosens `.gitignore` from `*.gguf` to allow
+ `Janus-27B.*.gguf` while still excluding the upstream
+ `Qwen3.6-27B-*.gguf` that `make build` fetches locally. README's
+ "does not redistribute weights" wording reworked: the wrapper
+ philosophy still holds for the upstream Qwen GGUFs, but we ship a
+ single tagged distribution to back the HF/Ollama UX.
+
  ### Fixed
  - `Modelfile`: ship a `TEMPLATE` directive mirroring Qwen 3.6 ChatML in
  Ollama Go-template form, so Ollama's tool-capability detector sees
Janus-27B.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5ed60d0af4650a854b1755bd392f9aef4872643dc25a254bc68043fa638392a0
+ size 16817244384
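The three lines above are a Git LFS pointer, not the weights themselves: `oid` is the SHA-256 of the real file and `size` is its length in bytes (16,817,244,384 bytes, about 16.8 GB). A downloaded GGUF can be checked against a pointer roughly as follows. This is a sketch only: `verify_lfs_object` and the file names in the usage comment are hypothetical, and it assumes a system with GNU `sha256sum` (on macOS, `shasum -a 256` would be the substitute).

```shell
# Hypothetical helper: compare a downloaded file against a Git LFS pointer.
# Usage: verify_lfs_object <pointer-file> <object-file>
verify_lfs_object() {
  pointer_file="$1"
  object_file="$2"

  # Pull the expected digest and byte count out of the pointer text.
  want_oid="$(sed -n 's/^oid sha256://p' "$pointer_file")"
  want_size="$(sed -n 's/^size //p' "$pointer_file")"

  # Compute the same two values for the file on disk.
  got_oid="$(sha256sum "$object_file" | cut -d' ' -f1)"
  got_size="$(wc -c < "$object_file" | tr -d '[:space:]')"

  # Succeed only if both the digest and the size match.
  [ "$want_oid" = "$got_oid" ] && [ "$want_size" = "$got_size" ]
}

# Example call (file names are placeholders):
#   verify_lfs_object pointer.txt Janus-27B.Q4_K_M.gguf && echo "GGUF matches pointer"
```

Hashing a file of this size takes a while, so the size comparison alone is a quick first check for a truncated download.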
README.md CHANGED
@@ -38,6 +38,7 @@ tags:
  - conversational
  - multimodal
  - agent
+ - gguf
  library_name: transformers
  pipeline_tag: image-text-to-text
  ---
@@ -60,18 +61,20 @@ A personal sibling to [`FoolDev/janus`](https://huggingface.co/FoolDev/janus). S

  ## TL;DR

- If you have Ollama and 24 GB of RAM (or a 24 GB GPU):
+ One-liner via Hugging Face (pulls the 17 GB Q4_K_M GGUF + this repo's
+ Modelfile, including the tool-calling TEMPLATE):

  ```bash
- git clone https://huggingface.co/FoolDev/janus-27b && cd janus-27b
- make build # downloads ~17 GB GGUF and creates the model
- ollama run janus-27b
+ ollama run hf.co/FoolDev/janus-27b
  ```

- On tighter memory budgets, pass a smaller quant:
+ Or build locally if you want a different quant:

  ```bash
- make build QUANT=Q3_K_S # ~12 GB GGUF
+ git clone https://huggingface.co/FoolDev/janus-27b && cd janus-27b
+ make build # downloads ~17 GB Q4_K_M GGUF
+ make build QUANT=Q3_K_S # smaller ~12 GB quant
+ ollama run janus-27b
  ```

  For image input use llama.cpp directly — Ollama vision is broken for
@@ -116,7 +119,12 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
  | `CHANGELOG.md` | Versioned tooling/docs changes |
  | `README.md` | This file |

- This repo does **not** redistribute weights. Pull the upstream GGUF from [`unsloth/Qwen3.6-27B-GGUF`](https://huggingface.co/unsloth/Qwen3.6-27B-GGUF) or any other community quant, point the Modelfile at it, and `ollama create janus-27b -f Modelfile`.
+ This repo ships **one** GGUF `Janus-27B.Q4_K_M.gguf` (~17 GB) to
+ back the HF/Ollama "Use this model" widget (`ollama run
+ hf.co/FoolDev/janus-27b`). For other quants or local builds, pull from
+ [`unsloth/Qwen3.6-27B-GGUF`](https://huggingface.co/unsloth/Qwen3.6-27B-GGUF)
+ and `make build QUANT=...` — the Modelfile here is the same one Ollama
+ applies in either path.

  If you want the safetensors for `transformers`, fetch them from [`Qwen/Qwen3.6-27B`](https://huggingface.co/Qwen/Qwen3.6-27B).

@@ -137,9 +145,15 @@ If you want the safetensors for `transformers`, fetch them from [`Qwen/Qwen3.6-2

  ## Quick start

- ### Ollama (one-liner)
+ ### Ollama
+
+ Two paths:

  ```bash
+ # A. Pull straight from HF (uses the bundled Q4_K_M + this Modelfile):
+ ollama run hf.co/FoolDev/janus-27b
+
+ # B. Build locally (lets you pick the quant):
  make build # Q4_K_M -> janus-27b
  make build QUANT=Q3_K_S # smaller quant
  make build GGUF_PATH=~/models/Qwen3.6-27B-Q4_K_M.gguf # skip download