FoolDev Claude Opus 4.7 committed on
Commit 1c88b41 · 1 Parent(s): 964e418

docs: align README + CHANGELOG with the qwen35 re-stamp (964e418)

964e418 (HF-direct `hf upload`) re-stamped the bundled
Thanatos-27B.Q4_K_M.gguf from `general.architecture: 'qwen36'`
back to `'qwen35'`, byte-identical tensors. This commit catches
the docs up:

- README "Heads up" callout removed. It warned about a load
failure that no longer exists, so it was actively misleading.
- README TL;DR collapsed from "three paths around it" back to a
single one-liner. `ollama run hf.co/FoolDev/Thanatos-27B` now
loads on stock Ollama. `make heal-hf` kept as a one-sentence
callout for anyone who already pulled the v0.6.0 (qwen36)
bundle and has the broken blob in their store.
- CHANGELOG [Unreleased] gets a Changed entry documenting the
re-flip + the new bundle blob sha
(5ed60d0af4650a854b1755bd392f9aef4872643dc25a254bc68043fa638392a0,
same hash `make load-bundle` / `make heal-hf` have been
producing locally all session).
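
One way to confirm that a local copy of the bundle matches the blob SHA quoted above is to hash the file and compare. This is a minimal sketch: it assumes the digest is a SHA-256 over the whole GGUF file (the commit does not say so explicitly, though the 64-hex-digit length is consistent with it), and the file path and function names are illustrative.

```python
import hashlib
from pathlib import Path

# Digest quoted in the commit message; ASSUMED to be SHA-256 of the whole file.
EXPECTED_SHA256 = "5ed60d0af4650a854b1755bd392f9aef4872643dc25a254bc68043fa638392a0"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA-256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the GGUF actually lives.
    bundle = Path("Thanatos-27B.Q4_K_M.gguf")
    if bundle.exists():
        print("match" if matches_expected(bundle) else "MISMATCH")
    else:
        print(f"{bundle} not found")
```

Chunked reading keeps memory use flat, which matters for a file of roughly 17 GB.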

Intentionally left stale this commit (follow-up):

- README "Architecture" section still has the long qwen36
narrative + "upstream-may-never-adopt" subagent note.
- Quick start Ollama section still has option A "fails today"
warning.
- examples/README still leads with `make heal-hf` after pull.
- qwen36 mentions in the Modelfile preamble, if any.

Those don't actively mislead: the workarounds still work and
are still wired up. They're just over-prominent for a model
that now loads straight. A separate pass can collapse them.
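
For anyone unsure which stamp a local copy of the bundle carries, the `general.architecture` key can be read straight from the GGUF header. The following is a minimal sketch and not the repo's `scripts/rename_arch.py`: it assumes a little-endian GGUF of version 2 or later, and every function name in it is illustrative.

```python
import struct

# Fixed-size scalar value types in GGUF metadata: type id -> struct format.
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9


def _read(fmt, stream):
    """Read one little-endian scalar described by a struct format char."""
    size = struct.calcsize("<" + fmt)
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack("<" + fmt, data)[0]


def _read_string(stream):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _read("Q", stream)
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8")


def _read_value(value_type, stream):
    """Decode one metadata value (also used to step past unwanted keys)."""
    if value_type in _SCALARS:
        return _read(_SCALARS[value_type], stream)
    if value_type == _STRING:
        return _read_string(stream)
    if value_type == _ARRAY:
        item_type = _read("I", stream)
        count = _read("Q", stream)
        return [_read_value(item_type, stream) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {value_type}")


def read_architecture(stream):
    """Return the general.architecture string, or None if the key is absent."""
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read("I", stream)
    if version < 2:
        raise ValueError("this sketch only handles GGUF v2 or later")
    _read("Q", stream)              # tensor count, not needed here
    kv_count = _read("Q", stream)
    for _ in range(kv_count):
        key = _read_string(stream)
        value = _read_value(_read("I", stream), stream)
        if key == "general.architecture":
            return value
    return None
```

Usage would look like `with open("Thanatos-27B.Q4_K_M.gguf", "rb") as f: print(read_architecture(f))`. The function only reads the metadata section, so it never touches the tensor data.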

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Files changed (2)
  1. CHANGELOG.md +35 -0
  2. README.md +8 -36
CHANGELOG.md CHANGED
@@ -7,6 +7,41 @@ and documentation**, not the underlying base model.
 
 ## [Unreleased]
 
+### Changed
+- **Bundle re-stamped `general.architecture: 'qwen36'` → `'qwen35'`**
+in commit 964e418 (HF-direct `hf upload`). Reverses the v0.6.0
+`qwen36` stamp from e1f78fa, which itself flipped qwen35 → qwen36
+deliberately. The proximate trigger: today's subagent recheck
+found no PR or tracking issue for a `qwen36` arch entry in either
+`ggml-org/llama.cpp` or `ollama/ollama`, and the actual Qwen 3.6
+family upstream already loads under `qwen35moe`, so the qwen36
+stamp was going to stay ahead-of-upstream indefinitely, not for a
+release or two. Day-to-day cost (every fresh
+`ollama run hf.co/FoolDev/Thanatos-27B` hitting a 500, every user
+needing `make heal-hf` to recover) outweighed the
+architectural-honesty benefit. Tensor data stays byte-identical;
+only the `general.architecture` KV (and the namespaced KV keys)
+flips. New bundle blob SHA is
+`5ed60d0af4650a854b1755bd392f9aef4872643dc25a254bc68043fa638392a0`,
+the same hash `make load-bundle` and `make heal-hf` have been
+producing locally this whole session.
+- README "Heads up" callout (qwen36 load-failure warning) removed.
+TL;DR collapsed back from "three paths around it" to a single
+one-liner: `ollama run hf.co/FoolDev/Thanatos-27B` now loads on
+stock Ollama. `make heal-hf` is mentioned as the recovery path
+for anyone who pulled the v0.6.0 bundle and still has the qwen36
+blob in their store.
+
+### Note
+- `make heal-hf`, `make load-bundle`, and `scripts/rename_arch.py`
+stay in the repo and stay useful. heal-hf is idempotent on qwen35
+(skips early); load-bundle does a no-op rebadge on a qwen35 bundle
+and runs `ollama create` as normal. The deeper Architecture /
+Quick start / examples-README narrative still talks about the
+qwen36 stamp and the workaround dance; left intentionally stale
+in this commit for follow-up cleanup, since the immediate user
+pain (load 500 on fresh pull) is what the upload commit fixed.
+
 ### Added
 - `scripts/check.sh` now greps for the `VAR="$(cmd 2>/dev/null | filter)"`
 silent-exit pattern that bit `heal_hf_pull.sh` (commit 385ed94).
README.md CHANGED
@@ -61,16 +61,6 @@ pipeline_tag: image-text-to-text
 
 A personal sibling to [`FoolDev/Janus-35B`](https://huggingface.co/FoolDev/Janus-35B). Same teacher (Claude Opus 4.7), same dataset family, but built on the **dense** [Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B) base instead of the 35B-A3B MoE. Smaller, easier to deploy, no expert-routing surprises.
 
-> ⚠️ **Heads up: bundle is stamped `qwen36`.** As of 2026-05-19 the
-> bundled GGUF declares `general.architecture: 'qwen36'`, which no
-> released llama.cpp / Ollama recognizes yet. `ollama run
-> hf.co/FoolDev/Thanatos-27B` and `llama-server -m
-> Thanatos-27B.Q4_K_M.gguf` both fail today with `unknown model
-> architecture: 'qwen36'`. Workaround in one line:
-> `git clone … && cd Thanatos-27B && make load-bundle && ollama run
-> thanatos-27b`. Details in [Architecture](#architecture). Once
-> upstream ships qwen36 the workaround disappears.
-
 ## TL;DR
 
 One-liner via Hugging Face (pulls a GGUF + this repo's root-level
@@ -79,40 +69,22 @@ template - HF's Ollama bridge ingests those three files, not
 `Modelfile`):
 
 ```bash
-ollama run hf.co/FoolDev/Thanatos-27B # ~17 GB Q4_K_M, qwen36-stamped (see Heads-up above)
+ollama run hf.co/FoolDev/Thanatos-27B # ~17 GB Q4_K_M, qwen35-stamped, loads on stock Ollama
 ```
 
-That command fails today with `unknown model architecture: 'qwen36'`
-because the bundle is qwen36-stamped. Three paths around it (all
-require this repo cloned):
+If you pulled the bundle while it was qwen36-stamped (v0.6.0
+through the early hours of 2026-05-19) the load will still 500 on
+that stale blob; `make heal-hf` rebadges it in place. Fresh
+pulls after this fix go straight through.
+
+For other quants (Q3_K_S ~12 GB, Q5_K_M ~20 GB, etc.):
 
 ```bash
 git clone https://huggingface.co/FoolDev/Thanatos-27B && cd Thanatos-27B
-
-# A. Already ran the broken pull? Heal it in place: rebadges the
-# already-downloaded blob's arch metadata + rewrites the manifest
-# digest so `ollama run hf.co/FoolDev/Thanatos-27B` loads:
-make heal-hf
-ollama run hf.co/FoolDev/Thanatos-27B
-
-# B. Haven't pulled yet: load *this repo's* qwen36-stamped bundle
-# via the rebadge helper (smudges LFS if needed, rebadges
-# qwen36 → qwen35, runs `ollama create thanatos-27b`):
-make load-bundle
-ollama run thanatos-27b
-
-# C. Bypass the bundle: download a qwen35-stamped GGUF from unsloth
-# and build locally. Loads on every current llama.cpp / Ollama.
-make build # Q4_K_M from unsloth
-make build QUANT=Q3_K_S # 12 GB smaller quant
-make build QUANT=Q5_K_M # 20 GB higher quality
+make build QUANT=Q3_K_S # downloads from unsloth/Qwen3.6-27B-GGUF
 ollama run thanatos-27b
 ```
 
-Once upstream adds the qwen36 arch entry, all three paths collapse
-to the direct `ollama run hf.co/FoolDev/Thanatos-27B` one-liner
-above.
-
 For other quants (Q3_K_S ~12 GB, Q5_K_M ~20 GB, etc.), `make build
 QUANT=Q3_K_S` is the simplest path. See [Quick start](#quick-start)
 below for the full matrix.