FoolDev Claude Opus 4.7 committed on
Commit d87bc64 · 1 Parent(s): ef3c5d9

feat: make heal-hf (rebadge a qwen36 hf.co/... pull in place)

`ollama run hf.co/FoolDev/Thanatos-27B` fails with the qwen36 500
(`unable to load model: <blob>`), and the recovery so far has been
`ollama rm <tag>` followed by `make load-bundle` to build a separate
`thanatos-27b` tag. That works but leaves the canonical
`hf.co/FoolDev/Thanatos-27B` name in a broken state and forces every
caller to use a different tag — easy to forget, easy to re-hit when
muscle memory types the HF form.

`scripts/heal_hf_pull.sh` rebadges the already-pulled blob in store
(qwen36 -> qwen35, metadata-only, byte-identical tensors via
`scripts/rename_arch.py`) and rewrites the manifest's model-layer
digest to point at the new blob. After the heal, the same
`hf.co/FoolDev/Thanatos-27B` tag loads via stock Ollama. Wired via
`make heal-hf`.
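
The manifest half of the heal can be sketched in Python. This is a minimal illustration, not the script's actual implementation (which uses jq). The `layers` / `mediaType` / `digest` / `size` shape and the model media type string come from the script; the function name `rewrite_model_layer` is a hypothetical helper introduced here.

```python
import copy

# Media type the script matches on when picking the model layer.
MODEL_MEDIA_TYPE = "application/vnd.ollama.image.model"


def rewrite_model_layer(manifest: dict, new_digest: str, new_size: int) -> dict:
    """Return a copy of `manifest` whose model layer points at the new blob.

    Layers with any other media type are copied through unchanged, and the
    input manifest is not mutated.
    """
    healed = copy.deepcopy(manifest)
    for layer in healed.get("layers", []):
        if layer.get("mediaType") == MODEL_MEDIA_TYPE:
            layer["digest"] = new_digest
            layer["size"] = new_size
    return healed
```

Returning a copy rather than editing in place mirrors the script's approach of writing a temp manifest, checking it, and only then replacing the original.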

The script:
1. Resolves the model blob and manifest path. Uses `ollama show
--modelfile <tag>` to read the FROM line — robust across the
case variants ollama preserves (the lowercase `thanatos-27b`
pull and the canonical `Thanatos-27B` pull register under
different manifest dirs).
2. Inspects general.architecture via gguf.GGUFReader. Skips
idempotently if already qwen35 / qwen35moe; refuses anything
else.
3. Runs scripts/rename_arch.py qwen36 -> qwen35 into
${ROOT}/.cache/thanatos-heal.<rand>.gguf. .cache/ rather than
/tmp because the rebadged copy is ~17 GB — a half-RAM tmpfs
/tmp blows up partway through (errno 50 on Arch with 32 GB
RAM). .cache/ is on the same filesystem as ~/.ollama on a
normal Linux home layout, so the final `mv` into blobs/ stays
an atomic same-filesystem rename.
4. Computes the rebadged blob's sha256 and either moves it into
${OLLAMA_MODELS}/blobs/sha256-<new> or — if a blob with that
hash already exists (e.g. from a prior `make load-bundle` run
against the same bundle) — reuses it without double-allocating
~17 GB. Content-addressed dedup means the second qwen36 -> qwen35
rebadge in a session is free.
5. Rewrites the manifest's model-layer digest + size via jq into a
temp JSON, sanity-checks the rewrite, then atomically moves it
into place over the original manifest.
6. Removes the old qwen36 blob if no other manifest references it.
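
Step 4 above (hash, then move or reuse) can be sketched in Python as follows. This is an illustration under assumptions, not the script itself: `sha256_of` and `stage_blob` are hypothetical names, and only the `sha256-<hash>` blob naming is taken from the script.

```python
import hashlib
import os
from pathlib import Path
from typing import Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so a multi-GB blob never sits in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_blob(tmp_blob: Path, blobs_dir: Path) -> Tuple[Path, bool]:
    """Place `tmp_blob` in the content-addressed store.

    Returns (final path, reused). If a blob with the same hash is already
    present, the temp copy is deleted and the existing blob is reused.
    """
    target = blobs_dir / f"sha256-{sha256_of(tmp_blob)}"
    if target.exists():
        tmp_blob.unlink()
        return target, True
    # os.replace is a rename: it raises OSError across filesystems instead
    # of silently falling back to a copy, so the same-filesystem assumption
    # is enforced rather than merely hoped for.
    os.replace(tmp_blob, target)
    return target, False
```

Because the target name is derived from the content, a second heal of the same bundle finds the blob already in place and costs one hash pass rather than another ~17 GB of disk.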

Verified end-to-end on this box: pulled `ollama run
hf.co/FoolDev/thanatos-27b:Q4_K_M` (fails with qwen36 500), ran
`make heal-hf`, dedup-reused an existing qwen35 blob from a prior
load-bundle, manifest rewrite landed, `MODEL=hf.co/FoolDev/thanatos-27b:Q4_K_M
make smoke-tools` passes (round-trip OK, no token leakage, tool-call
round-trip emits name=get_weather city=Tokyo). Old qwen36 blob was
removed since no other tag referenced it.
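
The "no other tag referenced it" check can be sketched in Python. The script does this with `grep -rlF` over the manifests directory; the functions below are hypothetical equivalents that do the same fixed-string search, assuming manifests are small files that can be read whole.

```python
from pathlib import Path
from typing import List


def manifests_referencing(manifests_dir: Path, digest: str) -> List[Path]:
    """List every manifest file whose bytes contain `digest` verbatim."""
    needle = digest.encode()
    hits = []
    for path in sorted(manifests_dir.rglob("*")):
        if path.is_file() and needle in path.read_bytes():
            hits.append(path)
    return hits


def remove_if_orphaned(blob: Path, manifests_dir: Path, digest: str) -> bool:
    """Delete `blob` only when no manifest still references `digest`.

    Returns True when the blob was treated as orphaned and removed.
    """
    if manifests_referencing(manifests_dir, digest):
        return False
    blob.unlink(missing_ok=True)
    return True
```

The order matters: the check has to run after the healed manifest has been written, otherwise the tag being healed would itself count as a reference to the old blob.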

README TL;DR Ollama section now lists three paths instead of two
(heal-hf for the already-pulled case, load-bundle for the
fresh-from-this-repo's-bundle case, build for the unsloth qwen35
alternative). New `scripts/heal_hf_pull.sh` row added to "What's
here". CHANGELOG entry at top of [Unreleased].

Once upstream adds the qwen36 arch entry, this script (and the
whole rebadge dance) can be deleted; the bundle works as-is.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Files changed (4)
  1. CHANGELOG.md +23 -0
  2. Makefile +4 -1
  3. README.md +19 -9
  4. scripts/heal_hf_pull.sh +173 -0
CHANGELOG.md CHANGED
@@ -8,6 +8,29 @@ and documentation**, not the underlying base model.
 ## [Unreleased]
 
 ### Added
+- `scripts/heal_hf_pull.sh` + `make heal-hf`: heal an already-pulled
+  `hf.co/FoolDev/Thanatos-27B:...` tag in-store by rebadging its
+  model blob (qwen36 → qwen35, metadata-only, byte-identical
+  tensors) and rewriting the manifest's model-layer digest. Covers
+  the user pain when `ollama run hf.co/FoolDev/Thanatos-27B` is
+  typed from muscle memory, fails with the qwen36 500, and leaves
+  ~17 GB of unloadable blob sitting in the store; before this, the
+  only recovery was `ollama rm <tag>` + switching to the separate
+  `thanatos-27b` tag that `make load-bundle` builds. `make heal-hf`
+  makes the same `hf.co/...` tag loadable in place. Idempotent
+  (tags already on qwen35 / qwen35moe are skipped);
+  content-addressed dedup means if the rebadged blob already exists
+  in the store (e.g. from a prior `make load-bundle` run) the heal
+  reuses it instead of double-allocating ~17 GB. Removes the old
+  qwen36 blob if no other manifest references it. Stages the
+  rebadge in `.cache/` rather than `/tmp` so the ~17 GB write
+  doesn't blow past tmpfs (`mv` into `blobs/` stays an atomic
+  same-filesystem rename on a normal Linux home-dir layout).
+- README TL;DR Ollama section now lists **three** paths: heal an
+  already-pulled HF tag (`make heal-hf`), build from the bundle
+  (`make load-bundle`), or bypass the bundle entirely
+  (`make build`). New `scripts/heal_hf_pull.sh` entry added to the
+  "What's here" table.
 - `scripts/load_bundle.sh` + `make load-bundle`: one-shot path from
   the qwen36-stamped bundle → loadable Ollama tag. Handles the LFS
   smudge (`hf download FoolDev/Thanatos-27B Thanatos-27B.Q4_K_M.gguf`
Makefile CHANGED
@@ -26,7 +26,7 @@ MODEL ?= $(TAG)
 
 PRECISION ?= F16
 
-.PHONY: help build load-bundle smoke smoke-tools bench check hooks mmproj clean
+.PHONY: help build load-bundle heal-hf smoke smoke-tools bench check hooks mmproj clean
 
 help: ## Show this help.
 	@awk 'BEGIN {FS = ":.*##"; printf "Targets:\n"} /^[a-zA-Z_-]+:.*?##/ { printf " \033[36m%-12s\033[0m %s\n", $$1, $$2 }' $(MAKEFILE_LIST)
@@ -43,6 +43,9 @@ build: ## Download qwen35-stamped GGUF from unsloth and run 'ollama create' (lo
 load-bundle: ## Load THIS repo's qwen36-stamped bundle (smudge LFS + rebadge to qwen35 + ollama create).
 	TAG=$(TAG) ./scripts/load_bundle.sh
 
+heal-hf: ## Heal an already-pulled hf.co/FoolDev/Thanatos-27B tag in-store (rebadge blob + manifest digest).
+	./scripts/heal_hf_pull.sh
+
 smoke: ## Verify the model is reachable and round-trips.
 	MODEL=$(MODEL) ./scripts/smoke_test.sh
 
README.md CHANGED
@@ -83,26 +83,35 @@ ollama run hf.co/FoolDev/Thanatos-27B # ~17 GB Q4_K_M, qwen36-stamped
 ```
 
 That command fails today with `unknown model architecture: 'qwen36'`
-because the bundle is qwen36-stamped. Two paths around it (both
-clone the repo first):
+because the bundle is qwen36-stamped. Three paths around it (all
+require this repo cloned):
 
 ```bash
 git clone https://huggingface.co/FoolDev/Thanatos-27B && cd Thanatos-27B
 
-# A. Load *this repo's* qwen36-stamped bundle (smudges LFS if needed,
-#    rebadges to qwen35, runs `ollama create thanatos-27b`):
+# A. Already ran the broken pull? Heal it in place — rebadges the
+#    already-downloaded blob's arch metadata + rewrites the manifest
+#    digest so `ollama run hf.co/FoolDev/Thanatos-27B` loads:
+make heal-hf
+ollama run hf.co/FoolDev/Thanatos-27B
+
+# B. Haven't pulled yet — load *this repo's* qwen36-stamped bundle
+#    via the rebadge helper (smudges LFS if needed, rebadges
+#    qwen36 → qwen35, runs `ollama create thanatos-27b`):
 make load-bundle
 ollama run thanatos-27b
 
-# B. Bypass the bundle entirely: download a qwen35-stamped GGUF from
-#    unsloth (loads on every current llama.cpp / Ollama):
+# C. Bypass the bundle: download a qwen35-stamped GGUF from unsloth
+#    and build locally. Loads on every current llama.cpp / Ollama.
 make build # Q4_K_M from unsloth
-make build QUANT=Q5_K_M # higher quality
+make build QUANT=Q3_K_S # 12 GB smaller quant
+make build QUANT=Q5_K_M # 20 GB higher quality
 ollama run thanatos-27b
 ```
 
-Once upstream adds the qwen36 arch entry, both paths collapse to the
-direct `ollama run hf.co/FoolDev/Thanatos-27B` one-liner above.
+Once upstream adds the qwen36 arch entry, all three paths collapse
+to the direct `ollama run hf.co/FoolDev/Thanatos-27B` one-liner
+above.
 
 For other quants (Q3_K_S ~12 GB, Q5_K_M ~20 GB, etc.), `make build
 QUANT=Q3_K_S` is the simplest path. See [Quick start](#quick-start)
@@ -142,6 +151,7 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
 | `examples/` | Ready-to-run Python clients for Ollama, Transformers, and llama-cpp-python |
 | `scripts/build.sh` | Pulls a qwen35-stamped GGUF from `unsloth/Qwen3.6-27B-GGUF` and runs `ollama create` (loads on today's llama.cpp / Ollama; see `make build`) |
 | `scripts/load_bundle.sh` | One-shot path from *this repo's* qwen36-stamped bundle → loadable Ollama tag (smudges LFS pointer via `hf download` if needed, rebadges qwen36 → qwen35, runs `ollama create`; see `make load-bundle`) |
+| `scripts/heal_hf_pull.sh` | Heal an already-pulled `hf.co/FoolDev/Thanatos-27B:...` tag in-store: rebadges its model blob qwen36 → qwen35 and rewrites the manifest's model-layer digest so the same tag becomes loadable in place. Use after `ollama run hf.co/FoolDev/Thanatos-27B` has failed once and left ~17 GB in the blob store; see `make heal-hf`. Idempotent — tags already on qwen35 are skipped. |
 | `scripts/smoke_test.sh` | Verifies an Ollama daemon + model, runs a round-trip, asserts no chat-template tokens leak into the response. With `TOOLS_TEST=1`, also exercises an end-to-end tool-call round-trip and checks the response shape |
 | `scripts/bench.sh` | Measures real tok/s using Ollama's `eval_count` / `eval_duration` metadata over a 3-prompt mix (run `make bench`) |
 | `scripts/fetch_vision.sh` | Pulls the vision projector (`mmproj-F16.gguf`) for llama.cpp (Ollama vision is broken upstream — see [Vision](#vision)). Renamed from `fetch_mmproj.sh` because HF's Ollama bridge auto-indexed the script as a vision projector layer (filename pattern match). |
scripts/heal_hf_pull.sh ADDED
@@ -0,0 +1,173 @@
+#!/usr/bin/env bash
+# Thanatos-27B — heal a freshly pulled HF-bridge tag whose bundled GGUF
+# is `qwen36`-stamped.
+#
+# Background. `ollama run hf.co/FoolDev/Thanatos-27B` (or any other
+# qwen36-stamped HF-bridge tag of this repo) pulls a fresh copy of the
+# bundled GGUF every time. Until upstream registers the `qwen36` arch,
+# every such pull fails with `unable to load model: <blob>` (see
+# README "Architecture"). `make load-bundle` works around this by
+# building a *separate* local `thanatos-27b` tag from a rebadged copy,
+# but the canonical HF-bridge tag stays broken.
+#
+# This script rebadges the HF-bridge tag's model blob in-place
+# (qwen36 -> qwen35, metadata-only, byte-identical tensors) and
+# rewrites the manifest's model-layer digest to point at the new
+# blob. After running it, `ollama run hf.co/FoolDev/Thanatos-27B`
+# loads.
+#
+# Idempotent: a tag already on qwen35 / qwen35moe is left untouched.
+# Re-runnable after a fresh HF pull (the pull resets the manifest
+# digest back to the qwen36 blob).
+#
+# Once upstream adds the qwen36 arch entry this script (and the
+# whole rebadge dance) can be deleted; the bundle works as-is.
+#
+# Usage:
+#   ./scripts/heal_hf_pull.sh # default tag
+#   TAG=hf.co/FoolDev/Thanatos-27B:Q4_K_M ./scripts/heal_hf_pull.sh
+#
+# Requires: ollama, jq, python3 with the `gguf` package, sha256sum.
+set -euo pipefail
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+TAG="${TAG:-hf.co/FoolDev/Thanatos-27B:Q4_K_M}"
+OLLAMA_MODELS="${OLLAMA_MODELS:-${HOME}/.ollama/models}"
+
+red() { printf "\033[31m%s\033[0m\n" "$*"; }
+green() { printf "\033[32m%s\033[0m\n" "$*"; }
+blue() { printf "\033[34m%s\033[0m\n" "$*"; }
+
+blue "[*] tag: ${TAG}"
+blue "[*] store: ${OLLAMA_MODELS}"
+
+# ---- 1. Sanity ---------------------------------------------------------------
+
+for bin in ollama jq python3 sha256sum; do
+  if ! command -v "${bin}" >/dev/null 2>&1; then
+    red "[!] missing dependency: ${bin}"; exit 1
+  fi
+done
+
+# ---- 2. Locate the model blob and manifest ----------------------------------
+
+# `ollama show --modelfile` writes a FROM line with the absolute blob path.
+# Reliable regardless of which case variant the user pulled with
+# (hf.co's 307 lets `Thanatos-27B` and `thanatos-27b` both resolve to the
+# canonical repo, and ollama stores the manifest under whichever case
+# was first registered).
+MODEL_BLOB="$(ollama show --modelfile "${TAG}" 2>/dev/null | awk '/^FROM[[:space:]]/ {print $2; exit}')"
+if [[ -z "${MODEL_BLOB}" || ! -f "${MODEL_BLOB}" ]]; then
+  red "[!] could not resolve model blob for tag '${TAG}'."
+  red " Is the tag pulled? Try: ollama pull ${TAG}"
+  exit 1
+fi
+MODEL_HASH="$(basename "${MODEL_BLOB}" | sed 's/^sha256-//')"
+blue "[*] blob: ${MODEL_BLOB}"
+
+# Find the manifest by grepping for the model digest. The blob is
+# referenced from exactly one tag in the heal scenario — fresh HF pull
+# of a single :Q4_K_M tag — but if someone has multiple tags pointing
+# at the same blob, we filter down to the one matching ${TAG}.
+TAG_PATH="${TAG#hf.co/}" # FoolDev/Thanatos-27B:Q4_K_M
+NAMESPACE_PATH="${TAG_PATH%:*}" # FoolDev/Thanatos-27B
+TAG_FILE="${TAG_PATH##*:}" # Q4_K_M
+
+MANIFEST="$(find "${OLLAMA_MODELS}/manifests/hf.co" \
+  -type f \
+  -ipath "*/${NAMESPACE_PATH}/${TAG_FILE}" 2>/dev/null | head -1)"
+
+if [[ -z "${MANIFEST}" || ! -f "${MANIFEST}" ]]; then
+  red "[!] manifest not found under ${OLLAMA_MODELS}/manifests/hf.co for tag '${TAG}'."
+  exit 1
+fi
+blue "[*] manifest: ${MANIFEST}"
+
+# ---- 3. Inspect arch ---------------------------------------------------------
+
+ARCH="$(python3 - "${MODEL_BLOB}" <<'PY'
+import sys
+from gguf import GGUFReader, constants
+r = GGUFReader(sys.argv[1], "r")
+f = r.get_field(constants.Keys.General.ARCHITECTURE)
+print(bytes(f.parts[f.data[0]]).decode())
+PY
+)"
+blue "[*] arch: ${ARCH}"
+
+if [[ "${ARCH}" == "qwen35" || "${ARCH}" == "qwen35moe" ]]; then
+  green "[=] already on a loadable arch (${ARCH}) — nothing to heal."
+  exit 0
+fi
+if [[ "${ARCH}" != "qwen36" ]]; then
+  red "[!] unexpected arch '${ARCH}' — refusing to heal. Edit this script if intentional."
+  exit 1
+fi
+
+# ---- 4. Rebadge to a temp blob and stage it in the store --------------------
+
+# Stage in the repo's .cache/ rather than /tmp: the rebadged copy is the same
+# size as the original (~17 GB), which blows past a typical tmpfs /tmp budget.
+# .cache/ is on the same filesystem as ~/.ollama on a normal Linux home dir
+# layout, so the final move into blobs/ is an atomic rename, not a copy.
+SCRATCH_DIR="${ROOT}/.cache"
+mkdir -p "${SCRATCH_DIR}"
+TMP_BLOB="$(mktemp -p "${SCRATCH_DIR}" thanatos-heal.XXXXXX.gguf)"
+trap 'rm -f "${TMP_BLOB}"' EXIT
+blue "[*] rebadging qwen36 -> qwen35 (metadata only, tensors byte-identical) ..."
+python3 "${ROOT}/scripts/rename_arch.py" \
+  --from-arch qwen36 --to-arch qwen35 \
+  "${MODEL_BLOB}" "${TMP_BLOB}"
+
+NEW_HASH="$(sha256sum "${TMP_BLOB}" | awk '{print $1}')"
+NEW_SIZE="$(stat -c '%s' "${TMP_BLOB}")"
+NEW_BLOB="${OLLAMA_MODELS}/blobs/sha256-${NEW_HASH}"
+blue "[*] new digest: sha256:${NEW_HASH}"
+blue "[*] new size: ${NEW_SIZE}"
+
+if [[ -f "${NEW_BLOB}" ]]; then
+  blue "[=] target blob already in store — reusing."
+  rm -f "${TMP_BLOB}"
+else
+  mv "${TMP_BLOB}" "${NEW_BLOB}"
+fi
+trap - EXIT
+
+# ---- 5. Rewrite the manifest's model layer ----------------------------------
+
+TMP_MANIFEST="$(mktemp -t thanatos-heal-manifest.XXXXXX.json)"
+trap 'rm -f "${TMP_MANIFEST}"' EXIT
+jq --arg new "sha256:${NEW_HASH}" \
+   --argjson size "${NEW_SIZE}" '
+  .layers |= map(
+    if .mediaType == "application/vnd.ollama.image.model"
+    then .digest = $new | .size = $size
+    else .
+    end
+  )
+' "${MANIFEST}" > "${TMP_MANIFEST}"
+
+NEW_DIGEST_IN_MANIFEST="$(jq -r '
+  .layers[] | select(.mediaType == "application/vnd.ollama.image.model") | .digest
+' "${TMP_MANIFEST}")"
+if [[ "${NEW_DIGEST_IN_MANIFEST}" != "sha256:${NEW_HASH}" ]]; then
+  red "[!] manifest rewrite failed (digest mismatch); not committing."
+  exit 1
+fi
+mv "${TMP_MANIFEST}" "${MANIFEST}"
+trap - EXIT
+
+# ---- 6. Remove the old qwen36 blob if no other manifest references it -------
+
+OLD_DIGEST="sha256:${MODEL_HASH}"
+if ! grep -rlF -- "${OLD_DIGEST}" "${OLLAMA_MODELS}/manifests/" >/dev/null 2>&1; then
+  blue "[*] no other manifest references the old qwen36 blob — removing ${MODEL_BLOB}"
+  rm -f "${MODEL_BLOB}"
+else
+  blue "[=] old qwen36 blob still referenced by another manifest — leaving in place."
+fi
+
+echo
+green "[+] healed. Try it:"
+echo " ollama run ${TAG}"
+echo " MODEL=${TAG} make smoke"
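
The tag splitting and case-insensitive manifest lookup from section 2 of the script can be sketched in Python. This is an illustration, not part of the commit: `split_tag` and `find_manifest` are hypothetical names, and the `manifests/hf.co/<namespace>/<tag>` layout is assumed from the script's `find` invocation. One deliberate difference: a tag without `:` raises here, whereas the shell parameter expansions would silently return the whole string for both parts.

```python
from pathlib import Path
from typing import Optional, Tuple


def split_tag(tag: str, host: str = "hf.co") -> Tuple[str, str]:
    """Split 'hf.co/Owner/Repo:QUANT' into ('Owner/Repo', 'QUANT')."""
    rest = tag[len(host) + 1:] if tag.startswith(host + "/") else tag
    namespace, sep, tag_file = rest.rpartition(":")
    if not sep:
        raise ValueError(f"tag has no ':<quant>' suffix: {tag!r}")
    return namespace, tag_file


def find_manifest(models_dir: Path, tag: str) -> Optional[Path]:
    """Locate the manifest file for `tag`, ignoring case in the path.

    Mirrors `find ... -ipath "*/<namespace>/<tag>"`: the first file whose
    lowercased path ends with the lowercased namespace and tag wins.
    """
    namespace, tag_file = split_tag(tag)
    suffix = f"/{namespace}/{tag_file}".lower()
    root = models_dir / "manifests" / "hf.co"
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.as_posix().lower().endswith(suffix):
            return path
    return None
```

Matching case-insensitively is what lets a tag typed as `Thanatos-27B` resolve to a manifest stored under `thanatos-27b`, or the reverse.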