FoolDev committed on
Commit
bc0cbc6
·
1 Parent(s): c843f11

Rename Janus-27B → Thanatos-27B + update artwork


Splits the dense 27B off from the Janus family naming. The 35B-A3B MoE
sibling (FoolDev/janus) keeps its name; this repo becomes its own
identity. In-repo rewrite: README, Modelfile, params/system/template,
Makefile, CITATION.cff, .gitignore, examples/, scripts/. Bundled GGUF
git-mv'd Janus-27B.Q4_K_M.gguf → Thanatos-27B.Q4_K_M.gguf (LFS pointer
preserved, no re-upload). Default Ollama tag flips
janus-27b → thanatos-27b. System prompt identity flips
"You are Janus" → "You are Thanatos" everywhere.

Sibling references back to FoolDev/janus and Janus-35B-A3B are
preserved on purpose.
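
A minimal sketch of how the rename could be spot-checked, assuming a simple line-based scan is enough. The pattern list and the `stale_lines` helper are illustrative and not part of this repo. The intentional sibling references (`FoolDev/janus`, `Janus-35B-A3B`) are not matched, but historical CHANGELOG entries that mention the old GGUF names would be, so hits need a manual look.

```python
import re

# Old 27B identifiers that should be gone after the rename (assumed list).
# Sibling names such as FoolDev/janus and Janus-35B-A3B do not match.
STALE = re.compile(r"janus-27b|you are janus|janus_system", re.IGNORECASE)


def stale_lines(text: str) -> list[str]:
    """Return the lines that still mention the old 27B identity."""
    return [line for line in text.splitlines() if STALE.search(line)]


sample = "\n".join([
    "ollama run hf.co/FoolDev/janus-27b",
    "Sibling: https://huggingface.co/FoolDev/janus",
    "| Thanatos-27B (this) | Janus-35B |",
    "FROM ./Janus-27B.Q4_K_M.gguf",
])
hits = stale_lines(sample)
for line in hits:
    print("stale:", line)
```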

banner.svg wordmark JANUS-27B → THANATOS-27B; font-size dropped 26 → 22
to keep the longer 12-char wordmark inside the same 383×77 viewBox
without encroaching on the activation-grid dots. Tokyo Night palette
unchanged. banner.png regenerated via rsvg-convert. dense-flow.svg has
no wordmark and is left unchanged.
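
The PNG regeneration might look like the sketch below. The commit only says `rsvg-convert` was used, so the exact flags here are an assumption; `--width`, `--height`, and `--output` are standard `rsvg-convert` options, and `rsvg_command` is a hypothetical helper.

```python
import os
import shutil
import subprocess


def rsvg_command(svg_path: str, png_path: str,
                 width: int = 383, height: int = 77) -> list[str]:
    """Build an rsvg-convert invocation that rasterizes svg_path to png_path."""
    return [
        "rsvg-convert",
        "--width", str(width),
        "--height", str(height),
        "--output", png_path,
        svg_path,
    ]


cmd = rsvg_command("banner.svg", "banner.png")

# Only run when the tool and the input file are actually present.
if shutil.which("rsvg-convert") and os.path.exists("banner.svg"):
    subprocess.run(cmd, check=True)
```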

Also adds the top-row Buy Me a Coffee badge (Tokyo Night yellow,
e0af68 on 1a1b26 labelColor) linking to buymeacoffee.com/Thanatos-27B.

HF-side action still required (cannot be done from this repo): rename
the HF repository FoolDev/janus-27b → FoolDev/thanatos-27b in the HF
Settings UI. Existing ollama pull hf.co/FoolDev/janus-27b callers will
404 after that — one-time URL break, unavoidable.
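
As an alternative to the Settings UI, the rename can probably be scripted with `huggingface_hub`, whose `move_repo` function renames or transfers a repository. This is a sketch under assumptions: it needs `huggingface_hub` installed and a token with write access to both names, and `CONFIRM_RENAME` is a hypothetical guard variable introduced here so the call never fires by accident.

```python
import os

OLD_ID = "FoolDev/janus-27b"
NEW_ID = "FoolDev/thanatos-27b"


def rename_repo(old_id: str = OLD_ID, new_id: str = NEW_ID) -> None:
    """Rename a model repo on the Hub; needs a token with write access."""
    # Imported lazily so this file loads without huggingface_hub installed.
    from huggingface_hub import move_repo

    move_repo(from_id=old_id, to_id=new_id, repo_type="model")


def pull_command(repo_id: str) -> str:
    """Return the Ollama pull command for a Hub repo."""
    return f"ollama pull hf.co/{repo_id}"


# CONFIRM_RENAME is a hypothetical guard variable, not part of any tool.
if os.environ.get("CONFIRM_RENAME") == "1":
    rename_repo()

print("old:", pull_command(OLD_ID))
print("new:", pull_command(NEW_ID))
```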

.gitignore CHANGED
@@ -7,10 +7,10 @@ venv/

  # Local model weights. We don't redistribute the upstream Qwen GGUFs
  # here — `make build` fetches one from unsloth/Qwen3.6-27B-GGUF locally.
- # The single Janus-27B.*.gguf we DO ship backs the HF/Ollama
- # "Use this model" widget (ollama run hf.co/FoolDev/janus-27b).
+ # The single Thanatos-27B.*.gguf we DO ship backs the HF/Ollama
+ # "Use this model" widget (ollama run hf.co/FoolDev/thanatos-27b).
  *.gguf
- !Janus-27B.*.gguf
+ !Thanatos-27B.*.gguf
  *.safetensors
  *.bin
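
The ignore-then-negate pair works because the last matching pattern wins. The sketch below models only that rule for bare filenames, using `fnmatchcase` as a stand-in for git's matcher; it is a simplification of real `.gitignore` semantics (no directories, no anchoring), and `is_ignored` is an illustrative helper.

```python
from fnmatch import fnmatchcase

# Patterns from the updated .gitignore hunk, in file order.
PATTERNS = ["*.gguf", "!Thanatos-27B.*.gguf", "*.safetensors", "*.bin"]


def is_ignored(name: str, patterns: list[str] = PATTERNS) -> bool:
    """Apply patterns in order; the last match decides, '!' re-includes."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        if fnmatchcase(name, pattern[1:] if negated else pattern):
            ignored = not negated
    return ignored


for candidate in ["Thanatos-27B.Q4_K_M.gguf", "Janus-27B.Q4_K_M.gguf",
                  "Qwen3.6-27B-Q4_K_M.gguf", "README.md"]:
    print(candidate, "ignored" if is_ignored(candidate) else "tracked")
```

Under this model the old `Janus-27B.*.gguf` name falls back to being ignored once the negation is renamed, which is consistent with the bundled file being moved rather than copied.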
 
CHANGELOG.md CHANGED
@@ -7,6 +7,45 @@ and documentation**, not the underlying base model.

  ## [Unreleased]

+ ### Renamed
+ - **Project renamed `Janus-27B` → `Thanatos-27B`.** The 35B-A3B MoE
+ sibling (`FoolDev/janus`) keeps the Janus name; this repo splits off
+ as its own identity. In-repo rewrite covered: `README.md`,
+ `Modelfile`, `params` / `system` / `template`, `Makefile`,
+ `CITATION.cff`, `.gitignore`, `examples/**`, `scripts/**`. The
+ bundled GGUF was renamed `Janus-27B.Q4_K_M.gguf` →
+ `Thanatos-27B.Q4_K_M.gguf` (`git mv`, LFS pointer preserved). Default
+ Ollama tag flipped `janus-27b` → `thanatos-27b`. System prompt
+ identity flipped "You are Janus" → "You are Thanatos" everywhere
+ it appears (`Modelfile`, `system`, `examples/*.py`). Sibling
+ references back to `FoolDev/janus` and `Janus-35B-A3B` are
+ preserved on purpose — those still point at the MoE.
+ - **HF-side action still required (cannot be done from this repo):**
+ rename the HF repository `FoolDev/janus-27b` → `FoolDev/thanatos-27b`
+ in HF Settings → "Rename or transfer this repository". One-time URL
+ break is unavoidable; existing
+ `ollama pull hf.co/FoolDev/janus-27b` callers will 404 after.
+
+ ### Changed
+ - `banner.svg` wordmark `JANUS-27B` → `THANATOS-27B`. Font-size dropped
+ 26 → 22 to keep the longer 12-char wordmark inside the same 383×77
+ viewBox without encroaching on the activation-grid dots in the upper
+ right. Tokyo Night palette unchanged: `#c0caf5` mark, `#bb9af7`
+ highlight on the suffix, gradient backdrop, animated activation grid
+ + token-stream beam preserved.
+ - `banner.png` regenerated from the updated `banner.svg` via
+ `rsvg-convert` at the same 383×77 dimensions.
+ - `dense-flow.svg` left unchanged — it has no wordmark, only the
+ 64-layer hybrid-attention pulse visualization.
+
+ ### Added
+ - README top-row badge linking to
+ [`buymeacoffee.com/Thanatos-27B`](https://buymeacoffee.com/Thanatos-27B).
+ Tokyo Night yellow (`e0af68` on `1a1b26` labelColor) so it sits
+ alongside the existing License/Base/Arch/Sibling badges without
+ visual fight. Tip-jar only — no functional change to the model or
+ tooling.
+
  ### Removed
  - `Janus-27B.Q3_K_S.gguf` no longer redistributed in this repo.
  Removing it leaves `Janus-27B.Q4_K_M.gguf` as the only GGUF, which
CITATION.cff CHANGED
@@ -1,14 +1,14 @@
  cff-version: 1.2.0
- title: "Janus-27B: A Dense Distillation Wrapper for Qwen 3.6 27B"
+ title: "Thanatos-27B: A Dense Distillation Wrapper for Qwen 3.6 27B"
  message: "If you use this model card or its accompanying files, please cite as below."
  type: software
  authors:
  - name: FoolDev
  website: "https://huggingface.co/FoolDev"
- repository-code: "https://huggingface.co/FoolDev/janus-27b"
- url: "https://huggingface.co/FoolDev/janus-27b"
+ repository-code: "https://huggingface.co/FoolDev/thanatos-27b"
+ url: "https://huggingface.co/FoolDev/thanatos-27b"
  abstract: >-
- Janus-27B is a personal repackaging of the dense Qwen 3.6 27B base model
+ Thanatos-27B is a personal repackaging of the dense Qwen 3.6 27B base model
  with Claude Opus 4.7 in the reasoning teacher slot. The repository ships
  an Ollama Modelfile, sampling defaults, usage examples, and a single
  ready-to-run GGUF (Q4_K_M ~17 GB) so the HF "Use this model" widget
Makefile CHANGED
@@ -1,11 +1,11 @@
- # Janus-27B convenience Makefile.
+ # Thanatos-27B convenience Makefile.
  #
  # All work is delegated to scripts/* — this file just gives common
  # operations short, discoverable names.
  #
  # Variables you can override on the command line:
  # QUANT GGUF quant suffix (default: Q4_K_M)
- # TAG Ollama model tag (default: janus-27b)
+ # TAG Ollama model tag (default: thanatos-27b)
  # GGUF_PATH path to existing GGUF (skip the download)
  # MODEL model tag for smoke (default: $(TAG))
  #
@@ -18,7 +18,7 @@
  # make clean

  QUANT ?= Q4_K_M
- TAG ?= janus-27b
+ TAG ?= thanatos-27b
  MODEL ?= $(TAG)

  .DEFAULT_GOAL := help
Modelfile CHANGED
@@ -1,15 +1,15 @@
- # Janus-27B — Ollama wrapper around Qwen 3.6 27B (dense)
+ # Thanatos-27B — Ollama wrapper around Qwen 3.6 27B (dense)
  #
  # Text + tool calling. Vision via Ollama is currently broken for this
  # architecture (ollama/ollama#15898 — the vendored llama.cpp fork is
  # missing the qwen35 arch entries). Use llama.cpp directly for image
  # input, or wait for the fix. See the Vision section in README.md.
  #
- # This repo bundles a single GGUF: Janus-27B.Q4_K_M.gguf (~17 GB).
+ # This repo bundles a single GGUF: Thanatos-27B.Q4_K_M.gguf (~17 GB).
  # The FROM line below points at it, so a fresh clone (with LFS smudge
  # enabled) supports the no-script path:
  #
- # ollama create janus-27b -f Modelfile && ollama run janus-27b
+ # ollama create thanatos-27b -f Modelfile && ollama run thanatos-27b
  #
  # For other quants (Q3_K_S, Q5_K_M, Q6_K, etc.), `make build QUANT=Q3_K_S`
  # downloads the chosen quant from unsloth/Qwen3.6-27B-GGUF and patches
@@ -21,7 +21,7 @@
  # https://huggingface.co/unsloth/Qwen3.6-27B-GGUF
  # https://huggingface.co/rico03/Qwen3.6-27B-Claude-Opus-Reasoning-Distilled-GGUF

- FROM ./Janus-27B.Q4_K_M.gguf
+ FROM ./Thanatos-27B.Q4_K_M.gguf

  # Chat template — Qwen 3.6 ChatML in Ollama Go-template form, with the
  # tool-calling blocks Ollama's capability detector looks for. Without a
@@ -98,7 +98,7 @@ PARAMETER stop "<|im_end|>"
  PARAMETER stop "<|endoftext|>"
  PARAMETER stop "<|im_start|>"

- SYSTEM """You are Janus, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.
+ SYSTEM """You are Thanatos, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.

  Behavior rules:
  - Answer the user's actual request directly.
README.md CHANGED
@@ -44,14 +44,15 @@ library_name: transformers
  pipeline_tag: image-text-to-text
  ---

- <img src="https://huggingface.co/FoolDev/janus-27b/resolve/main/banner.svg" alt="Janus-27B banner" width="100%" />
+ <img src="https://huggingface.co/FoolDev/thanatos-27b/resolve/main/banner.svg" alt="Thanatos-27B banner" width="100%" />

  [![License](https://img.shields.io/badge/License-Apache_2.0-7aa2f7?style=flat&labelColor=1a1b26)](https://opensource.org/licenses/Apache-2.0)
  [![Base Model](https://img.shields.io/badge/Base-Qwen3.6--27B-bb9af7?style=flat&labelColor=1a1b26)](https://huggingface.co/Qwen/Qwen3.6-27B)
  [![Architecture](https://img.shields.io/badge/Arch-Dense_27B-ff9e64?style=flat&labelColor=1a1b26)](#architecture)
  [![Sibling](https://img.shields.io/badge/Sibling-Janus--35B-7dcfff?style=flat&labelColor=1a1b26)](https://huggingface.co/FoolDev/janus)
+ [![Buy me a coffee](https://img.shields.io/badge/Buy_me_a_coffee-e0af68?style=flat&labelColor=1a1b26&logo=buymeacoffee&logoColor=1a1b26)](https://buymeacoffee.com/Thanatos-27B)

- # Janus-27B
+ # Thanatos-27B

  > **Dense Reasoning. Friendlier Footprint.**
  > *Qwen 3.6 27B (dense) repackaged with Claude Opus 4.7 in the teacher slot.*
@@ -68,7 +69,7 @@ template — HF's Ollama bridge ingests those three files, not
  `Modelfile`):

  ```bash
- ollama run hf.co/FoolDev/janus-27b # ~17 GB Q4_K_M (the only bundled quant)
+ ollama run hf.co/FoolDev/thanatos-27b # ~17 GB Q4_K_M (the only bundled quant)
  ```

  For other quants (Q3_K_S ~12 GB, Q5_K_M ~20 GB, etc.), `make build
@@ -79,10 +80,10 @@ Or build locally (uses this repo's `Modelfile`, kept in sync with the
  three bridge files) for any quant:

  ```bash
- git clone https://huggingface.co/FoolDev/janus-27b && cd janus-27b
- make build # uses the bundled Janus-27B.Q4_K_M.gguf
+ git clone https://huggingface.co/FoolDev/thanatos-27b && cd thanatos-27b
+ make build # uses the bundled Thanatos-27B.Q4_K_M.gguf
  make build QUANT=Q5_K_M # downloads from unsloth/Qwen3.6-27B-GGUF
- ollama run janus-27b
+ ollama run thanatos-27b
  ```

  For image input use llama.cpp directly — Ollama vision is broken for
@@ -94,7 +95,7 @@ The 35B-A3B is a sparse mixture-of-experts model: 35B parameters total but only

  The 27B is **dense**: every parameter participates in every forward pass. It's slower per token than 35B-A3B — on a Ryzen AI Max+ 395 / Radeon 8060S iGPU the dense 27B at Q3_K_S clocks ~10 tok/s, versus ~27 tok/s for the MoE 35B at ~Q4 (`make bench`, 3-prompt mix) — but the working set fits comfortably on commodity GPUs and avoids the MoE-specific load-balance failure modes.

- | | Janus-27B (this) | [Janus-35B](https://huggingface.co/FoolDev/janus) |
+ | | Thanatos-27B (this) | [Janus-35B](https://huggingface.co/FoolDev/janus) |
  |---|---|---|
  | Architecture | Dense transformer | MoE 256 experts, 8 active |
  | Total params | 27 B | 35 B |
@@ -115,7 +116,7 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
  |---|---|
  | `banner.svg` / `banner.png` | Repo header, Tokyo Night themed |
  | `Modelfile` | Ollama wrapper around the bundled Qwen 3.6 27B GGUF — used by `make build` / `ollama create` for **local** builds |
- | `template`, `system`, `params` | Used by HF's Ollama bridge when users `ollama run hf.co/FoolDev/janus-27b` directly (the bridge does **not** read `Modelfile` — see [HF Ollama docs](https://huggingface.co/docs/hub/en/ollama)). Mirrors the `Modelfile`'s template / system prompt / sampling params. |
+ | `template`, `system`, `params` | Used by HF's Ollama bridge when users `ollama run hf.co/FoolDev/thanatos-27b` directly (the bridge does **not** read `Modelfile` — see [HF Ollama docs](https://huggingface.co/docs/hub/en/ollama)). Mirrors the `Modelfile`'s template / system prompt / sampling params. |
  | `examples/` | Ready-to-run Python clients for Ollama, Transformers, and llama-cpp-python |
  | `scripts/build.sh` | One-shot helper: pulls a GGUF and runs `ollama create` for you |
  | `scripts/smoke_test.sh` | Verifies an Ollama daemon + model, runs a round-trip, asserts no chat-template tokens leak into the response. With `TOOLS_TEST=1`, also exercises an end-to-end tool-call round-trip and checks the response shape |
@@ -130,15 +131,15 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
  | `README.md` | This file |

  This repo ships two GGUFs to back the HF/Ollama "Use this model"
- widget — `Janus-27B.Q4_K_M.gguf` (~17 GB):
+ widget — `Thanatos-27B.Q4_K_M.gguf` (~17 GB):

  ```bash
- ollama run hf.co/FoolDev/janus-27b # 17 GB Q4_K_M (only bundled quant)
+ ollama run hf.co/FoolDev/thanatos-27b # 17 GB Q4_K_M (only bundled quant)
  ```

  For 16 GB GPUs / unified-memory laptops, `make build QUANT=Q3_K_S`
  downloads the smaller ~12 GB Q3_K_S quant from `unsloth/Qwen3.6-27B-GGUF`
- and creates a local `janus-27b` Ollama tag (does not redistribute via
+ and creates a local `thanatos-27b` Ollama tag (does not redistribute via
  this repo).

  For other quants or local builds, pull from
@@ -152,7 +153,7 @@ If you want the safetensors for `transformers`, fetch them from [`Qwen/Qwen3.6-2
  ## Architecture

  <p align="left">
- <img src="https://huggingface.co/FoolDev/janus-27b/resolve/main/dense-flow.svg" alt="animated dense forward-pass visualization: 64-layer hybrid attention stack with a pulse traversing left-to-right, illuminating Gated DeltaNet (purple) and Gated Attention (cyan) layers in turn" width="800" />
+ <img src="https://huggingface.co/FoolDev/thanatos-27b/resolve/main/dense-flow.svg" alt="animated dense forward-pass visualization: 64-layer hybrid attention stack with a pulse traversing left-to-right, illuminating Gated DeltaNet (purple) and Gated Attention (cyan) layers in turn" width="800" />
  </p>

  - Qwen 3.6 dense, 27B parameters, 64 transformer layers
@@ -177,14 +178,14 @@ Two paths:
  ```bash
  # A. Pull straight from HF (uses the bundled Q4_K_M + root-level
  # template / system / params files):
- ollama run hf.co/FoolDev/janus-27b # 17 GB Q4_K_M (only bundled quant)
+ ollama run hf.co/FoolDev/thanatos-27b # 17 GB Q4_K_M (only bundled quant)

  # B. Build locally for a different quant (downloads from unsloth):
- make build # Q4_K_M -> janus-27b
+ make build # Q4_K_M -> thanatos-27b
  make build QUANT=Q3_K_S # 12 GB smaller quant
  make build QUANT=Q5_K_M # 20 GB higher quality
  make build GGUF_PATH=~/models/Qwen3.6-27B-Q4_K_M.gguf # skip download
- ollama run janus-27b
+ ollama run thanatos-27b
  ```

  Under the hood, `make build` calls `scripts/build.sh`, which downloads the
@@ -192,7 +193,7 @@ GGUF if missing (set `GGUF_PATH` to point at one you already have) and
  runs `ollama create` with the matching `Modelfile`.

  If you'd rather do it by hand: edit the `FROM` line in `Modelfile` and
- run `ollama create janus-27b -f Modelfile && ollama run janus-27b`.
+ run `ollama create thanatos-27b -f Modelfile && ollama run thanatos-27b`.

  Confirm everything works:

@@ -205,15 +206,15 @@ python examples/ollama_chat.py # full demo: chat, streaming, tools, OpenAI-

  ### Local apps

- The bundled `Janus-27B.Q4_K_M.gguf` works in any GGUF-compatible local
+ The bundled `Thanatos-27B.Q4_K_M.gguf` works in any GGUF-compatible local
  app — point it at this repo and load.

  | App | How to load this model |
  |---|---|
- | **Ollama** | `ollama run hf.co/FoolDev/janus-27b` (default Q4_K_M). Pulls the GGUF + the root-level `template` / `system` / `params` files in one step (HF's Ollama bridge ingests these three files; it does **not** read `Modelfile`). For other quants, `make build QUANT=Q3_K_S` downloads from unsloth and creates a local Ollama tag using the `Modelfile`, which is kept in sync with the bridge files. |
- | **LM Studio** | Search → `FoolDev/janus-27b` → pick `Janus-27B.Q4_K_M.gguf`. Uses the GGUF's embedded jinja chat template (Qwen 3.6 ChatML); set the system prompt manually from the `SYSTEM` block in this repo's `Modelfile`. |
- | **Jan** | Hub → "Import from Hugging Face" → `FoolDev/janus-27b`. Same template behavior as LM Studio. |
- | **llama.cpp** | `hf download FoolDev/janus-27b Janus-27B.Q4_K_M.gguf --local-dir .` then `llama-server -m Janus-27B.Q4_K_M.gguf` (or `llama-cli`, `llama-mtmd-cli` for vision via the upstream `mmproj-F16.gguf`). |
+ | **Ollama** | `ollama run hf.co/FoolDev/thanatos-27b` (default Q4_K_M). Pulls the GGUF + the root-level `template` / `system` / `params` files in one step (HF's Ollama bridge ingests these three files; it does **not** read `Modelfile`). For other quants, `make build QUANT=Q3_K_S` downloads from unsloth and creates a local Ollama tag using the `Modelfile`, which is kept in sync with the bridge files. |
+ | **LM Studio** | Search → `FoolDev/thanatos-27b` → pick `Thanatos-27B.Q4_K_M.gguf`. Uses the GGUF's embedded jinja chat template (Qwen 3.6 ChatML); set the system prompt manually from the `SYSTEM` block in this repo's `Modelfile`. |
+ | **Jan** | Hub → "Import from Hugging Face" → `FoolDev/thanatos-27b`. Same template behavior as LM Studio. |
+ | **llama.cpp** | `hf download FoolDev/thanatos-27b Thanatos-27B.Q4_K_M.gguf --local-dir .` then `llama-server -m Thanatos-27B.Q4_K_M.gguf` (or `llama-cli`, `llama-mtmd-cli` for vision via the upstream `mmproj-F16.gguf`). |
  | **llama-cpp-python** | See `examples/llama_cpp_quickstart.py` (text) and `examples/llama_cpp_vision.py` (image input). |
  | **Open WebUI / KoboldCpp / text-generation-webui** | Standard llama.cpp loader path — point at the GGUF, use the embedded chat template. |

@@ -231,9 +232,9 @@ external schema.
  curl -s http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
- "model": "janus-27b",
+ "model": "thanatos-27b",
  "messages": [
- {"role": "system", "content": "You are Janus, a precise reasoning assistant."},
+ {"role": "system", "content": "You are Thanatos, a precise reasoning assistant."},
  {"role": "user", "content": "Explain the Burrows-Wheeler transform in 200 words."}
  ],
  "temperature": 0.6
@@ -255,7 +256,7 @@ The Modelfile bakes this in. Override per-request via the `system` role
  in your client:

  ```text
- You are Janus, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.
+ You are Thanatos, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.

  Behavior rules:
  - Answer the user's actual request directly.
@@ -313,7 +314,7 @@ for this model.

  ## Hardware requirements

- The dense 27B is the easier of the two Janus models to deploy.
+ The dense 27B is the lighter sibling to Janus-35B and the easier of the two to deploy.

  | Hardware | Status |
  |---|---|
@@ -344,10 +345,10 @@ Ollama is the exception: its conversion of the embedded jinja loses the
  `.Tools` / `.ToolCalls` blocks Ollama's capability detector requires.
  Two paths fix this, depending on how you pull the model:

- - **`ollama run hf.co/FoolDev/janus-27b`** — HF's Ollama bridge applies
+ - **`ollama run hf.co/FoolDev/thanatos-27b`** — HF's Ollama bridge applies
  the root-level `template` / `system` / `params` files in this repo
  (the bridge does **not** read `Modelfile`).
- - **`make build` / `ollama create janus-27b -f Modelfile`** — uses the
+ - **`make build` / `ollama create thanatos-27b -f Modelfile`** — uses the
  `Modelfile`'s `TEMPLATE` block.

  Both routes wire `.Tools` / `.ToolCalls` and tools work end-to-end on
@@ -358,7 +359,7 @@ kept in sync: edit them together if you change one.

  ```text
  <|im_start|>system
- You are Janus, a precise and capable assistant…<|im_end|>
+ You are Thanatos, a precise and capable assistant…<|im_end|>
  <|im_start|>user
  What is the time complexity of mergesort?<|im_end|>
  <|im_start|>assistant
@@ -390,7 +391,7 @@ the model adapts to whichever shape the system prompt prescribes.
  **Ollama path** (this repo's `Modelfile`). The `TEMPLATE` directive
  prompts the model to emit JSON-in-XML, the form Ollama's tool-call
  extractor parses into a structured `tool_calls` array. After
- `make build`, `ollama show janus-27b` lists `tools` and `thinking`
+ `make build`, `ollama show thanatos-27b` lists `tools` and `thinking`
  under **Capabilities**, and both `/api/chat` and `/v1/chat/completions`
  accept a `tools` array.

Janus-27B.Q4_K_M.gguf → Thanatos-27B.Q4_K_M.gguf RENAMED
File without changes
banner.png CHANGED
banner.svg CHANGED
examples/README.md CHANGED
@@ -1,15 +1,15 @@
- # Janus-27B examples
+ # Thanatos-27B examples

  Four minimal entry points. Pick the one that matches how you run models.

  | File | Backend | When to use |
  |---|---|---|
- | `ollama_chat.py` | Ollama HTTP API | You already have `ollama serve` running and the `janus-27b` model created from the project `Modelfile`. **Text + tool calling** — vision via Ollama is broken upstream for this arch. |
+ | `ollama_chat.py` | Ollama HTTP API | You already have `ollama serve` running and the `thanatos-27b` model created from the project `Modelfile`. **Text + tool calling** — vision via Ollama is broken upstream for this arch. |
  | `transformers_quickstart.py` | Hugging Face Transformers | You want to run the upstream safetensors (`Qwen/Qwen3.6-27B`) on GPU, optionally in 4-bit via bitsandbytes. |
  | `llama_cpp_quickstart.py` | llama-cpp-python | You want to invoke a local GGUF directly without a daemon (CI, batch jobs, scripts). Text only. |
  | `llama_cpp_vision.py` | llama-cpp-python + mmproj | **Image input.** Loads a text GGUF + `mmproj-F16.gguf` and answers questions about an image. The only working vision path right now. |

- All three apply the same Janus system prompt and sampling defaults
+ All three apply the same Thanatos system prompt and sampling defaults
  (`temp=0.6, top_p=0.95, top_k=20, repeat_penalty=1.05`) so behavior should
  be consistent across backends modulo quantization noise.

@@ -21,15 +21,15 @@ Easiest path — pull straight from HF (gets the bundled Q4_K_M GGUF +
  this repo's Modelfile in one step):

  ```bash
- ollama pull hf.co/FoolDev/janus-27b # 17 GB Q4_K_M (only bundled quant)
+ ollama pull hf.co/FoolDev/thanatos-27b # 17 GB Q4_K_M (only bundled quant)
  pip install requests
- MODEL=hf.co/FoolDev/janus-27b python ollama_chat.py
+ MODEL=hf.co/FoolDev/thanatos-27b python ollama_chat.py
  ```

  For the smaller-footprint Q3_K_S (~12 GB) or other quants, build
  locally instead — see the parent repo's `make build QUANT=...` flow.

- Or build locally from this repo (uses the bundled `Janus-27B.Q4_K_M.gguf`,
+ Or build locally from this repo (uses the bundled `Thanatos-27B.Q4_K_M.gguf`,
  no edits required):

  ```bash
examples/llama_cpp_quickstart.py CHANGED
@@ -1,6 +1,6 @@
1
  #!/usr/bin/env python3
2
  """
3
- Janus-27B — llama-cpp-python quickstart.
4
 
5
  Skip Ollama entirely and call the GGUF directly through llama-cpp-python.
6
  Useful for batch jobs, CI, or environments where you don't want a daemon.
@@ -29,8 +29,8 @@ except ImportError: # pragma: no cover
29
  sys.exit("Missing llama-cpp-python. Install with: pip install llama-cpp-python")
30
 
31
 
32
- JANUS_SYSTEM = (
33
- "You are Janus, a precise and capable assistant for reasoning, writing, "
34
  "coding, and long-form dialogue.\n\n"
35
  "Behavior rules:\n"
36
  "- Answer the user's actual request directly.\n"
@@ -68,7 +68,7 @@ def main() -> None:
68
 
69
  out = llm.create_chat_completion(
70
  messages=[
71
- {"role": "system", "content": JANUS_SYSTEM},
72
  {"role": "user", "content": args.prompt},
73
  ],
74
  temperature=0.6,
 
1
  #!/usr/bin/env python3
2
  """
3
+ Thanatos-27B — llama-cpp-python quickstart.
4
 
5
  Skip Ollama entirely and call the GGUF directly through llama-cpp-python.
6
  Useful for batch jobs, CI, or environments where you don't want a daemon.
 
29
  sys.exit("Missing llama-cpp-python. Install with: pip install llama-cpp-python")
30
 
31
 
32
+ THANATOS_SYSTEM = (
33
+ "You are Thanatos, a precise and capable assistant for reasoning, writing, "
34
  "coding, and long-form dialogue.\n\n"
35
  "Behavior rules:\n"
36
  "- Answer the user's actual request directly.\n"
 
68
 
69
  out = llm.create_chat_completion(
70
  messages=[
71
+ {"role": "system", "content": THANATOS_SYSTEM},
72
  {"role": "user", "content": args.prompt},
73
  ],
74
  temperature=0.6,
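The README states that the examples share one set of sampling defaults (`temp=0.6, top_p=0.95, top_k=20, repeat_penalty=1.05`). A minimal sketch of keeping those values in one place is shown here. `SAMPLING` and `ollama_options` are hypothetical names, not identifiers from this repo, and the key names assume the `options` field of Ollama's chat API and the keyword arguments of llama-cpp-python's `create_chat_completion`.

```python
# Hypothetical helper, not part of this repo: one source of truth for sampling.
SAMPLING = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "repeat_penalty": 1.05,
}


def ollama_options(overrides=None):
    """Return a copy of the defaults, usable as the "options" field of an Ollama chat request."""
    opts = dict(SAMPLING)          # copy so callers cannot mutate the defaults
    opts.update(overrides or {})   # per-call overrides win
    return opts


# llama-cpp-python accepts the same names as keyword arguments, e.g.
#   llm.create_chat_completion(messages=messages, **SAMPLING)
```

Sharing one dict makes it harder for the backends to drift apart when a default changes.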
examples/llama_cpp_vision.py CHANGED
@@ -1,6 +1,6 @@
1
  #!/usr/bin/env python3
2
  """
3
- Janus-27B — vision (image-text-to-text) via llama-cpp-python.
4
 
5
  Why this script exists:
6
  Ollama 0.22's vendored llama.cpp fork is missing the qwen35/qwen35moe
@@ -56,8 +56,8 @@ except ImportError: # pragma: no cover
56
  )
57
 
58
 
59
- JANUS_SYSTEM = (
60
- "You are Janus, a precise vision-language assistant. Describe images "
61
  "accurately, do not invent details, and ground every claim in the "
62
  "pixels you can actually see."
63
  )
@@ -104,7 +104,7 @@ def main() -> None:
104
 
105
  out = llm.create_chat_completion(
106
  messages=[
107
- {"role": "system", "content": JANUS_SYSTEM},
108
  {
109
  "role": "user",
110
  "content": [
 
1
  #!/usr/bin/env python3
2
  """
3
+ Thanatos-27B — vision (image-text-to-text) via llama-cpp-python.
4
 
5
  Why this script exists:
6
  Ollama 0.22's vendored llama.cpp fork is missing the qwen35/qwen35moe
 
56
  )
57
 
58
 
59
+ THANATOS_SYSTEM = (
60
+ "You are Thanatos, a precise vision-language assistant. Describe images "
61
  "accurately, do not invent details, and ground every claim in the "
62
  "pixels you can actually see."
63
  )
 
104
 
105
  out = llm.create_chat_completion(
106
  messages=[
107
+ {"role": "system", "content": THANATOS_SYSTEM},
108
  {
109
  "role": "user",
110
  "content": [
examples/ollama_chat.py CHANGED
@@ -1,17 +1,17 @@
1
  #!/usr/bin/env python3
2
  """
3
- Janus-27B — Ollama chat examples.
4
 
5
  Prerequisites (pick one):
6
 
7
  A. From the bundled GGUF (default flow):
8
- $ make build # uses Janus-27B.Q4_K_M.gguf
9
  # or:
10
- $ ollama create janus-27b -f ../Modelfile
11
 
12
  B. Pull straight from HF (Q4_K_M is the only bundled quant):
13
- $ ollama run hf.co/FoolDev/janus-27b
14
- # then set MODEL=hf.co/FoolDev/janus-27b below
15
 
16
  Then:
17
  $ ollama serve # usually already running
@@ -36,7 +36,7 @@ from typing import Any, Iterator
36
 
37
  import requests
38
 
39
- MODEL = os.environ.get("MODEL", "janus-27b")
40
  HOST = os.environ.get("HOST", "http://localhost:11434")
41
 
42
  _THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
 
1
  #!/usr/bin/env python3
2
  """
3
+ Thanatos-27B — Ollama chat examples.
4
 
5
  Prerequisites (pick one):
6
 
7
  A. From the bundled GGUF (default flow):
8
+ $ make build # uses Thanatos-27B.Q4_K_M.gguf
9
  # or:
10
+ $ ollama create thanatos-27b -f ../Modelfile
11
 
12
  B. Pull straight from HF (Q4_K_M is the only bundled quant):
13
+ $ ollama run hf.co/FoolDev/thanatos-27b
14
+ # then set MODEL=hf.co/FoolDev/thanatos-27b below
15
 
16
  Then:
17
  $ ollama serve # usually already running
 
36
 
37
  import requests
38
 
39
+ MODEL = os.environ.get("MODEL", "thanatos-27b")
40
  HOST = os.environ.get("HOST", "http://localhost:11434")
41
 
42
  _THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
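The `_THINK_RE` pattern above matches a `<think>...</think>` block non-greedily, across newlines, together with any whitespace that follows it. A small sketch of how such a pattern can be used to hide reasoning text from the final reply; `strip_think` is a hypothetical helper name, and the script's own usage is not shown in this hunk.

```python
import re

# Same pattern as in ollama_chat.py: non-greedy, DOTALL so "." spans newlines.
_THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove every <think>...</think> block and the whitespace right after it."""
    return _THINK_RE.sub("", text)


raw = "<think>\nUser wants one word.\n</think>\n\nOK"
print(strip_think(raw))  # prints: OK
```

The non-greedy `.*?` matters when a reply contains more than one block: a greedy match would swallow everything between the first opening tag and the last closing tag.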
examples/transformers_quickstart.py CHANGED
@@ -1,9 +1,9 @@
1
  #!/usr/bin/env python3
2
  """
3
- Janus-27B — Hugging Face Transformers quickstart.
4
 
5
  Loads the upstream Qwen 3.6 27B safetensors directly and runs a single
6
- chat turn using its embedded chat template. Janus-27B is a *wrapper*
7
  around that base, so for the transformers route there is nothing to
8
  download from this repo — point at Qwen/Qwen3.6-27B and apply the same
9
  system prompt the Modelfile uses.
@@ -38,8 +38,8 @@ except ImportError as e: # pragma: no cover
38
 
39
  MODEL_ID = "Qwen/Qwen3.6-27B"
40
 
41
- JANUS_SYSTEM = (
42
- "You are Janus, a precise and capable assistant for reasoning, writing, "
43
  "coding, and long-form dialogue.\n\n"
44
  "Behavior rules:\n"
45
  "- Answer the user's actual request directly.\n"
@@ -75,7 +75,7 @@ def load(use_4bit: bool):
75
 
76
  def generate(tok, model, prompt: str, max_new_tokens: int = 512) -> str:
77
  messages = [
78
- {"role": "system", "content": JANUS_SYSTEM},
79
  {"role": "user", "content": prompt},
80
  ]
81
  inputs = tok.apply_chat_template(
 
1
  #!/usr/bin/env python3
2
  """
3
+ Thanatos-27B — Hugging Face Transformers quickstart.
4
 
5
  Loads the upstream Qwen 3.6 27B safetensors directly and runs a single
6
+ chat turn using its embedded chat template. Thanatos-27B is a *wrapper*
7
  around that base, so for the transformers route there is nothing to
8
  download from this repo — point at Qwen/Qwen3.6-27B and apply the same
9
  system prompt the Modelfile uses.
 
38
 
39
  MODEL_ID = "Qwen/Qwen3.6-27B"
40
 
41
+ THANATOS_SYSTEM = (
42
+ "You are Thanatos, a precise and capable assistant for reasoning, writing, "
43
  "coding, and long-form dialogue.\n\n"
44
  "Behavior rules:\n"
45
  "- Answer the user's actual request directly.\n"
 
75
 
76
  def generate(tok, model, prompt: str, max_new_tokens: int = 512) -> str:
77
  messages = [
78
+ {"role": "system", "content": THANATOS_SYSTEM},
79
  {"role": "user", "content": prompt},
80
  ]
81
  inputs = tok.apply_chat_template(
scripts/bench.sh CHANGED
@@ -1,5 +1,5 @@
1
  #!/usr/bin/env bash
2
- # Janus-27B — tok/s benchmark via Ollama.
3
  #
4
  # Reads timing from Ollama's /api/chat response metadata (eval_count and
5
  # eval_duration are authoritative — no client-side stopwatch noise) and
@@ -7,14 +7,14 @@
7
  # number generalises a bit beyond a single shape.
8
  #
9
  # Usage:
10
- # ./scripts/bench.sh # uses MODEL=janus-27b
11
- # MODEL=janus-27b ./scripts/bench.sh
12
  # HOST=http://localhost:11434 ./scripts/bench.sh
13
  #
14
  # Requires: curl, jq, a running Ollama daemon with the model created.
15
  set -euo pipefail
16
 
17
- MODEL="${MODEL:-janus-27b}"
18
  HOST="${HOST:-http://localhost:11434}"
19
 
20
  red() { printf "\033[31m%s\033[0m\n" "$*" >&2; }
 
1
  #!/usr/bin/env bash
2
+ # Thanatos-27B — tok/s benchmark via Ollama.
3
  #
4
  # Reads timing from Ollama's /api/chat response metadata (eval_count and
5
  # eval_duration are authoritative — no client-side stopwatch noise) and
 
7
  # number generalises a bit beyond a single shape.
8
  #
9
  # Usage:
10
+ # ./scripts/bench.sh # uses MODEL=thanatos-27b
11
+ # MODEL=thanatos-27b ./scripts/bench.sh
12
  # HOST=http://localhost:11434 ./scripts/bench.sh
13
  #
14
  # Requires: curl, jq, a running Ollama daemon with the model created.
15
  set -euo pipefail
16
 
17
+ MODEL="${MODEL:-thanatos-27b}"
18
  HOST="${HOST:-http://localhost:11434}"
19
 
20
  red() { printf "\033[31m%s\033[0m\n" "$*" >&2; }
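The header comment says the benchmark derives tok/s from `eval_count` and `eval_duration` in the `/api/chat` response rather than from a client-side stopwatch. The arithmetic can be sketched as follows; Ollama reports durations in nanoseconds, and `tokens_per_second` plus the sample numbers are made up for illustration.

```python
def tokens_per_second(resp: dict) -> float:
    """Decode throughput from Ollama response metadata (eval_duration is in nanoseconds)."""
    seconds = resp["eval_duration"] / 1e9
    return resp["eval_count"] / seconds


# Made-up metadata: 256 generated tokens in 8 seconds.
sample = {"eval_count": 256, "eval_duration": 8_000_000_000}
print(f"{tokens_per_second(sample):.1f} tok/s")  # prints: 32.0 tok/s
```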
scripts/build.sh CHANGED
@@ -1,5 +1,5 @@
1
  #!/usr/bin/env bash
2
- # Janus-27B — fetch a Qwen 3.6 27B GGUF and build the Ollama model.
3
  #
4
  # Usage:
5
  # ./scripts/build.sh # default: Q4_K_M
@@ -28,7 +28,7 @@ ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
28
  GGUF_PATH="${GGUF_PATH:-${ROOT}/${GGUF_NAME}}"
29
 
30
  MODELFILE="${ROOT}/Modelfile"
31
- TAG="${TAG:-janus-27b}"
32
 
33
  echo "[*] repo: ${REPO_ID}"
34
  echo "[*] quant: ${QUANT}"
@@ -81,7 +81,7 @@ fi
81
 
82
  # ---- 3. Patch the Modelfile FROM line in a temp copy -------------------------
83
 
84
- TMP_MODELFILE="$(mktemp -t janus27b-modelfile.XXXXXX)"
85
  trap 'rm -f "${TMP_MODELFILE}"' EXIT
86
  awk -v p="${GGUF_PATH}" '
87
  /^FROM[[:space:]]/ && !done { print "FROM " p; done=1; next }
@@ -96,4 +96,4 @@ ollama create "${TAG}" -f "${TMP_MODELFILE}"
96
  echo
97
  echo "[+] Done. Try it:"
98
  echo " ollama run ${TAG}"
99
- echo " python ${ROOT}/examples/ollama_chat.py # update MODEL constant if not 'janus-27b'"
 
1
  #!/usr/bin/env bash
2
+ # Thanatos-27B — fetch a Qwen 3.6 27B GGUF and build the Ollama model.
3
  #
4
  # Usage:
5
  # ./scripts/build.sh # default: Q4_K_M
 
28
  GGUF_PATH="${GGUF_PATH:-${ROOT}/${GGUF_NAME}}"
29
 
30
  MODELFILE="${ROOT}/Modelfile"
31
+ TAG="${TAG:-thanatos-27b}"
32
 
33
  echo "[*] repo: ${REPO_ID}"
34
  echo "[*] quant: ${QUANT}"
 
81
 
82
  # ---- 3. Patch the Modelfile FROM line in a temp copy -------------------------
83
 
84
+ TMP_MODELFILE="$(mktemp -t thanatos27b-modelfile.XXXXXX)"
85
  trap 'rm -f "${TMP_MODELFILE}"' EXIT
86
  awk -v p="${GGUF_PATH}" '
87
  /^FROM[[:space:]]/ && !done { print "FROM " p; done=1; next }
 
96
  echo
97
  echo "[+] Done. Try it:"
98
  echo " ollama run ${TAG}"
99
+ echo " python ${ROOT}/examples/ollama_chat.py # set the MODEL env var if the tag is not 'thanatos-27b'"
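Step 3 of the build script rewrites the first `FROM` line of the Modelfile in a temporary copy so that it points at the downloaded GGUF. The same idea in Python, as a sketch only: `patch_from_line` is a hypothetical name, and the assumption that every other line passes through unchanged is mine, since the rest of the `awk` program is not visible in this hunk.

```python
import re


def patch_from_line(modelfile_text: str, gguf_path: str) -> str:
    """Replace only the first line that starts with FROM + whitespace; keep the rest."""
    out, done = [], False
    for line in modelfile_text.splitlines():
        if not done and re.match(r"FROM\s", line):
            out.append(f"FROM {gguf_path}")
            done = True
        else:
            out.append(line)  # assumed: all other lines are copied verbatim
    return "\n".join(out) + "\n"


original = "FROM ./Thanatos-27B.Q4_K_M.gguf\nPARAMETER temperature 0.6\n"
print(patch_from_line(original, "/models/custom.gguf"), end="")
```

Patching a temporary copy keeps the checked-in Modelfile untouched, so a local build with a different quant never dirties the working tree.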
scripts/check.sh CHANGED
@@ -1,5 +1,5 @@
1
  #!/usr/bin/env bash
2
- # Janus-27B — repo-local sanity checks.
3
  #
4
  # Runs everything that's cheap and catches a real-world bug we've already hit:
5
  #
 
1
  #!/usr/bin/env bash
2
+ # Thanatos-27B — repo-local sanity checks.
3
  #
4
  # Runs everything that's cheap and catches a real-world bug we've already hit:
5
  #
scripts/check_bridge_sync.py CHANGED
@@ -1,13 +1,13 @@
1
  #!/usr/bin/env python3
2
  """
3
- Janus-27B — verify Modelfile and HF Ollama bridge files stay in sync.
4
 
5
  The repo ships two parallel Ollama configurations:
6
 
7
  - ``Modelfile`` is consumed by the local-build path (``ollama create -f Modelfile``).
8
  It contains ``TEMPLATE`` / ``SYSTEM`` / ``PARAMETER`` directives.
9
  - ``template`` / ``system`` / ``params`` at the repo root are consumed by HF's
10
- Ollama bridge when users ``ollama run hf.co/FoolDev/janus-27b`` directly. HF
11
  does NOT read the Modelfile (per https://huggingface.co/docs/hub/en/ollama).
12
 
13
  If the two configurations drift apart, ``hf.co/...`` users and ``make build``
 
1
  #!/usr/bin/env python3
2
  """
3
+ Thanatos-27B — verify Modelfile and HF Ollama bridge files stay in sync.
4
 
5
  The repo ships two parallel Ollama configurations:
6
 
7
  - ``Modelfile`` is consumed by the local-build path (``ollama create -f Modelfile``).
8
  It contains ``TEMPLATE`` / ``SYSTEM`` / ``PARAMETER`` directives.
9
  - ``template`` / ``system`` / ``params`` at the repo root are consumed by HF's
10
+ Ollama bridge when users ``ollama run hf.co/FoolDev/thanatos-27b`` directly. HF
11
  does NOT read the Modelfile (per https://huggingface.co/docs/hub/en/ollama).
12
 
13
  If the two configurations drift apart, ``hf.co/...`` users and ``make build``
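The docstring describes two parallel configurations that must agree: `PARAMETER` directives in the `Modelfile` and the `params` file read by the HF bridge. One way such a drift check could look is sketched below. This is not the repo's implementation: the function names are hypothetical, the sketch assumes `params` is a flat JSON object, and it ignores repeated parameters such as `stop`.

```python
import json


def modelfile_params(text: str) -> dict:
    """Collect `PARAMETER key value` lines into {key: value-as-string}."""
    params = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].upper() == "PARAMETER":
            params[parts[1]] = parts[2].strip()
    return params


def drift(modelfile_text: str, params_json: str) -> dict:
    """Return {key: (modelfile_value, bridge_value)} for every mismatch."""
    mf = modelfile_params(modelfile_text)
    bridge = {k: str(v) for k, v in json.loads(params_json).items()}
    keys = mf.keys() | bridge.keys()
    return {k: (mf.get(k), bridge.get(k)) for k in keys if mf.get(k) != bridge.get(k)}


mismatches = drift("PARAMETER temperature 0.7\n", '{"temperature": 0.6}')
print(mismatches)
```

An empty result means the two sides agree on every key; a pre-commit hook could exit non-zero otherwise.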
scripts/fetch_vision.sh CHANGED
@@ -1,5 +1,5 @@
1
  #!/usr/bin/env bash
2
- # Janus-27B — fetch the vision projector (mmproj) for image input.
3
  #
4
  # Why this is separate from build.sh:
5
  # build.sh is for the Ollama text path. The mmproj is only useful for
 
1
  #!/usr/bin/env bash
2
+ # Thanatos-27B — fetch the vision projector (mmproj) for image input.
3
  #
4
  # Why this is separate from build.sh:
5
  # build.sh is for the Ollama text path. The mmproj is only useful for
scripts/install-hooks.sh CHANGED
@@ -1,5 +1,5 @@
1
  #!/usr/bin/env bash
2
- # Janus-27B — install scripts/check.sh as a git pre-commit hook.
3
  #
4
  # Idempotent. Re-runs are safe.
5
  set -euo pipefail
 
1
  #!/usr/bin/env bash
2
+ # Thanatos-27B — install scripts/check.sh as a git pre-commit hook.
3
  #
4
  # Idempotent. Re-runs are safe.
5
  set -euo pipefail
scripts/smoke_test.sh CHANGED
@@ -1,5 +1,5 @@
1
  #!/usr/bin/env bash
2
- # Janus-27B — smoke test against a running Ollama daemon.
3
  #
4
  # Verifies:
5
  # 1. The Ollama server is reachable.
@@ -14,11 +14,11 @@
14
  # Usage:
15
  # ./scripts/smoke_test.sh # fast checks only
16
  # TOOLS_TEST=1 ./scripts/smoke_test.sh # add tool-call round-trip
17
- # MODEL=hf.co/FoolDev/janus-27b:Q4_K_M ./scripts/smoke_test.sh
18
  # HOST=http://localhost:11434 ./scripts/smoke_test.sh
19
  set -euo pipefail
20
 
21
- MODEL="${MODEL:-janus-27b}"
22
  HOST="${HOST:-http://localhost:11434}"
23
  PROMPT="${PROMPT:-Reply with the single word: OK}"
24
 
 
1
  #!/usr/bin/env bash
2
+ # Thanatos-27B — smoke test against a running Ollama daemon.
3
  #
4
  # Verifies:
5
  # 1. The Ollama server is reachable.
 
14
  # Usage:
15
  # ./scripts/smoke_test.sh # fast checks only
16
  # TOOLS_TEST=1 ./scripts/smoke_test.sh # add tool-call round-trip
17
+ # MODEL=hf.co/FoolDev/thanatos-27b:Q4_K_M ./scripts/smoke_test.sh
18
  # HOST=http://localhost:11434 ./scripts/smoke_test.sh
19
  set -euo pipefail
20
 
21
+ MODEL="${MODEL:-thanatos-27b}"
22
  HOST="${HOST:-http://localhost:11434}"
23
  PROMPT="${PROMPT:-Reply with the single word: OK}"
24
 
system CHANGED
@@ -1,4 +1,4 @@
1
- You are Janus, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.
2
 
3
  Behavior rules:
4
  - Answer the user's actual request directly.
 
1
+ You are Thanatos, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.
2
 
3
  Behavior rules:
4
  - Answer the user's actual request directly.