FoolDev committed on May 2

Commit 9ca8700

1 Parent(s): 7766f0b

Add Makefile, GGUF_PATH override, and harden .gitignore

Makefile: thin wrapper over scripts/*. 'make help' lists targets
(build / smoke / check / hooks / clean). All variables (QUANT, PROFILE,
TAG, GGUF_PATH, MODEL) are overridable on the command line.

scripts/build.sh: honor GGUF_PATH so users with weights already on disk
(shared model dirs, NAS mounts, an Ollama blob copied out of
~/.ollama/models/blobs) don't have to copy or symlink. Also defers the
HF-CLI requirement until a download is actually needed.

.gitignore: exclude .cache/, *.incomplete, *.lock — running 'hf download'
inside the repo creates a .cache/huggingface/ tree, and a previous
session committed an empty version of it. Belt-and-suspenders.

README + CHANGELOG updated to reflect the new flow.

Files changed (5)

.gitignore +7 -0
CHANGELOG.md +17 -1
Makefile +61 -0
README.md +10 -10
scripts/build.sh +30 -20

.gitignore CHANGED

@@ -10,8 +10,15 @@ venv/
 *.safetensors
 *.bin
+# Build / runtime artifacts that get created if anyone runs hf download or
+# scripts/build.sh from inside the repo.
+.cache/
+*.incomplete
+*.lock
 # Editor / OS
 .DS_Store
 .idea/
 .vscode/
 *.swp
+*~
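
One way to confirm that the new patterns behave as the commit message describes is to ask git directly with `git check-ignore`. The sketch below is not part of the commit: it assumes `git` is installed, builds a throwaway repository containing only the three added rules, and uses two hypothetical helpers, `new_sandbox` and `is_ignored`.

```shell
# Create a scratch repository holding only the three new ignore rules.
# The scratch directory and file names are assumptions of this sketch.
new_sandbox() {
    dir="$(mktemp -d)"
    git init -q "$dir"
    printf '%s\n' '.cache/' '*.incomplete' '*.lock' > "$dir/.gitignore"
    mkdir -p "$dir/.cache/huggingface"
    : > "$dir/.cache/huggingface/blob"
    : > "$dir/model.gguf.incomplete"
    : > "$dir/model.gguf"
    printf '%s\n' "$dir"
}

# Exit status 0 when git would ignore the path, 1 when it would not.
is_ignored() {
    git -C "$1" check-ignore -q "$2"
}
```

Calling `is_ignored "$sandbox" .cache/huggingface/blob` should succeed, while a plain `model.gguf` is not covered by these three rules (the existing context lines above the hunk ignore other weight formats separately).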

CHANGELOG.md CHANGED

@@ -7,13 +7,29 @@ and documentation**, not the underlying base model.
 ## [Unreleased]
+### Added
+- `Makefile` — convenience wrapper. `make help` lists targets:
+  `build` / `smoke` / `check` / `hooks` / `clean`. Variables
+  `QUANT`, `PROFILE`, `TAG`, `GGUF_PATH`, `MODEL` are overridable.
+### Changed
+- `scripts/build.sh` now honors `GGUF_PATH` so users with weights already
+  on disk (e.g. shared model dirs, NAS mounts) don't have to download or
+  symlink. Also defers the HF-CLI requirement until a download is
+  actually needed.
+- `.gitignore` now excludes `.cache/`, `*.incomplete`, and `*.lock` so
+  ephemeral artifacts from running `hf download` inside the repo never
+  get committed.
+## [0.3.0] - 2026-05-02 — `7766f0b`
 ### Added
 - `scripts/check.sh` — local lint runner: `bash -n`, optional `shellcheck`,
   `pyflakes`, `py_compile`, plus a guard regex for the dash/dot filename
   bug. Returns non-zero on failure.
 - `scripts/install-hooks.sh` — installs `check.sh` as a git pre-commit
   hook so footguns can't slip past again.
-- `CHANGELOG.md` (this file).
+- `CHANGELOG.md`.
 ## [0.2.1] - 2026-05-02 — `82677d0`

Makefile ADDED

@@ -0,0 +1,61 @@

+# Janus-27B convenience Makefile.
+#
+# All work is delegated to scripts/* — this file just gives common
+# operations short, discoverable names.
+#
+# Variables you can override on the command line:
+#   QUANT     GGUF quant suffix       (default: Q4_K_M)
+#   PROFILE   default | z13           (default: default)
+#   TAG       Ollama model tag        (auto: janus-27b or janus-27b-z13)
+#   GGUF_PATH path to existing GGUF   (skip the download)
+#   MODEL     model tag for smoke     (default: $(TAG))
+#
+# Examples:
+#   make build                          # default profile, Q4_K_M
+#   make build PROFILE=z13 QUANT=Q3_K_S
+#   make build GGUF_PATH=~/models/Qwen3.6-27B-Q4_K_M.gguf
+#   make smoke
+#   make check
+#   make clean
+QUANT   ?= Q4_K_M
+PROFILE ?= default
+ifeq ($(PROFILE),z13)
+    TAG ?= janus-27b-z13
+else
+    TAG ?= janus-27b
+endif
+MODEL ?= $(TAG)
+.DEFAULT_GOAL := help
+.PHONY: help build smoke check hooks clean
+help:  ## Show this help.
+	@awk 'BEGIN {FS = ":.*##"; printf "Targets:\n"} /^[a-zA-Z_-]+:.*?##/ { printf "  \033[36m%-12s\033[0m %s\n", $$1, $$2 }' $(MAKEFILE_LIST)
+	@echo
+	@echo "Current settings:"
+	@echo "  QUANT=$(QUANT) PROFILE=$(PROFILE) TAG=$(TAG)"
+ifdef GGUF_PATH
+	@echo "  GGUF_PATH=$(GGUF_PATH)"
+endif
+build:  ## Download GGUF (if needed) and run 'ollama create'.
+	GGUF_PATH=$(GGUF_PATH) ./scripts/build.sh $(QUANT) $(PROFILE)
+smoke:  ## Verify the model is reachable and round-trips.
+	MODEL=$(MODEL) ./scripts/smoke_test.sh
+check:  ## Lint shell + python files; block dot-pattern footgun.
+	./scripts/check.sh
+hooks:  ## Install scripts/check.sh as the git pre-commit hook.
+	./scripts/install-hooks.sh
+clean:  ## Remove local GGUF copies and ephemeral caches in this repo.
+	@echo "[*] removing local GGUFs and ephemeral caches in $$PWD"
+	@rm -f ./Qwen3.6-27B-*.gguf
+	@rm -rf ./.cache __pycache__ examples/__pycache__
+	@echo "[+] clean"
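
The `ifeq` block and the `?=` assignments above decide which tag `make smoke` ends up testing. As a rough illustration only, the same precedence can be mirrored in plain shell. The helper names `default_tag` and `resolve_model` are hypothetical, and `${VAR-default}` (no colon) is used on the assumption that, like `?=`, it should leave an already-defined variable alone even when it is empty.

```shell
# Mirror of the Makefile's ifeq block: only PROFILE=z13 changes the tag.
default_tag() {
    case "$1" in
        z13) echo "janus-27b-z13" ;;
        *)   echo "janus-27b" ;;
    esac
}

# Hypothetical helper: an explicit TAG wins over the profile-derived
# default, and MODEL falls back to TAG when it is not set.
resolve_model() {
    profile="${1:-default}"
    tag="${TAG-$(default_tag "$profile")}"
    printf '%s\n' "${MODEL-$tag}"
}
```

With neither `TAG` nor `MODEL` set, `resolve_model z13` prints `janus-27b-z13`, which corresponds to running `make smoke PROFILE=z13` without further overrides.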

README.md CHANGED

@@ -89,6 +89,7 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
 | `scripts/smoke_test.sh` | Verifies an Ollama daemon + model and runs a round-trip |
 | `scripts/check.sh` | Local lint: `bash -n`, `pyflakes`, `py_compile`, footgun-grep |
 | `scripts/install-hooks.sh` | Installs `check.sh` as a git pre-commit hook |
+| `Makefile` | Convenience wrapper — `make help` lists targets |
 | `LICENSE`, `CITATION.cff` | Apache-2.0 license and citation metadata |
 | `CHANGELOG.md` | Versioned tooling/docs changes |
 | `README.md` | This file |
@@ -113,25 +114,24 @@ If you want the safetensors for `transformers`, fetch them from [`Qwen/Qwen3.6-2
 ### Ollama (one-liner)
-`scripts/build.sh` will download the GGUF and create the Ollama model in one shot:
 ```bash
-./scripts/build.sh                  # Q4_K_M, default profile      -> janus-27b
-./scripts/build.sh Q3_K_S z13       # Z13 profile (Modelfile.z13)  -> janus-27b-z13
-./scripts/build.sh Q5_K_M           # higher-quality quant         -> janus-27b
+make build                                  # Q4_K_M, default profile     -> janus-27b
+make build PROFILE=z13 QUANT=Q3_K_S         # Z13 profile (Modelfile.z13) -> janus-27b-z13
+make build GGUF_PATH=~/models/Qwen3.6-27B-Q4_K_M.gguf   # skip download
 ollama run janus-27b
 ```
-Or do it manually if you already have a GGUF on disk — edit the `FROM` line in `Modelfile` and run:
-```bash
-ollama create janus-27b -f Modelfile && ollama run janus-27b
-```
+Under the hood, `make build` calls `scripts/build.sh`, which downloads the
+GGUF if missing (set `GGUF_PATH` to point at one you already have) and
+runs `ollama create` with the matching `Modelfile`.
+If you'd rather do it by hand: edit the `FROM` line in `Modelfile` and
+run `ollama create janus-27b -f Modelfile && ollama run janus-27b`.
 Confirm everything works:
 ```bash
-./scripts/smoke_test.sh             # checks server, model, round-trip
+make smoke                          # checks server, model, round-trip
 python examples/ollama_chat.py      # full demo: chat, streaming, tools, OpenAI-compat
 ```

scripts/build.sh CHANGED

@@ -7,7 +7,10 @@
 #   ./scripts/build.sh Q3_K_S z13            # quant + Z13 profile (uses Modelfile.z13)
 #   QUANT=Q6_K PROFILE=default ./scripts/build.sh
 #
-# Requires: huggingface-cli (or hf), ollama, awk, sed.
+# Skip the download by pointing at a GGUF you already have:
+#   GGUF_PATH=/path/to/Qwen3.6-27B-Q4_K_M.gguf ./scripts/build.sh Q4_K_M
+#
+# Requires: huggingface-cli (or hf), ollama, awk.
 set -euo pipefail
 QUANT="${1:-${QUANT:-Q4_K_M}}"
@@ -22,7 +25,9 @@ REPO_ID="${REPO_ID:-unsloth/Qwen3.6-27B-GGUF}"
 #   UD-Q5_K_XL UD-Q6_K_XL UD-Q8_K_XL
 GGUF_NAME="Qwen3.6-27B-${QUANT}.gguf"
 ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
-GGUF_PATH="${ROOT}/${GGUF_NAME}"
+# GGUF_PATH defaults to ${ROOT}/${GGUF_NAME}, but can be overridden so users
+# with cached weights elsewhere don't have to copy or symlink anything.
+GGUF_PATH="${GGUF_PATH:-${ROOT}/${GGUF_NAME}}"
 case "${PROFILE}" in
     default) MODELFILE="${ROOT}/Modelfile";        TAG="janus-27b" ;;
@@ -46,33 +51,38 @@ if [[ ! -f "${MODELFILE}" ]]; then
     echo "[!] Missing ${MODELFILE}" >&2; exit 1
 fi
-# ---- 2. Pick a HuggingFace CLI ----------------------------------------------
-HF=""
-if command -v hf >/dev/null 2>&1; then
-    HF="hf"
-elif command -v huggingface-cli >/dev/null 2>&1; then
-    HF="huggingface-cli"
-else
-    echo "[!] Neither 'hf' nor 'huggingface-cli' found." >&2
-    echo "    pip install -U huggingface_hub" >&2
-    exit 1
-fi
-# ---- 3. Download GGUF if missing --------------------------------------------
+# ---- 2. Download GGUF if missing --------------------------------------------
 if [[ -f "${GGUF_PATH}" ]]; then
-    echo "[=] GGUF already present, skipping download."
+    echo "[=] GGUF already present at ${GGUF_PATH}, skipping download."
 else
+    # Need a HF CLI to fetch the file.
+    HF=""
+    if command -v hf >/dev/null 2>&1; then
+        HF="hf"
+    elif command -v huggingface-cli >/dev/null 2>&1; then
+        HF="huggingface-cli"
+    else
+        echo "[!] ${GGUF_PATH} not found, and neither 'hf' nor" >&2
+        echo "    'huggingface-cli' is installed to download it." >&2
+        echo "    Either:" >&2
+        echo "      pip install -U huggingface_hub" >&2
+        echo "    or set GGUF_PATH to an existing GGUF and rerun." >&2
+        exit 1
+    fi
     echo "[*] Downloading ${GGUF_NAME} from ${REPO_ID} ..."
+    DEST_DIR="$(dirname "${GGUF_PATH}")"
+    mkdir -p "${DEST_DIR}"
     case "${HF}" in
-        hf)                 hf download "${REPO_ID}" "${GGUF_NAME}" --local-dir "${ROOT}" ;;
-        huggingface-cli)    huggingface-cli download "${REPO_ID}" "${GGUF_NAME}" --local-dir "${ROOT}" ;;
+        hf)              hf download "${REPO_ID}" "${GGUF_NAME}" --local-dir "${DEST_DIR}" ;;
+        huggingface-cli) huggingface-cli download "${REPO_ID}" "${GGUF_NAME}" --local-dir "${DEST_DIR}" ;;
     esac
 fi
 if [[ ! -f "${GGUF_PATH}" ]]; then
-    echo "[!] Download failed: ${GGUF_PATH} not present." >&2; exit 1
+    echo "[!] GGUF still not present at ${GGUF_PATH} after download attempt." >&2
+    exit 1
 fi
 # ---- 4. Patch the Modelfile FROM line in a temp copy -------------------------
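
The core of this change is the `${GGUF_PATH:-...}` default and the decision to look for a download tool only when the file is absent. A minimal standalone sketch of those two checks follows. The function names `resolve_gguf_path` and `need_download` are hypothetical and do not appear in `scripts/build.sh`.

```shell
# Resolve the GGUF location: an exported GGUF_PATH wins, otherwise fall back
# to <root>/<name>. The ':-' form also falls back when GGUF_PATH is empty,
# which matters if a caller passes the variable through unset.
resolve_gguf_path() {
    root="$1"
    name="$2"
    printf '%s\n' "${GGUF_PATH:-${root}/${name}}"
}

# Succeeds (exit 0) only when the file is missing, so a caller can postpone
# any Hugging Face CLI lookup until a download is really required.
need_download() {
    [ ! -f "$1" ]
}
```

A caller could then write `if need_download "$(resolve_gguf_path "$ROOT" "$GGUF_NAME")"; then ... fi` and keep the CLI detection inside that branch, which is the ordering the new hunk adopts.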