Add HF Ollama bridge files (template/system/params) + fix mmproj filename collision
The HF Ollama bridge does NOT read Modelfile (per docs at
https://huggingface.co/docs/hub/en/ollama). When users do
'ollama run hf.co/FoolDev/janus-27b' the bridge generates a manifest
from three root-level files: 'template' (Ollama Go format), 'system'
(plain text), and 'params' (JSON). Without those files HF auto-converts
the GGUF's embedded jinja chat template to Ollama Go format, and that
conversion is buggy: produces '{{ if .Prompt }} .Prompt }}<|im_end|>'
(missing user-role wrapper, malformed value substitution), corrupted
stop tokens including the literal string '.Prompt }}<|im_end|>', and
no .Tools/.ToolCalls blocks — so 'ollama show hf.co/FoolDev/janus-27b'
reports only the 'completion' capability and rejects any /api/chat or
/v1/chat/completions request carrying a tools array.
Added template, system, params at repo root (mirrors the Modelfile's
TEMPLATE/SYSTEM/PARAMETER directives). Both routes now wire .Tools
and tool calling works end-to-end on either path.
Also renamed scripts/fetch_mmproj.sh -> scripts/fetch_vision.sh: HF's
Ollama bridge was filename-pattern-matching mmproj* anywhere in the
repo and shipping the 2028-byte bash script as the
application/vnd.ollama.image.projector layer. When Ollama tried to
load that 'projector' as a GGUF it failed the magic-bytes check —
'Error: invalid file magic' on every ollama show / ollama run.
Renaming breaks the pattern match; projector layer drops from the
manifest. Updated Makefile mmproj target and README references to
point at the new name.
README updated: 'What's here' table lists template/system/params;
'Quick start' / 'Local apps' / 'Chat template' sections corrected to
say HF bridge uses the three files (not Modelfile).
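
The projector collision described above can be caught before pushing. The following is a minimal sketch, not the bridge's actual logic: it assumes the bridge treats any file whose basename contains `mmproj` as a projector (the commit only reports the observed behavior), and it relies on the fact that GGUF files begin with the four bytes `GGUF`. The helper names are hypothetical.

```python
import fnmatch
from pathlib import PurePosixPath

GGUF_MAGIC = b"GGUF"  # every GGUF file starts with these four bytes


def projector_candidates(repo_paths, pattern="*mmproj*"):
    """Return paths whose basename matches the ASSUMED projector glob."""
    return [
        p for p in repo_paths
        if fnmatch.fnmatchcase(PurePosixPath(p).name, pattern)
    ]


def has_gguf_magic(header: bytes) -> bool:
    """True if the leading bytes look like a GGUF file."""
    return header[:4] == GGUF_MAGIC


def stray_projectors(files):
    """files maps repo path -> first bytes of that file.

    Returns candidates that would be picked up as a projector
    but would fail the magic-bytes check when loaded.
    """
    return [
        p for p in projector_candidates(files)
        if not has_gguf_magic(files[p])
    ]


# Illustrative repo listing (contents are made up for the example).
example_files = {
    "scripts/fetch_mmproj.sh": b"#!/usr/bin/env bash\n",
    "scripts/fetch_vision.sh": b"#!/usr/bin/env bash\n",
    "mmproj-F16.gguf": b"GGUF\x03\x00\x00\x00",
}
flagged = stray_projectors(example_files)
```

With the old script name the check flags `scripts/fetch_mmproj.sh`; after the rename nothing in `scripts/` matches, which is the effect the commit relies on.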
- CHANGELOG.md +28 -0
- Makefile +1 -1
- README.md +25 -14
- params +12 -0
- scripts/{fetch_mmproj.sh → fetch_vision.sh} +3 -3
- system +10 -0
- template +51 -0
@@ -7,6 +7,34 @@ and documentation**, not the underlying base model.
 
 ## [Unreleased]
 
+### Added
+- Root-level `template`, `system`, and `params` files for HF's Ollama
+bridge. The bridge generates Ollama manifests at request time from
+these three files (NOT from `Modelfile` — confirmed against
+https://huggingface.co/docs/hub/en/ollama). Without them, `ollama
+run hf.co/FoolDev/janus-27b` got an auto-generated manifest with
+the broken `{{ if .Prompt }} .Prompt }}<|im_end|>` template
+(Ollama's faulty Go-template conversion of the GGUF's embedded
+jinja), corrupted stop tokens (`".Prompt }}<|im_end|>"` bleed),
+and no `.Tools` / `.ToolCalls` blocks — so the published Ollama
+tag advertised `completion` only, rejected any request with a
+`tools` array, and was actually broken to load (see "Fixed" below
+re: the projector layer). The three files mirror the `Modelfile`'s
+`TEMPLATE` / `SYSTEM` / `PARAMETER` directives; both routes wire
+tool calling correctly. Edit them together when changing one.
+
+### Fixed
+- Renamed `scripts/fetch_mmproj.sh` → `scripts/fetch_vision.sh`. HF's
+Ollama bridge was filename-pattern-matching `mmproj*` anywhere in
+the repo and shipping `scripts/fetch_mmproj.sh` (a 2028-byte bash
+script) as the `application/vnd.ollama.image.projector` layer. When
+Ollama tried to load that "projector" as a GGUF, it failed the
+magic-bytes check and `ollama show` / `ollama run` produced
+`Error: invalid file magic`. Renaming the script breaks the pattern
+match and the projector layer is no longer added to the manifest.
+Updated `Makefile` (`mmproj` target) and README references to point
+at the new name.
+
 ### Changed
 - README "Tool / function calling" section: split into explicit
 Ollama-path and embedded-jinja-path subsections. The two loader
@@ -46,7 +46,7 @@ bench: ## Measure tok/s using Ollama's eval timing (3 prompts).
 	MODEL=$(MODEL) ./scripts/bench.sh
 
 mmproj: ## Fetch the vision projector for llama.cpp (Ollama vision is broken upstream).
-	./scripts/
+	./scripts/fetch_vision.sh $(PRECISION)
 
 check: ## Lint shell + python files; block dot-pattern footgun.
 	./scripts/check.sh
@@ -108,12 +108,13 @@ The 27B is **dense**: every parameter participates in every forward pass. It's s
 | File | Use |
 |---|---|
 | `banner.svg` / `banner.png` | Repo header, Tokyo Night themed |
-| `Modelfile` | Ollama wrapper around the
+| `Modelfile` | Ollama wrapper around the bundled Qwen 3.6 27B GGUF — used by `make build` / `ollama create` for **local** builds |
+| `template`, `system`, `params` | Used by HF's Ollama bridge when users `ollama run hf.co/FoolDev/janus-27b` directly (the bridge does **not** read `Modelfile` — see [HF Ollama docs](https://huggingface.co/docs/hub/en/ollama)). Mirrors the `Modelfile`'s template / system prompt / sampling params. |
 | `examples/` | Ready-to-run Python clients for Ollama, Transformers, and llama-cpp-python |
 | `scripts/build.sh` | One-shot helper: pulls a GGUF and runs `ollama create` for you |
 | `scripts/smoke_test.sh` | Verifies an Ollama daemon + model, runs a round-trip, and asserts no chat-template tokens leak into the response |
 | `scripts/bench.sh` | Measures real tok/s using Ollama's `eval_count` / `eval_duration` metadata over a 3-prompt mix (run `make bench`) |
-| `scripts/
+| `scripts/fetch_vision.sh` | Pulls the vision projector (`mmproj-F16.gguf`) for llama.cpp (Ollama vision is broken upstream — see [Vision](#vision)). Renamed from `fetch_mmproj.sh` because HF's Ollama bridge auto-indexed the script as a vision projector layer (filename pattern match). |
 | `scripts/check.sh` | Local lint: `bash -n`, `pyflakes`, `py_compile`, footgun-grep |
 | `scripts/install-hooks.sh` | Installs `check.sh` as a git pre-commit hook |
 | `Makefile` | Convenience wrapper — `make help` lists targets |
@@ -133,8 +134,9 @@ ollama run hf.co/FoolDev/janus-27b:Q3_K_S # tighter quant
 
 For other quants or local builds, pull from
 [`unsloth/Qwen3.6-27B-GGUF`](https://huggingface.co/unsloth/Qwen3.6-27B-GGUF)
-and `make build QUANT=...`
-
+and `make build QUANT=...`. The local-build path applies this repo's
+`Modelfile`; the `hf.co/...` path applies the root-level `template`,
+`system`, and `params` files (kept in sync with the `Modelfile`).
 
 If you want the safetensors for `transformers`, fetch them from [`Qwen/Qwen3.6-27B`](https://huggingface.co/Qwen/Qwen3.6-27B).
 
@@ -193,7 +195,7 @@ local app — point it at this repo and pick a quant.
 
 | App | How to load this model |
 |---|---|
-| **Ollama** | `ollama run hf.co/FoolDev/janus-27b` (or `:Q3_K_S`). Pulls the GGUF +
+| **Ollama** | `ollama run hf.co/FoolDev/janus-27b` (or `:Q3_K_S`). Pulls the GGUF + the root-level `template` / `system` / `params` files in one step (HF's Ollama bridge ingests these three files; it does **not** read `Modelfile`). For local builds, `make build` uses `Modelfile`, which is kept in sync. |
 | **LM Studio** | Search → `FoolDev/janus-27b` → pick `Janus-27B.Q4_K_M.gguf` or `Janus-27B.Q3_K_S.gguf`. Uses the GGUF's embedded jinja chat template (Qwen 3.6 ChatML); set the system prompt manually from the `SYSTEM` block in this repo's `Modelfile`. |
 | **Jan** | Hub → "Import from Hugging Face" → `FoolDev/janus-27b`. Same template behavior as LM Studio. |
 | **llama.cpp** | `hf download FoolDev/janus-27b Janus-27B.Q4_K_M.gguf --local-dir .` then `llama-server -m Janus-27B.Q4_K_M.gguf` (or `llama-cli`, `llama-mtmd-cli` for vision via the upstream `mmproj-F16.gguf`). |
@@ -201,10 +203,12 @@ local app — point it at this repo and pick a quant.
 | **Open WebUI / KoboldCpp / text-generation-webui** | Standard llama.cpp loader path — point at the GGUF, use the embedded chat template. |
 
 For the full Vision (image input) loader matrix, see [Vision](#vision).
-Tool calling currently works in **Ollama** (via
-
-
-the embedded
+Tool calling currently works in **Ollama** (via the root-level
+`template` file when pulling from `hf.co/...`, or via the `Modelfile`
+TEMPLATE when building locally) and **llama.cpp / llama-cpp-python**
+(via the GGUF's embedded jinja). Other apps' tool-calling support
+depends on whether they read the embedded template or require an
+external schema.
 
 ### Inference (OpenAI-compatible)
 
@@ -322,11 +326,18 @@ templates directly (llama.cpp, llama-cpp-python, LM Studio) handle the
 plain-conversation formatting automatically.
 
 Ollama is the exception: its conversion of the embedded jinja loses the
-`.Tools` / `.ToolCalls` blocks Ollama's capability detector requires
-
-
-
-
+`.Tools` / `.ToolCalls` blocks Ollama's capability detector requires.
+Two paths fix this, depending on how you pull the model:
+
+- **`ollama run hf.co/FoolDev/janus-27b`** — HF's Ollama bridge applies
+the root-level `template` / `system` / `params` files in this repo
+(the bridge does **not** read `Modelfile`).
+- **`make build` / `ollama create janus-27b -f Modelfile`** — uses the
+`Modelfile`'s `TEMPLATE` block.
+
+Both routes wire `.Tools` / `.ToolCalls` and tools work end-to-end on
+`/api/chat` and `/v1/chat/completions`. The two configurations are
+kept in sync: edit them together if you change one.
 
 #### Plain conversation
 
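
To illustrate the kind of request that the README change says now works on both routes, here is a hedged sketch of a tool-carrying `/api/chat` payload. The `get_weather` tool, the prompt, and the helper names are hypothetical; the URL assumes Ollama's default local port, and `send` is defined but not called so the sketch runs without a daemon.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local endpoint

# Hypothetical tool definition in the OpenAI-style function schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_chat_request(model, user_text, tools):
    """Assemble a non-streaming chat request that carries a tools array."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "tools": tools,
        "stream": False,
    }


def send(payload, url=OLLAMA_URL):
    """POST the payload as JSON and return the decoded response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_chat_request(
    "hf.co/FoolDev/janus-27b",
    "What is the weather in Oslo?",
    [WEATHER_TOOL],
)
```

Calling `send(payload)` against a running daemon should return a message whose tool calls, if the model chose to emit any, appear under the response's `message` object. Before this commit, the same request was rejected because the tag advertised only the `completion` capability.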
@@ -0,0 +1,12 @@
+{
+  "temperature": 0.6,
+  "top_p": 0.95,
+  "top_k": 20,
+  "repeat_penalty": 1.05,
+  "num_ctx": 16384,
+  "stop": [
+    "<|im_end|>",
+    "<|endoftext|>",
+    "<|im_start|>"
+  ]
+}
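
A small sanity check for the `params` file might look like the sketch below. It only encodes what the commit describes: `stop` should be a list of plain token strings, and none of them should contain template fragments such as the `.Prompt }}<|im_end|>` bleed seen in the auto-generated manifest. The function name and the exact rules are assumptions for illustration, not part of the repo's `check.sh`.

```python
import json

# The params content added by this commit.
PARAMS_TEXT = """
{
  "temperature": 0.6,
  "top_p": 0.95,
  "top_k": 20,
  "repeat_penalty": 1.05,
  "num_ctx": 16384,
  "stop": ["<|im_end|>", "<|endoftext|>", "<|im_start|>"]
}
"""


def check_params(text):
    """Return a list of human-readable problems found in a params JSON string."""
    params = json.loads(text)
    problems = []

    stop = params.get("stop", [])
    if not isinstance(stop, list) or not all(isinstance(s, str) for s in stop):
        problems.append("stop must be a list of strings")
        return problems

    for token in stop:
        # Template delimiters or field references inside a stop token
        # indicate a botched template conversion, not a real token.
        if "{{" in token or "}}" in token or ".Prompt" in token:
            problems.append(f"stop token looks like template bleed: {token!r}")

    return problems


good = check_params(PARAMS_TEXT)
bad = check_params('{"stop": [".Prompt }}<|im_end|>", "<|im_end|>"]}')
```

The committed file passes with no findings, while a manifest carrying the corrupted stop token is flagged once.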
@@ -8,9 +8,9 @@
 # it (see README Vision section, ollama/ollama#15898).
 #
 # Usage:
-# ./scripts/
-# ./scripts/
-# ./scripts/
+# ./scripts/fetch_vision.sh # default: F16, ~927 MB
+# ./scripts/fetch_vision.sh BF16 # ~931 MB
+# ./scripts/fetch_vision.sh F32 # ~1.8 GB
 #
 # Requires: huggingface-cli (or hf).
 set -euo pipefail
@@ -0,0 +1,10 @@
+You are Janus, a precise and capable assistant for reasoning, writing, coding, and long-form dialogue.
+
+Behavior rules:
+- Answer the user's actual request directly.
+- Be accurate, complete, and structured.
+- Think before answering, but do not get stuck in repetitive loops or meta-commentary.
+- If the request is ambiguous or incomplete, state what is missing and make the smallest reasonable assumption needed to continue.
+- If the user wants creative writing, preserve tone, continuity, and character consistency.
+- If the user wants analysis or technical help, prefer concrete steps, examples, and decisions over fluff.
+- Finish with a usable answer, not just planning.
@@ -0,0 +1,51 @@
+{{- $lastUserIdx := -1 -}}
+{{- range $idx, $msg := .Messages -}}
+{{- if eq $msg.Role "user" }}{{ $lastUserIdx = $idx }}{{ end -}}
+{{- end }}
+{{- if or .System .Tools }}<|im_start|>system
+{{ if .System }}{{ .System }}
+
+{{ end }}
+{{- if .Tools }}# Tools
+
+You may call one or more functions to assist with the user query.
+
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{{- range .Tools }}
+{"type": "function", "function": {{ .Function }}}
+{{- end }}
+</tools>
+
+For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
+<tool_call>
+{"name": <function-name>, "arguments": <args-json-object>}
+</tool_call>
+{{- end -}}<|im_end|>
+{{ end }}
+{{- range $i, $_ := .Messages }}
+{{- $last := eq (len (slice $.Messages $i)) 1 -}}
+{{- if eq .Role "user" }}<|im_start|>user
+{{ .Content }}<|im_end|>
+{{ else if eq .Role "assistant" }}<|im_start|>assistant
+{{ if (and $.IsThinkSet (and .Thinking (or $last (gt $i $lastUserIdx)))) -}}
+<think>{{ .Thinking }}</think>
+{{ end -}}
+{{ if .Content }}{{ .Content }}{{ end }}
+{{- if .ToolCalls }}
+{{- range .ToolCalls }}
+<tool_call>
+{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
+</tool_call>
+{{- end }}
+{{- end }}{{ if not $last }}<|im_end|>
+{{ end }}
+{{- else if eq .Role "tool" }}<|im_start|>user
+<tool_response>
+{{ .Content }}
+</tool_response><|im_end|>
+{{ end }}
+{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
+<think>
+{{ end }}
+{{- end }}
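
The template instructs the model to wrap each call as a JSON object inside `<tool_call></tool_call>` tags. Ollama normally parses these itself, but when working with raw completions (for example while debugging the template) the calls can be recovered with a sketch like the following. The function name is hypothetical and the parser is deliberately lenient: malformed blocks are skipped rather than raised.

```python
import json
import re

# Non-greedy match so several tool calls in one reply are captured separately.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return [{"name": ..., "arguments": ...}] for each well-formed block."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        if isinstance(obj, dict) and "name" in obj:
            calls.append(
                {"name": obj["name"], "arguments": obj.get("arguments", {})}
            )
    return calls


# Made-up model output in the format the template asks for.
sample = (
    "Let me check.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Oslo"}}\n'
    "</tool_call>\n"
    "<tool_call>\nnot json\n</tool_call>"
)
calls = extract_tool_calls(sample)
```

Only the first block in `sample` is returned; the second is dropped because its body is not valid JSON.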