FoolDev Claude Opus 4.7 commited on
Commit
7adf2e4
·
1 Parent(s): a376106

docs(examples): same Ollama-qwen35 split in llama_cpp_vision.py preamble

Browse files

Match the wording the README and Modelfile now use: the arch entries
are in Ollama's Go engine (text works on 0.24+) but missing from the
C++ llama.cpp fallback used when an mmproj is attached, so the
"use llama.cpp directly" recommendation for vision still holds.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Files changed (2) hide show
  1. CHANGELOG.md +9 -0
  2. examples/llama_cpp_vision.py +10 -8
CHANGELOG.md CHANGED
@@ -8,6 +8,15 @@ and documentation**, not the underlying base model.
8
  ## [Unreleased]
9
 
10
  ### Fixed
 
 
 
 
 
 
 
 
 
11
  - `examples/ollama_chat.py` simple-chat demo no longer silently swallows
12
  the thinking trace on Ollama 0.24. The demo parsed `<think>...</think>`
13
  blocks out of `message.content` — a workaround dating to Ollama 0.22
 
8
  ## [Unreleased]
9
 
10
  ### Fixed
11
+ - `examples/llama_cpp_vision.py` "Why this script exists" preamble:
12
+ replaced the overbroad "Ollama 0.22's vendored llama.cpp fork is
13
+ missing the qwen35/qwen35moe arch entries" claim with the same
14
+ split the README and Modelfile now use — Go engine has the entries
15
+ (text works on 0.24+), C++ fallback used when mmproj is attached
16
+ still lacks them. Issue link unchanged (#15898 still open). Keeps
17
+ the script's "use llama.cpp directly for vision" recommendation
18
+ intact, just stops misleading users about which Ollama codepath
19
+ is broken.
20
  - `examples/ollama_chat.py` simple-chat demo no longer silently swallows
21
  the thinking trace on Ollama 0.24. The demo parsed `<think>...</think>`
22
  blocks out of `message.content` — a workaround dating to Ollama 0.22
examples/llama_cpp_vision.py CHANGED
@@ -3,16 +3,18 @@
3
  Thanatos-27B — vision (image-text-to-text) via llama-cpp-python.
4
 
5
  Why this script exists:
6
- Ollama 0.22's vendored llama.cpp fork is missing the qwen35/qwen35moe
7
- architecture entries needed to attach a separate mmproj projector.
8
- Both `FROM mmproj.gguf` and `ADAPTER mmproj.gguf` fail with:
 
 
9
  unknown model architecture: 'qwen35moe'
10
- See ollama/ollama#15898, #14730 (closed as duplicates of #15898 root
11
- cause). Until that lands, vision via Ollama is broken for Qwen 3.5 /
12
- 3.6.
13
 
14
- Upstream ggml-org/llama.cpp **does** have the architecture, so vision
15
- works fine via llama.cpp directly. This script uses the python binding.
 
16
 
17
  Install:
18
  pip install llama-cpp-python pillow
 
3
  Thanatos-27B — vision (image-text-to-text) via llama-cpp-python.
4
 
5
  Why this script exists:
6
+ Ollama's Go engine has the qwen35 / qwen35moe arch entries (text
7
+ inference works on 0.24+), but the C++ llama.cpp fallback that
8
+ Ollama switches to when an mmproj is attached still lacks them.
9
+ Both `FROM mmproj.gguf` and `ADAPTER mmproj.gguf` fail at first
10
+ inference with:
11
  unknown model architecture: 'qwen35moe'
12
+ See ollama/ollama#15898 (still open). Until that lands, vision via
13
+ Ollama is broken for Qwen 3.5 / 3.6 while text remains fine.
 
14
 
15
+ Upstream ggml-org/llama.cpp **does** have the architecture across
16
+ both code paths, so vision works fine via llama.cpp directly. This
17
+ script uses the python binding.
18
 
19
  Install:
20
  pip install llama-cpp-python pillow