FoolDev committed
Commit 5cb1604 · 1 Parent(s): 72958b4

Log README tool-calling rewrite + un-stale ollama_chat.py docstring

CHANGELOG: log the README restructure (split Tool/function calling into
Ollama-path and embedded-jinja-path subsections; clarified that Ollama
is the only loader needing the Modelfile TEMPLATE override).

examples/ollama_chat.py: docstring step 2 'Edit ../Modelfile so the FROM
line points at the GGUF path' was stale since commit 3d2e907 pointed the
bundled Modelfile at ./Janus-27B.Q4_K_M.gguf. Replaced the four-step
list with two paths (bundled-GGUF or pull-from-HF) covering both flows.

Files changed (2)
  1. CHANGELOG.md +27 -0
  2. examples/ollama_chat.py +14 -6
CHANGELOG.md CHANGED
@@ -7,6 +7,33 @@ and documentation**, not the underlying base model.
 
  ## [Unreleased]
 
+ ### Changed
+ - README "Tool / function calling" section: split into explicit
+ Ollama-path and embedded-jinja-path subsections. The two loader
+ paths produce different on-the-wire formats and the previous text
+ conflated them. The Ollama path (this repo's `Modelfile` TEMPLATE)
+ prompts JSON-in-XML — the form Ollama's tool-call extractor parses
+ into a structured `tool_calls` array. The embedded-jinja path
+ (llama.cpp, llama-cpp-python, LM Studio) reads the Qwen 3.6 native
+ chat template baked into the GGUF, which prompts the model to emit
+ the verbose `<function=name>` / `<parameter=arg>` form it was trained
+ on. Both are valid; the model adapts to whichever shape its system
+ prompt prescribes. README now shows both formats with a concrete
+ example each so users pick the parser that matches their loader.
+ - README "Chat template" intro: clarified that Ollama is the lone
+ loader that needs the `Modelfile` `TEMPLATE` override; everything
+ else gets correct plain-conversation formatting from the embedded
+ jinja directly. Previous wording implied all loaders handled it
+ automatically.
+
+ ### Fixed
+ - `examples/ollama_chat.py` docstring: step 2 ("Edit ../Modelfile so
+ the FROM line points at the GGUF path") was stale — commit 3d2e907
+ pointed the bundled `Modelfile` at `./Janus-27B.Q4_K_M.gguf`, which
+ the repo now ships, so the no-edit flow `ollama create janus-27b
+ -f ../Modelfile` works out of the box. Replaced the four-step list
+ with two paths (bundled-GGUF or pull-from-HF) covering both flows.
+
  ### Added
  - `Modelfile` hardware notes: log a measured data point for the
  ASUS ROG Flow Z13 GZ302EA (Ryzen AI Max+ 395 / Radeon 8060S iGPU,
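
Reviewer note: the two on-the-wire formats described in the "Changed" entry can be handled by one small parser. The following is a minimal sketch, not code from this repository. The `<tool_call>` wrapper for the JSON form and the `</function>` / `</parameter>` closing tags are assumptions (the changelog names only the opening `<function=name>` / `<parameter=arg>` shapes), and `parse_tool_calls` is a hypothetical helper name.

```python
import json
import re

# ASSUMPTION: the JSON-in-XML form wraps one JSON object in <tool_call> tags.
JSON_CALL = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
# ASSUMPTION: the verbose form is closed by </function> and </parameter>.
FUNC_CALL = re.compile(r"<function=([^>\s]+)>(.*?)</function>", re.DOTALL)
PARAM = re.compile(r"<parameter=([^>\s]+)>(.*?)</parameter>", re.DOTALL)


def parse_tool_calls(text):
    """Return a list of {"name", "arguments"} dicts found in raw model output."""
    calls = []
    # JSON-in-XML form: the payload is a JSON object with name and arguments.
    for match in JSON_CALL.finditer(text):
        try:
            obj = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of raising
        if isinstance(obj, dict) and "name" in obj:
            calls.append({"name": obj["name"], "arguments": obj.get("arguments", {})})
    # Verbose form: the function name and each parameter live in tag attributes.
    for match in FUNC_CALL.finditer(text):
        # Parameter values arrive as text; type coercion is left to the caller.
        args = {key: value.strip() for key, value in PARAM.findall(match.group(2))}
        calls.append({"name": match.group(1), "arguments": args})
    return calls
```

On the Ollama path the tool-call extractor already returns a structured `tool_calls` array, so a parser like this would mainly matter on the embedded-jinja path or when inspecting raw output. Calls are grouped by format rather than kept in textual order.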
examples/ollama_chat.py CHANGED
@@ -2,12 +2,20 @@
  """
  Janus-27B — Ollama chat examples.
 
- Prerequisites:
- 1. Pull a Qwen 3.6 27B GGUF (e.g. unsloth/Qwen3.6-27B-GGUF).
- 2. Edit ../Modelfile so the FROM line points at the GGUF path.
- 3. ollama create janus-27b -f ../Modelfile
- 4. ollama serve (usually already running)
- 5. python ollama_chat.py
+ Prerequisites (pick one):
+
+ A. From the bundled GGUFs (default flow):
+ $ make build # uses Janus-27B.Q4_K_M.gguf
+ # or:
+ $ ollama create janus-27b -f ../Modelfile
+
+ B. Pull straight from HF:
+ $ ollama run hf.co/FoolDev/janus-27b
+ # then set MODEL=hf.co/FoolDev/janus-27b below
+
+ Then:
+ $ ollama serve # usually already running
+ $ python ollama_chat.py
 
  The model emits <think>...</think> reasoning blocks before its answer.
  Ollama (as of 0.22) does not always split these into a separate field for